Please use this identifier to cite or link to this item: https://repository.cihe.edu.hk/jspui/handle/cihe/4713
DC Field | Value | Language
dc.contributor.author | Salahudeen, Ridwan | en_US
dc.contributor.author | Siu, Wan Chi | en_US
dc.contributor.author | Chan, Anthony Hing-Hung | en_US
dc.date.accessioned | 2025-05-02T07:49:33Z | -
dc.date.available | 2025-05-02T07:49:33Z | -
dc.date.issued | 2024 | -
dc.identifier.uri | https://repository.cihe.edu.hk/jspui/handle/cihe/4713 | -
dc.description.abstract | This paper focuses on generating photo-realistic talking face videos by leveraging semantic facial attributes in a latent space and capturing the talking style from an old video of a speaker. We formulate a process to manipulate facial attributes in the latent space by identifying semantic facial directions. We develop a deep learning pipeline to learn the correlation between the audio and the corresponding video frames from a reference video of a speaker in an aligned latent space. This correlation is used to navigate a static face image into frames of a talking face video, a process moderated by three carefully constructed loss functions for accurate lip synchronization and photo-realistic video reconstruction. By combining these techniques, we aim to generate high-quality talking face videos that are visually realistic and synchronized with the provided audio input. Our results were evaluated against several state-of-the-art techniques for talking face generation, and we recorded significant improvements in the image quality of the generated talking face video. | en_US
dc.language.iso | en | en_US
dc.publisher | IEEE | en_US
dc.relation.ispartof | IEEE Transactions on Consumer Electronics | en_US
dc.title | Photo-realistic talking face generation under latent space manipulation | en_US
dc.type | journal article | en_US
dc.identifier.doi | 10.1109/TCE.2024.3516387 | -
dc.contributor.affiliation | Yam Pak Charitable Foundation School of Computing and Information Sciences | en_US
dc.contributor.affiliation | Yam Pak Charitable Foundation School of Computing and Information Sciences | en_US
dc.contributor.affiliation | Yam Pak Charitable Foundation School of Computing and Information Sciences | en_US
dc.relation.issn | 1558-4127 | en_US
dc.cihe.affiliated | Yes | -
item.openairecristype | http://purl.org/coar/resource_type/c_6501 | -
item.languageiso639-1 | en | -
item.cerifentitytype | Publications | -
item.openairetype | journal article | -
item.fulltext | With Fulltext | -
item.grantfulltext | open | -
crisitem.author.dept | Yam Pak Charitable Foundation School of Computing and Information Sciences | -
crisitem.author.dept | Yam Pak Charitable Foundation School of Computing and Information Sciences | -
crisitem.author.dept | Yam Pak Charitable Foundation School of Computing and Information Sciences | -
crisitem.author.orcid | 0000-0001-8280-0367 | -
crisitem.author.orcid | 0000-0001-7479-0787 | -
Appears in Collections: CIS Publication
Files in This Item:
File | Description | Size | Format
View Online | - | 89 B | HTML

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.