Please use this identifier to cite or link to this item:
https://repository.cihe.edu.hk/jspui/handle/cihe/4713
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Salahudeen, Ridwan | en_US |
dc.contributor.author | Siu, Wan Chi | en_US |
dc.contributor.author | Chan, Anthony Hing-Hung | en_US |
dc.date.accessioned | 2025-05-02T07:49:33Z | - |
dc.date.available | 2025-05-02T07:49:33Z | - |
dc.date.issued | 2024 | - |
dc.identifier.uri | https://repository.cihe.edu.hk/jspui/handle/cihe/4713 | - |
dc.description.abstract | This paper focuses on generating photo-realistic talking face videos by leveraging semantic facial attributes in a latent space and capturing the talking style from an old video of a speaker. We formulate a process to manipulate facial attributes in the latent space by identifying semantic facial directions. We develop a deep learning pipeline to learn the correlation between the audio and the corresponding video frames from a reference video of a speaker in an aligned latent space. This correlation is used to navigate a static face image into frames of a talking face video, moderated by three carefully constructed loss functions for accurate lip synchronization and photo-realistic video reconstruction. By combining these techniques, we aim to generate high-quality talking face videos that are visually realistic and synchronized with the provided audio input. Our results were evaluated against some state-of-the-art techniques for talking face generation, and we recorded significant improvements in the image quality of the generated talking face video. | en_US |
dc.language.iso | en | en_US |
dc.publisher | IEEE | en_US |
dc.relation.ispartof | IEEE Transactions on Consumer Electronics | en_US |
dc.title | Photo-realistic talking face generation under latent space manipulation | en_US |
dc.type | journal article | en_US |
dc.identifier.doi | 10.1109/TCE.2024.3516387 | - |
dc.contributor.affiliation | Yam Pak Charitable Foundation School of Computing and Information Sciences | en_US |
dc.contributor.affiliation | Yam Pak Charitable Foundation School of Computing and Information Sciences | en_US |
dc.contributor.affiliation | Yam Pak Charitable Foundation School of Computing and Information Sciences | en_US |
dc.relation.issn | 1558-4127 | en_US |
dc.cihe.affiliated | Yes | - |
item.openairecristype | http://purl.org/coar/resource_type/c_6501 | - |
item.languageiso639-1 | en | - |
item.cerifentitytype | Publications | - |
item.openairetype | journal article | - |
item.fulltext | With Fulltext | - |
item.grantfulltext | open | - |
crisitem.author.dept | Yam Pak Charitable Foundation School of Computing and Information Sciences | - |
crisitem.author.dept | Yam Pak Charitable Foundation School of Computing and Information Sciences | - |
crisitem.author.dept | Yam Pak Charitable Foundation School of Computing and Information Sciences | - |
crisitem.author.orcid | 0000-0001-8280-0367 | - |
crisitem.author.orcid | 0000-0001-7479-0787 | - |
Appears in Collections: | CIS Publication |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
View Online | | 89 B | HTML | View/Open |
