Panel-page-aware comic genre understanding

Li, Chengze; Liu, Xueting

Please use this identifier to cite or link to this item: https://repository.cihe.edu.hk/jspui/handle/cihe/4427

DC Field	Value	Language
dc.contributor.author	Li, Chengze	en_US
dc.contributor.author	Liu, Xueting	en_US
dc.contributor.other	Xu, C.	-
dc.contributor.other	Xu, X.	-
dc.contributor.other	Zhao, N.	-
dc.contributor.other	Cai, W.	-
dc.contributor.other	Zhang, H.	-
dc.date.accessioned	2024-03-26T08:13:00Z	-
dc.date.available	2024-03-26T08:13:00Z	-
dc.date.issued	2023	-
dc.identifier.uri	https://repository.cihe.edu.hk/jspui/handle/cihe/4427	-
dc.description.abstract	Using a sequence of discrete still images to tell a story or introduce a process has become a tradition in digital visual media. With the surge in such media and the demands of downstream tasks, there is an urgent need to identify their main topics or genres quickly. As a representative form of such media, comics have boomed since going digital. However, unlike natural images, comic images are divided into panels and are not visually consistent from page to page, so existing methods tailored to natural images perform poorly on comics. Because identifying a comic's genre is tied to its overall plot, it requires a long-term understanding that makes full use of the semantic interactions between multi-level comic fragments. In this paper, we propose P² Comic, a Panel-Page-aware Comic genre classification model, which takes page sequences of comics as input and produces class-wise probabilities. P² Comic uses detected panel boxes to extract panel representations and applies self-attention to build panel-page understanding, assisted by interdependent classifiers that model label correlation. We develop the first comic dataset for comic genre classification with multi-genre labels. Experiments show that our approach outperforms state-of-the-art methods on related tasks. We also validate the extensibility of our network to the multi-modal scenario. Finally, we demonstrate the practicality of our approach by giving effective genre predictions for whole comic books.	en_US
dc.language.iso	en	en_US
dc.publisher	IEEE	en_US
dc.relation.ispartof	IEEE Transactions on Image Processing	en_US
dc.title	Panel-page-aware comic genre understanding	en_US
dc.type	journal article	en_US
dc.identifier.doi	10.1109/TIP.2023.3270105	-
dc.contributor.affiliation	School of Computing and Information Sciences	en_US
dc.contributor.affiliation	School of Computing and Information Sciences	en_US
dc.relation.issn	1941-0042	en_US
dc.description.volume	32	en_US
dc.description.startpage	2636	en_US
dc.description.endpage	2648	en_US
dc.cihe.affiliated	Yes	-
item.openairetype	journal article	-
item.languageiso639-1	en	-
item.fulltext	No Fulltext	-
item.grantfulltext	none	-
item.cerifentitytype	Publications	-
item.openairecristype	http://purl.org/coar/resource_type/c_6501	-
crisitem.author.dept	Yam Pak Charitable Foundation School of Computing and Information Sciences	-
crisitem.author.dept	Yam Pak Charitable Foundation School of Computing and Information Sciences	-
Appears in Collections:	CIS Publication
