Multimedia and Intelligent Computing Lab

Article List

Tengpeng Li, Hanli Wang, Bin He, and Chang Wen Chen, Knowledge-enriched Attention Network with Group-wise Semantic for Visual Storytelling, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 7, pp. 8634-8645, Jul. 2023. [Project: click here]
Shanshan Du, Hanli Wang, Tengpeng Li, and Chang Wen Chen, Hybrid Graph Reasoning with Dynamic Interaction for Visual Dialog, IEEE Transactions on Multimedia, vol. 26, pp. 9095-9108, 2024. [Project: click here]
Jian Zhu, Hanli Wang, and Miaojing Shi, Multi-modal Large Language Model Enhanced Pseudo 3D Perception Framework for Visual Commonsense Reasoning, IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 11, pp. 11682-11694, Nov. 2024. [Project: click here]
Tengpeng Li, Hanli Wang, Qinyu Li, and Zhangkai Ni, Vision-Language Relational Transformer for Video-to-Text Generation, IEEE Transactions on Multimedia, vol. 27, pp. 4584-4596, Jul. 2025. [Project: click here]