IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Entity Knowledge-Guided Image-Text Alignment for Joint Multimodal Aspect-Based Sentiment Analysis
Yan XIANG, Di WU, Yunjia CAI, Yantuan XIAN
Author information
Journal Free Access / Advance online publication

Article ID: 2024EDP7313

Abstract

Joint multimodal aspect-based sentiment analysis (JMABSA) aims to extract aspects from multimodal inputs and determine their sentiment polarity. Existing research often struggles to effectively align aspect features across images and text. To address this, we propose an entity knowledge-guided image-text alignment network that integrates alignment across both modalities, enabling the model to more accurately capture jointly expressed aspect and sentiment information in images and text. Specifically, we introduce an entity class embedding to guide the model in learning entity-related features from text. Additionally, we utilize scene and aspect descriptions of images as entity knowledge, helping the model learn entity-relevant features from visual input. Aligning this image-derived entity knowledge with the original text further supports the model in learning consistent aspect and sentiment expressions across modalities. Experimental results on two public benchmark datasets demonstrate that our method achieves state-of-the-art performance.

© 2025 The Institute of Electronics, Information and Communication Engineers