IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
Cross-Corpus Speech Emotion Recognition Based on Causal Emotion Information Representation
Hongliang FUQianqian LIHuawei TAOChunhua ZHUYue XIERuxue GUO
著者情報
ジャーナル フリー

2024 年 E107.D 巻 8 号 p. 1097-1100

詳細
抄録

Speech emotion recognition (SER) is a key research technology to realize the third generation of artificial intelligence, which is widely used in human-computer interaction, emotion diagnosis, interpersonal communication and other fields. However, the aliasing of language and semantic information in speech tends to distort the alignment of emotion features, which affects the performance of cross-corpus SER system. This paper proposes a cross-corpus SER model based on causal emotion information representation (CEIR). The model uses the reconstruction loss of the deep autoencoder network and the source domain label information to realize the preliminary separation of causal features. Then, the causal correlation matrix is constructed, and the local maximum mean difference (LMMD) feature alignment technology is combined to make the causal features of different dimensions jointly distributed independent. Finally, the supervised fine-tuning of labeled data is used to achieve effective extraction of causal emotion information. The experimental results show that the average unweighted average recall (UAR) of the proposed algorithm is increased by 3.4% to 7.01% compared with the latest partial algorithms in the field.

著者関連情報
© 2024 The Institute of Electronics, Information and Communication Engineers
前の記事 次の記事
feedback
Top