Journal of Signal Processing
Online ISSN : 1880-1013
Print ISSN : 1342-6230
ISSN-L : 1342-6230
Continuous Audiovisual Emotion Recognition Using Feature Selection and LSTM
Reda ElbarougyBagus Tris AtmajaMasato Akagi
著者情報
ジャーナル フリー

2020 年 24 巻 6 号 p. 229-235

詳細
抄録

Speech and visual information are the most dominant modalities for a human to perceive emotion. A method of recognizing human emotion from these modalities is proposed by utilizing feature selection and long short-term memory (LSTM) neural networks. A feature selection method based on support vector regression is used to select the relevant features among thousands of features extended from speech and video features via bag-of-X-words. The LSTM neural networks then are trained using a number of selected features and also separately optimized for every emotion dimension. Instead of utterance-level emotion recognition, time-frame-based processing is performed to enable continuous emotion recognition using a database labeled for each time frame. Experimental results reveal that a system with feature selection is more effective for predicting emotion dimensions for a single language than the baseline system without feature selection. The performance is measured in terms of the concordance correlation coefficient obtained by averaging the valence, arousal, and liking dimensions.

著者関連情報
© 2020 Research Institute of Signal Processing, Japan
次の記事
feedback
Top