Journal of Signal Processing
Online ISSN : 1880-1013
Print ISSN : 1342-6230
ISSN-L : 1342-6230
Continuous Audiovisual Emotion Recognition Using Feature Selection and LSTM
Reda ElbarougyBagus Tris AtmajaMasato Akagi
Author information
JOURNAL FREE ACCESS

2020 Volume 24 Issue 6 Pages 229-235

Details
Abstract

Speech and visual information are the most dominant modalities for a human to perceive emotion. A method of recognizing human emotion from these modalities is proposed by utilizing feature selection and long short-term memory (LSTM) neural networks. A feature selection method based on support vector regression is used to select the relevant features among thousands of features extended from speech and video features via bag-of-X-words. The LSTM neural networks then are trained using a number of selected features and also separately optimized for every emotion dimension. Instead of utterance-level emotion recognition, time-frame-based processing is performed to enable continuous emotion recognition using a database labeled for each time frame. Experimental results reveal that a system with feature selection is more effective for predicting emotion dimensions for a single language than the baseline system without feature selection. The performance is measured in terms of the concordance correlation coefficient obtained by averaging the valence, arousal, and liking dimensions.

Content from these authors
© 2020 Research Institute of Signal Processing, Japan
Next article
feedback
Top