Acoustic feature analysis and optimization for Bangla speech emotion recognition

Sadia Sultana; Mohammad Shahidur Rahman

doi:10.1250/ast.44.157

Abstract

To better understand human behavior, it is essential to investigate the speech features that contribute the most to emotional expression. In this study, we investigated how different emotions affect the acoustic properties of speech, and we explored a new set composed of widely used acoustic features for recognizing emotions from audio. Experiments on Bangla and English emotional datasets were conducted using the SVM, random forest, and XGBoost algorithms. We used the grid search method with five-fold cross-validation to select the optimal parameters for each model. A second five-fold cross-validation was then applied to evaluate the models' effectiveness in recognizing emotions. XGBoost was also used to calculate the importance of each feature for speech emotion identification on these datasets. We found that selecting the most important features allows ML models to reach a high level of accuracy that is competitive with the performance of deep learning models, at a lower computational cost.
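The tuning and evaluation procedure described in the abstract (a grid search scored by five-fold cross-validation, followed by a second five-fold cross-validation of the selected settings) can be sketched as follows. This is a minimal illustration and not the authors' implementation: it uses only the Python standard library, a toy k-nearest-neighbour classifier stands in for SVM, random forest, and XGBoost, and the data are synthetic two-dimensional points rather than acoustic features. All function names, the parameter grid, and the seeds are assumptions made for the example.

```python
import itertools
import random
from collections import Counter


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def knn_predict(train_X, train_y, x, n_neighbors, p):
    """Majority vote among the nearest training points.

    Distances are sums of |a - b|**p; the p-th root is skipped because
    it does not change the ordering of the neighbours.
    """
    dists = sorted(
        (sum(abs(a - b) ** p for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:n_neighbors])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, params, k=5, seed=0):
    """Mean accuracy over k folds, each fold held out once for testing."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        correct = sum(
            knn_predict(train_X, train_y, X[i], **params) == y[i]
            for i in held_out
        )
        scores.append(correct / len(held_out))
    return sum(scores) / k


def grid_search(X, y, grid, k=5):
    """Try every parameter combination and keep the best mean CV accuracy."""
    names = sorted(grid)
    best_params, best_score = None, -1.0
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        score = cross_val_accuracy(X, y, params, k)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def make_toy_data(n_per_class=30, seed=1):
    """Synthetic three-class data (hypothetical stand-in for acoustic features)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in enumerate([(0.0, 0.0), (3.0, 3.0), (0.0, 4.0)]):
        for _ in range(n_per_class):
            X.append([rng.gauss(c, 1.0) for c in centre])
            y.append(label)
    return X, y


X, y = make_toy_data()
param_grid = {"n_neighbors": [1, 3, 5, 7], "p": [1, 2]}  # assumed grid

# Step 1: select parameters by grid search with five-fold cross-validation.
best_params, best_score = grid_search(X, y, param_grid)

# Step 2: evaluate the selected parameters with a second five-fold split.
final_score = cross_val_accuracy(X, y, best_params, k=5, seed=42)
print(best_params, round(best_score, 3), round(final_score, 3))
```

In practice a library such as scikit-learn would normally handle the fold splitting and the parameter search; the explicit loops here are only meant to make the two cross-validation stages visible.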
