IEICE Electronics Express
Online ISSN : 1349-2543
ISSN-L : 1349-2543
Using neutralized formant frequencies to improve emotional speech recognition
Davood Gharavian, Mansour Sheikhan, Farhad Ashoftedel

2011 Volume 8 Issue 14 Pages 1155-1160


Emotion in speech degrades the performance of Automatic Speech Recognition (ASR) systems. To improve recognition accuracy on emotional speech, this paper investigates the effect of formant frequencies and their slopes on recognition performance. The formant frequencies are neutralized using a hybrid of Dynamic Time Warping (DTW) and Multi-Layer Perceptron (MLP) neural networks. Each neutralized formant frequency is then used as a supplementary feature in a Hidden Markov Model (HMM)-based ASR system. Experimental results show that using the slope of the neutralized formant frequency features improves the recognition rate in the happiness and anger states by up to 2.1% and 3.6%, respectively.
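The abstract does not give implementation details, so the following is only a minimal sketch of two of the ingredients it names: DTW alignment of an emotional formant track against a neutral one, and a per-frame slope feature. It assumes each track is a list of formant values in Hz, one per frame, uses absolute difference as the local DTW cost, and computes the slope with a standard regression-style delta. The function names `dtw_path`, `align_to`, and `slope` are hypothetical, and the MLP mapping stage is omitted entirely.

```python
from typing import List, Tuple


def dtw_path(a: List[float], b: List[float]) -> List[Tuple[int, int]]:
    """Return a minimum-cost DTW alignment path between two 1-D tracks."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed local cost: |Hz difference|
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the last cell to recover the frame-to-frame path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return path


def align_to(emotional: List[float], neutral: List[float]) -> List[float]:
    """Warp the neutral track onto the emotional track's time axis.

    Neutral frames mapped to the same emotional frame are averaged.
    """
    buckets: List[List[float]] = [[] for _ in emotional]
    for i, j in dtw_path(emotional, neutral):
        buckets[i].append(neutral[j])
    return [sum(b) / len(b) for b in buckets]


def slope(track: List[float], k: int = 2) -> List[float]:
    """Regression-style slope over a window of +/-k frames, edges clamped."""
    n = len(track)
    denom = 2 * sum(t * t for t in range(1, k + 1))
    out = []
    for i in range(n):
        num = 0.0
        for t in range(1, k + 1):
            ahead = track[min(i + t, n - 1)]
            behind = track[max(i - t, 0)]
            num += t * (ahead - behind)
        out.append(num / denom)
    return out


# Illustrative values only, not data from the paper.
emotional_f1 = [500.0, 500.0, 520.0, 560.0, 600.0]
neutral_f1 = [500.0, 520.0, 560.0, 600.0]
aligned_neutral = align_to(emotional_f1, neutral_f1)
slope_feature = slope(aligned_neutral)
```

In a pipeline of this kind, pairs of time-aligned emotional and neutral frames could serve as training data for a mapping network, and the slope track could be appended to the recognizer's feature vector; both uses are assumptions here rather than details stated in the abstract.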

© 2011 by The Institute of Electronics, Information and Communication Engineers