Abstract
This paper proposes a method for emotion recognition in unintentional speech.
Speech data assigned to a single emotion label often consists of several kinds of emotional utterances.
Even when speech data are assigned the same label, the corresponding prosodic features can be dissimilar, so it is hard to distinguish one class from the others. We assume this is because several subclasses are distributed and overlap in the feature space. In this paper, we propose a technique that improves the recognition rate by detecting multiple hidden subclasses in emotional speech. Emotional speech data with the same label are divided into multiple hidden subclasses according to their prosodic features using k-means. Similar hidden subclasses carrying various labels are then grouped into hidden emotion categories. The speech data grouped in each hidden emotion category are used to train one SVM, yielding several SVMs, one per hidden emotion category. Experimental results show that the proposed technique achieves a higher recognition rate than the traditional one.