Abstract
Emotion recognition is an increasingly important research topic for understanding
a person's physical state in human-computer interfaces. Although speech data can
be obtained through communication between a human and a computer, for example by
monitoring operation commands or general conversation with another person,
traditional emotion recognition techniques cannot be applied easily because they
need utterances extracted as single words. To extract each word from natural
speech, another technique such as morphological analysis must be prepared, and it
depends on the properties of the user's speech. So that emotion recognition can be
established without depending on the properties of a particular person, natural
speech is divided not into word-length units but into short-term utterances.
As the utterances become shorter, however, the prosodic features extracted from
them become similar, so emotion labels cannot be classified reliably when the
division technique is crude.
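
The idea of dividing natural speech into short-term utterances and computing
prosodic features per utterance can be illustrated with a minimal sketch. It is
not the authors' method: the function names, the fixed 0.5 s segment length, the
75-400 Hz pitch search range, and the choice of RMS energy and an
autocorrelation-based pitch estimate as prosodic features are all assumptions
made for illustration.

```python
import numpy as np


def split_short_term(signal, sample_rate, segment_sec=0.5):
    """Cut a 1-D signal into equal, non-overlapping short-term segments.

    segment_sec is an assumed value; a trailing partial segment is dropped.
    """
    seg_len = int(round(sample_rate * segment_sec))
    n_segments = len(signal) // seg_len
    return [signal[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]


def rms_energy(segment):
    """Root-mean-square energy of one segment."""
    return float(np.sqrt(np.mean(np.square(segment))))


def estimate_f0(segment, sample_rate, f0_min=75.0, f0_max=400.0):
    """Rough pitch estimate (Hz) from the autocorrelation peak.

    Returns 0.0 when the segment is silent or too short to search.
    """
    x = segment - np.mean(segment)
    # Keep only non-negative lags of the autocorrelation.
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(ac) - 1)
    if lag_max <= lag_min or ac[0] <= 0:
        return 0.0
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return sample_rate / lag


def prosodic_features(signal, sample_rate, segment_sec=0.5):
    """Return one (energy, f0) pair per short-term segment."""
    segments = split_short_term(signal, sample_rate, segment_sec)
    return [(rms_energy(s), estimate_f0(s, sample_rate)) for s in segments]
```

Calling `prosodic_features(signal, sample_rate)` on a mono recording yields one
feature pair per segment. Shrinking `segment_sec` gives each segment less signal
to work with, which is one way to observe the features of different segments
becoming harder to tell apart.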