Abstract
Nonlinguistic information such as facial expressions, gestures, and tone of voice plays an essential role in communication between humans. However, most human-machine interfaces are not designed to utilize nonlinguistic information. Toward a more natural human-machine interface, this paper presents a system that discriminates positive and negative attitudes from nonlinguistic information. Our system uses prosodic information and natural facial expressions observed in a dialogue. For example, a questioner asks a subject to do something, e.g., "Won't you go shopping with me?", and the subject gives an ambiguous response, e.g., "Umm, shopping...". The proposed system analyzes the prosodic information and facial expressions obtained from the subject, combining optical flow with prosodic features to increase discrimination accuracy. Evaluation experiments using 678 videos collected from 23 subjects show that our system achieved 81.4% accuracy, which is 4.8 points higher than when only optical flow was used as training data and 13.8 points higher than when only prosodic information was used.