2016 Volume 18 Issue 4 Pages 339-352
Humans communicate emotions through both their faces and voices. These emotions are generally congruent, but they sometimes conflict across modalities. We investigated the perception of such emotionally congruent and conflicting expressions through the lens of two-dimensional emotion space (2DES). As experimental stimuli, we audio- and video-recorded a male actor's expressions portraying four emotions: happy, angry, sad, and relaxed. Participants were exposed to 16 combinations of faces and voices and rated the perceived degree of valence (positive–negative) and activity (high–low); the reaction time was recorded for each rating. Participants also chose which emotion the actor expressed. Results showed a perceptual inconsistency between valence and activity. For valence, participants perceived negative emotions when watching a negative face, even when it was paired with a positive voice. No such cross-modal perceptual bias was observed for activity ratings. These results suggest that the mechanism supporting human cross-modal processing differs as a function of the dimension of emotion. Future research directions are discussed in terms of our proposed model of the audio-visual perception of emotions.