International Journal of Affective Engineering
Online ISSN : 2187-5413
ISSN-L : 2187-5413
Original Articles
Proposal of a Japanese-speech-synthesis Method with Dimensional Representation of Emotions based on Prosody as well as Voice-quality Conversion
Shoichi TAKEDA, Yoshiki KABUTA, Tomohiro INOUE, Masashi HATOKO

2013 Volume 12 Issue 2 Pages 79-88

Abstract

This paper proposes a Japanese speech-synthesis system capable of expressing variable degrees of emotion through both prosody conversion and voice-quality conversion. Among voice-quality features, we find that spectral tilt depends on the type and degree of emotion. To date, we have introduced a spectral-tilt conversion rule into our speech-synthesis system. Our previous analyses showed that the spectral-tilt quantities increased as the degrees of “anger”, “joy”, and “crying-type (hot) sadness” increased, whereas they decreased as the degree of “dispirited-and-whispering-type (cold) sadness” increased. We formulate a transfer function that converts the spectral-tilt quantities of “neutral” speech into those of emotional speech at various degrees. The prosody-conversion rules are likewise determined from our previous findings. In informal listening, synthetic-speech samples converted by the proposed method sound similar to natural emotional speech, and the differences between degrees of emotion are recognizable.

© 2013 Japan Society of Kansei Engineering