Abstract
Many Japanese people tend to feel embarrassed when talking to agents such as virtual assistants. This problem seems to be caused by the agents’ low social presence, that is, the degree to which one perceives human-like properties in an agent. We assumed that agents’ poor emotional expression may impair their humanness. First, this study verified that adding facial expressions to flat synthetic speech could convey an agent’s emotion even when the agent’s speech could be interpreted in two ways, as either a positive or a negative episode. As a result, the human-likeness of the agent tended to improve. This study also found that music can act as a form of emotional expression for an agent. Adding background music (BGM) and sound effects (SE) to a flat synthetic voice conveyed the agent’s emotion and made the agent seem more human and easier to talk to. Furthermore, BGM and SE produced these effects even when added to emotional synthetic speech.