Generating Emotional Speech from Facial Expressions

榎木 淳; 鈴木 隆志

doi:10.11517/pjsai.JSAI2022.0_2O1GS704

36th (2022)

Session ID : 2O1-GS-7-04

Conference information

Host: The Japanese Society for Artificial Intelligence

Name : The 36th Annual Conference of the Japanese Society for Artificial Intelligence

Number : 36

Date : June 14, 2022 - June 17, 2022

Emotionally Conditioned TTS with Facial Expression

*Jun ENOKI, Takayuki SUZUKI

Keywords: text to speech, Arts and entertainment applications

CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

Speech generated by modern text-to-speech (TTS) systems can be indistinguishable from real human speech, and emotionally conditioned TTS has also been widely studied. Here we explore another way to control emotion in speech synthesis: incorporating facial expression data to make the conditioning intuitive. In this paper, we share our experimental results and discuss the details.
