Transactions of the Japanese Society for Artificial Intelligence
Online ISSN : 1346-8030
Print ISSN : 1346-0714
ISSN-L : 1346-0714
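Original Paper
DNN-Based Speech Synthesis Capable of Representing Dialogue-Act Information and Its Evaluation with Respect to Illocutionary Act Naturalness
Nobukatsu Hojo, Yusuke Ijima, Hiroaki Sugiyama, Noboru Miyazaki, Takahito Kawanishi, Kunio Kashino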
Journal, free access

2020, Volume 35, Issue 2, pp. A-J81_1-17

Abstract

This paper aims to improve the naturalness of synthesized speech generated by a text-to-speech (TTS) system within a spoken dialogue system with respect to “how natural the system’s intention is perceived via the synthesized speech”. We call this measure “illocutionary act naturalness”. To achieve this aim, we propose to utilize dialogue-act (DA) information as an auxiliary feature for a deep neural network (DNN)-based speech synthesis system. First, we construct a speech database with DA tags. Second, we build the proposed DNN-based speech synthesis system based on the database. Then, we evaluate the proposed method by comparing its performance with two conventional hidden Markov model (HMM)-based speech synthesis systems, namely, the style-mixed modeling method and the style adaptation method. The objective evaluation results show that the proposed method outperforms the style-mixed modeling method in the accuracy of reproducing the global prosodic characteristics of dialogue acts. They also reveal that the proposed method outperforms the style adaptation method in the accuracy of reproducing the sentence-final tone characteristics of dialogue acts. The subjective evaluation results also show that the proposed method improves the illocutionary act naturalness compared with the two conventional methods.
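
The abstract does not say how the DA information is encoded or where it enters the network. As a minimal sketch, assuming the DA tag is encoded as a one-hot vector and appended to each frame's linguistic input features (a common way to supply auxiliary features to a DNN acoustic model), the input construction might look as follows. The tag inventory `DA_TAGS`, the function names, and the feature values are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

# Hypothetical dialogue-act inventory; the paper's actual tag set is not given here.
DA_TAGS = ["greeting", "question", "answer", "apology"]


def one_hot_da(tag: str, tags: Sequence[str] = DA_TAGS) -> List[float]:
    """Return a one-hot vector marking the position of `tag` in `tags`."""
    if tag not in tags:
        raise ValueError(f"unknown dialogue-act tag: {tag!r}")
    return [1.0 if t == tag else 0.0 for t in tags]


def add_da_feature(frames: Sequence[Sequence[float]], tag: str) -> List[List[float]]:
    """Append the utterance-level DA one-hot vector to every frame's features."""
    da = one_hot_da(tag)
    return [list(frame) + da for frame in frames]


# Two frames of made-up linguistic features for one utterance tagged "question".
frames = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
augmented = add_da_feature(frames, "question")
```

Each augmented frame would then be fed to the acoustic model in place of the plain linguistic features, so the same text can be rendered with different prosody depending on the tag.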

© The Japanese Society for Artificial Intelligence 2020