Acoustical Science and Technology
Online ISSN : 1347-5177
Print ISSN : 1346-3969
ISSN-L : 0369-4232

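A final published version of this article is available. Please refer to the final published version.
When citing this article, please cite the final published version.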

Synthesis of everyday conversational speech based on fine-tuning with a corpus for speech synthesis
Hiroki Mori, Kota Furukawa
Journal, open access, advance online publication

Article ID: e24.35

Abstract

In this letter, we propose a separate modeling of prosodic and segmental features for everyday conversational speech synthesis, addressing challenges posed by low-quality recordings in the Corpus of Everyday Japanese Conversation (CEJC). Initially, the FastSpeech 2 model is trained on the conversation corpus and subsequently fine-tuned on a corpus for speech synthesis. Experimental results show that this fine-tuning approach enhances synthesis quality while preserving the nuances of everyday conversations.

© 2024 by The Acoustical Society of Japan

This article is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International license.
https://creativecommons.org/licenses/by-nd/4.0/