Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
38th (2024)
Session ID : 4Xin2-34

Standardization for Absorbing Variations in Pause Duration Distribution in Pause Duration Estimation for Reading-Style Speech Synthesis
*Shunji TAKESHITA, Takuya MATSUZAKI
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

In storytelling speech, the distribution of pause durations varies with the text, the reader, and whether a passage is dialogue (spoken lines) or not. In this study, we attempt to absorb these differences by standardizing the pause durations in the training data used to learn to predict pause positions and pause durations from the text to be read aloud. Among the standardization methods we compared, standardization within each audiobook was the most effective.
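
As an illustration of what standardization within each audiobook could look like, the following is a minimal sketch, not the authors' implementation. It assumes that "standardization" means z-scoring (subtracting the mean and dividing by the standard deviation), which the abstract does not state explicitly. The function names `standardize_per_group` and `destandardize`, the record layout, and the audiobook IDs are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_per_group(records):
    """Z-score pause durations separately within each group.

    records: list of (group_id, duration_seconds) pairs, where group_id
    is a hypothetical audiobook identifier.
    Returns (standardized_records, stats), where stats maps each
    group_id to its (mean, std) so predictions can be mapped back.
    """
    by_group = defaultdict(list)
    for group_id, duration in records:
        by_group[group_id].append(duration)

    stats = {}
    for group_id, durations in by_group.items():
        mu = mean(durations)
        sigma = pstdev(durations)  # population standard deviation
        # Assumption: fall back to 1.0 to avoid division by zero when
        # all pauses in a group have the same duration.
        stats[group_id] = (mu, sigma if sigma > 0 else 1.0)

    standardized = [
        (group_id, (duration - stats[group_id][0]) / stats[group_id][1])
        for group_id, duration in records
    ]
    return standardized, stats


def destandardize(value, group_id, stats):
    """Map a standardized value back to seconds for the given group."""
    mu, sigma = stats[group_id]
    return value * sigma + mu


if __name__ == "__main__":
    # Toy data: book_b has longer pauses overall than book_a.
    pauses = [
        ("book_a", 0.2), ("book_a", 0.4),
        ("book_b", 1.0), ("book_b", 3.0),
    ]
    standardized, stats = standardize_per_group(pauses)
    for group_id, z in standardized:
        print(group_id, round(z, 3))
```

With this kind of grouping, a short pause in a slowly read audiobook and a short pause in a quickly read one map to similar target values, which is one way the per-audiobook variation described above could be absorbed. Grouping by reader or by dialogue versus non-dialogue would only change what is passed as `group_id`.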

© 2024 The Japanese Society for Artificial Intelligence