The Journal of The Institute of Image Information and Television Engineers
Online ISSN : 1881-6908
Print ISSN : 1342-6907
ISSN-L : 1342-6907
Intelligible High-speed Playback Technology Using the Acoustic Features of Speech Prosody
Atsushi Imai, Naoyuki Tazawa, Yukio Iwahana, Tohru Takagi, Nobumasa Seiyama, Toshiaki Tanaka, Tohru Ifukube
Keywords: DAISY
JOURNAL FREE ACCESS

2012 Volume 66 Issue 7 Pages J214-J220

Abstract
We have developed a speech rate conversion technology that keeps high-speed playback intelligible by using the acoustic features that contribute to prosody. The conventional method plays back accelerated speech at a uniform rate from beginning to end. In contrast, our proposed approach varies the playback rate adaptively, based on acoustic detection of the position of an utterance and of fluctuations in the speaker's fundamental frequency (F0) and power. In doing so, we aim to make high-speed playback easier to listen to by giving the listener the impression of "slowed-down" playback. Because this approach converts speech rate using only the acoustic features of the audio data, it can be applied not only to Japanese but to other languages as well. Although the algorithm developed in this study is optimized for Japanese, we aim to implement the proposed approach in a wider range of commercial devices and to adapt the technology to various languages.
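The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of adaptive, rather than uniform, rate conversion. It assumes per-frame F0 and power values are already available, and it slows down frames where both are relatively high while compressing low-power frames the most. The function name `adaptive_rates`, the parameters `emphasis` and `silence_power`, and the equal weighting of F0 and power are all hypothetical choices made for illustration; the utterance-position cue mentioned in the abstract is not modelled here.

```python
from typing import List


def adaptive_rates(f0: List[float], power: List[float],
                   target_rate: float = 2.0,
                   emphasis: float = 0.5,
                   silence_power: float = 1e-3) -> List[float]:
    """Return one playback-rate multiplier per frame.

    Hypothetical heuristic (not the paper's algorithm): frames with
    relatively high F0 and power are played more slowly, low-power
    frames are compressed the most, and the result is scaled so the
    overall speed-up equals target_rate.
    """
    if len(f0) != len(power):
        raise ValueError("f0 and power must have the same length")
    if not 0.0 <= emphasis < 1.0:
        raise ValueError("emphasis must be in [0, 1)")
    n = len(f0)
    if n == 0:
        return []

    # Normalisers; fall back to 1.0 when the track is all zeros.
    max_f0 = max(f0) or 1.0
    max_pw = max(power) or 1.0

    # weights[i] is the relative output duration of frame i
    # (a larger weight means the frame is played more slowly).
    weights = []
    for f, p in zip(f0, power):
        if p < silence_power:
            w = 1.0 - emphasis  # treat as silence: shorten the most
        else:
            # Salience in [0, 1], assuming non-negative F0 and power.
            salience = 0.5 * (f / max_f0) + 0.5 * (p / max_pw)
            w = 1.0 + emphasis * (2.0 * salience - 1.0)
        weights.append(w)

    # Scale so the total output duration is n / target_rate frames.
    scale = (n / target_rate) / sum(weights)
    return [1.0 / (w * scale) for w in weights]


# Example: five frames, silent at both ends, with F0 and power falling
# across the voiced part. The overall speed-up stays at 2x.
rates = adaptive_rates([0, 220, 180, 120, 0], [0, 1.0, 0.6, 0.3, 0])
```

In this sketch the frame with the highest F0 and power receives the lowest rate, and the silent frames receive the highest, while the sum of the per-frame output durations still matches a uniform 2x speed-up. A real system would also need a time-scale modification step to apply these rates without changing pitch, which is outside the scope of this example.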
© 2012 The Institute of Image Information and Television Engineers