JSAI Technical Report, SIG-SLUD
Online ISSN : 2436-4576
Print ISSN : 0918-5682
97th (Feb, 2023)

Consideration of speech conversion methods using MelGAN-VC for second language learners
Mori KIYOTADA, Miyoshi YASUO
CONFERENCE PROCEEDINGS FREE ACCESS

Pages 01-04

Abstract

Many videos in various languages are posted on video-sharing websites such as YouTube, and watching them is a promising form of listening practice for second language learners. However, many of these videos were not produced as listening materials, and some speakers have distinctive accents or other characteristics that make them difficult for learners to understand. For this reason, learners often adjust the playback speed to one they find easy to follow. This research aims to provide an environment in which learners can adjust the accent of the speaker in a video to be closer to that of their mother tongue, making it easier to listen to and, in combination with speed adjustment, strengthening the scaffolding effect. We investigated generative adversarial networks (GANs) and other speech conversion methods for this purpose and conducted speech conversion experiments using MelGAN-VC. The results confirmed that it is difficult to suppress noise to a level that does not bother learners.

© 2023 The Japanese Society for Artificial Intelligence