Host: The Japanese Society for Artificial Intelligence
Name : The 38th Annual Conference of the Japanese Society for Artificial Intelligence
Number : 38
Location : [in Japanese]
Date : May 28, 2024 - May 31, 2024
Motivated by a deep affection for Toyama, this study focuses on speech recognition of the Toyama dialect. Although the dialect has a distinctive and appealing style, it can hinder communication with people from other regions. This study therefore aims to develop a system that recognizes Toyama-dialect speech and converts it into standard Japanese, facilitating communication with visitors from other areas. We employed wav2vec 2.0 as the speech recognition model and two GPT-2 models as the standard Japanese conversion model. We created a Toyama dialect corpus and improved its quality through meticulous transcription. During training, all speech data were smoothed using RMS (root mean square) and augmented through masking. In the experiments, we used character error rate (CER) and word error rate (WER) for automatic evaluation, while human evaluation focused on semantic equivalence and grammaticality. Empirical results show that our model outperformed the baseline, and the discussion confirms the effectiveness of our approach.
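For readers unfamiliar with the automatic metrics named in the abstract, the following is a minimal sketch of how CER and WER are commonly computed from an edit distance. It is not the authors' evaluation code. The function names (`edit_distance`, `cer`, `wer`) are hypothetical, and the sketch assumes that the text has already been segmented into space-separated words for WER, since Japanese is not written with spaces. The abstract does not say which tokenizer was used.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference items into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate; spaces are dropped (an assumption of this sketch)."""
    ref_chars = list(reference.replace(" ", ""))
    hyp_chars = list(hypothesis.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate over whitespace-separated tokens (assumed pre-segmented)."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


if __name__ == "__main__":
    # Illustrative sentences only; they are not taken from the Toyama corpus.
    reference = "今日 は 寒い"
    hypothesis = "今日 は 暑い"
    print(cer(reference, hypothesis))  # 1 wrong character out of 5
    print(wer(reference, hypothesis))  # 1 wrong word out of 3
```

Both metrics divide the edit distance by the length of the reference, so a value can exceed 1.0 when the hypothesis contains many insertions. Lower values indicate better recognition or conversion.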