Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
38th (2024)
Session ID : 1D3-GS-7-04

Detection of mispronunciations by speech recognition and mispronunciation candidates
*Arata SAITO, Takuya MATUZAKI
Keywords: AI, Speech recognition
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

We have developed a method for detecting reading errors (misreadings) in Japanese speech data. First, speech recognition transcribes the speech into a phoneme sequence, which is then checked for reading errors. To distinguish speech recognition errors from actual reading errors, we create a list of candidate misreadings for each morpheme. From the correct reading and these candidates, we select the one with the smallest edit distance to the speech recognition result, and report a reading error if the selected reading differs from the correct one. We conducted experiments on speech data from the LaboroTVspeech corpus and the Japanese Spoken Language Corpus, as well as on synthetic speech. The results confirmed that the method is effective when the speech actually contains reading errors, although in many cases it falsely detected a reading error even though the reading was correct. In particular, in the experiments with synthesized speech, the method accurately identified the misreading, including how the word was mispronounced, in 80.0% of cases, and detected 98.6% of the mispronounced morphemes.
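
The selection step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the phoneme notation, the tie-breaking rule (ties favor the correct reading), and the example word are all assumptions made for illustration. The sketch also assumes that the recognized phoneme sequence has already been aligned to a single morpheme, a step the abstract does not describe.

```python
from typing import List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def detect_misreading(
    recognized: Sequence[str],
    correct: Sequence[str],
    candidates: Sequence[Sequence[str]],
) -> Tuple[bool, List[str]]:
    """Pick the reading closest to the recognizer output.

    Returns (is_reading_error, selected_reading). Assumption: on a tie,
    the correct reading wins, so a candidate misreading must be strictly
    closer to be selected.
    """
    best = list(correct)
    best_dist = edit_distance(recognized, correct)
    for cand in candidates:
        d = edit_distance(recognized, cand)
        if d < best_dist:
            best, best_dist = list(cand), d
    return best != list(correct), best


# Hypothetical example: a morpheme read "tsukigime" with the candidate
# misreading "gekkyoku"; the recognizer output has dropped one "k".
correct = ["ts", "u", "k", "i", "g", "i", "m", "e"]
candidates = [["g", "e", "k", "k", "y", "o", "k", "u"]]
recognized = ["g", "e", "k", "y", "o", "k", "u"]

is_error, selected = detect_misreading(recognized, correct, candidates)
print(is_error, selected)
```

In this sketch, a recognizer output that is slightly off but still closest to the correct reading is treated as a recognition error rather than a reading error, which is the distinction the method aims to make.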

© 2024 The Japanese Society for Artificial Intelligence