IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
An LVCSR Based Reading Miscue Detection System Using Knowledge of Reference and Error Patterns
Changliang LIUFuping PANFengpei GEBin DONGHongbin SUOYonghong YAN
Author information
JOURNAL FREE ACCESS

2009 Volume E92.D Issue 9 Pages 1716-1724

Details
Abstract
This paper describes a reading miscue detection system based on the conventional Large Vocabulary Continuous Speech Recognition (LVCSR) framework [1]. In order to incorporate the knowledge of reference (what the reader ought to read) and some error patterns into the decoding process, two methods are proposed: Dynamic Multiple Pronunciation Incorporation (DMPI) and Dynamic Interpolation of Language Model (DILM). DMPI dynamically adds some pronunciation variations into the search space to predict reading substitutions and insertions. To resolve the conflict between the coverage of error predications and the perplexity of the search space, only the pronunciation variants related to the reference are added. DILM dynamically interpolates the general language model based on the analysis of the reference and so keeps the active paths of decoding relatively near the reference. It makes the recognition more accurate, which further improves the detection performance. At the final stage of detection, an improved dynamic program (DP) is used to align the confusion network (CN) from speech recognition and the reference to generate the detecting result. The experimental results show that the proposed two methods can decrease the Equal Error Rate (EER) by 14% relatively, from 46.4% to 39.8%.
Content from these authors
© 2009 The Institute of Electronics, Information and Communication Engineers
Previous article Next article
feedback
Top