IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
Improving the Readability of ASR Results for Lectures Using Multiple Hypotheses and Sentence-Level Knowledge
Yasuhisa FUJIIKazumasa YAMAMOTOSeiichi NAKAGAWA
著者情報
ジャーナル フリー

2012 年 E95.D 巻 4 号 p. 1101-1111

詳細
抄録

This paper presents a novel method for improving the readability of automatic speech recognition (ASR) results for classroom lectures. Because speech in a classroom is spontaneous and contains many ill-formed utterances with various disfluencies, the ASR result should be edited to improve the readability before presenting it to users, by applying some operations such as removing disfluencies, determining sentence boundaries, inserting punctuation marks and repairing dropped words. Owing to the presence of many kinds of domain-dependent words and casual styles, even state-of-the-art recognizers can only achieve a 30-50% word error rate for speech in classroom lectures. Therefore, a method for improving the readability of ASR results is needed to make it robust to recognition errors. We can use multiple hypotheses instead of the single-best hypothesis as a method to achieve a robust response to recognition errors. However, if the multiple hypotheses are represented by a lattice (or a confusion network), it is difficult to utilize sentence-level knowledge, such as chunking and dependency parsing, which are imperative for determining the discourse structure and therefore imperative for improving readability. In this paper, we propose a novel algorithm that infers clean, readable transcripts from spontaneous multiple hypotheses represented by a confusion network while integrating sentence-level knowledge. Automatic and manual evaluations showed that using multiple hypotheses and sentence-level knowledge is effective to improve the readability of ASR results, while preserving the understandability.

著者関連情報
© 2012 The Institute of Electronics, Information and Communication Engineers
前の記事 次の記事
feedback
Top