A novel spotting-based approach to continuous speech recognition: Minimum error classification of keyword-sequences

Takashi Komori; Shigeru Katagiri

doi:10.1250/ast.16.147

Abstract

To address the lack of a theoretical basis for the fundamental word-spotting-based approach to recognizing natural, spontaneous speech utterances, we propose a novel spotter (spotting system) design method called Minimum Error Classification of Keyword-sequences (MECK). The key concept of the method is to formalize the entire spotting process as a trainable functional form whose design objective is the classification accuracy of keyword-sequences (strings of prescribed keyword categories). The resulting MECK procedure lets one design spotters efficiently, using only pairs of utterances and their corresponding phonemic transcriptions (no hand-segmented labels are required), and in a mathematically proven way that is consistent with minimizing the keyword-sequence classification error. MECK is quite general and can be applied to any reasonable spotter structure. The paper specifically presents implementation details for a prototype-based spotter and demonstrates the utility of this MECK-trained spotter in several Japanese keyword spotting tasks.
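As a rough illustration of what a trainable functional form with a classification-error objective can look like, the sketch below computes a smoothed error loss for a prototype-based classifier. It is not the MECK formulation from the paper: it assumes fixed-length feature vectors instead of keyword-sequences, uses the single best rival class in the misclassification measure, and the names `discriminant`, `mce_loss`, and `alpha` are hypothetical.

```python
import math


def discriminant(x, prototypes):
    """Score of one class: negative squared distance to its closest prototype."""
    return -min(sum((a - b) ** 2 for a, b in zip(x, p)) for p in prototypes)


def mce_loss(x, label, class_prototypes, alpha=1.0):
    """Smoothed 0-1 loss: a sigmoid applied to a misclassification measure.

    Assumption: the measure uses only the best rival class, a simplification
    chosen for this sketch.
    """
    scores = {c: discriminant(x, ps) for c, ps in class_prototypes.items()}
    best_rival = max(s for c, s in scores.items() if c != label)
    d = best_rival - scores[label]  # d > 0 means x would be misclassified
    return 1.0 / (1.0 + math.exp(-alpha * d))


# Toy data (made up for illustration): two classes, one prototype each.
protos = {"a": [[0.0, 0.0]], "b": [[2.0, 2.0]]}
loss_correct = mce_loss([0.1, 0.0], "a", protos)  # sample lies near class "a"
loss_wrong = mce_loss([0.1, 0.0], "b", protos)    # same sample, wrong label
```

Because the loss is a differentiable function of the prototypes, it could in principle be reduced by gradient descent, which is the sense in which a classification-error objective becomes trainable.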

Successor journal

Acoustical Science and Technology
