Journal of Information Processing
Online ISSN: 1882-6652
ISSN-L: 1882-6652
Viterbi Approximation of Latent Words Language Models for Automatic Speech Recognition
Ryo Masumura, Taichi Asami, Takanobu Oba, Hirokazu Masataki, Sumitaka Sakauchi

2019, Vol. 27, pp. 168-176

Abstract

This paper presents a Viterbi approximation of latent words language models (LWLMs) for automatic speech recognition (ASR). LWLMs are effective against data sparseness because of their soft-decision clustering structure and Bayesian modeling, so they can perform robustly across multiple ASR tasks. Unfortunately, applying an LWLM to ASR is difficult because of its computational complexity. In our previous work, we implemented an n-gram approximation of LWLM for ASR by sampling words according to a stochastic process and training word n-gram LMs on the sampled data. However, that approach cannot take into account the latent word sequence behind a recognition hypothesis. Our solution is the Viterbi approximation, which simultaneously decodes both the recognition hypothesis and its latent word sequence. The Viterbi approximation is implemented as two-pass ASR decoding in which the latent word sequence is estimated from a decoded recognition hypothesis using Gibbs sampling. Experiments show the effectiveness of the Viterbi approximation in an n-best rescoring framework. In addition, we investigate the relationship between the n-gram approximation and the Viterbi approximation.
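The two-pass idea described in the abstract can be illustrated in code. The following is a minimal sketch, not the authors' implementation: the model interfaces `transition_prob(word, prev_latent)` and `emission_prob(observed, latent)` are hypothetical placeholders standing in for the LWLM's latent-word transition and emission distributions, and the n-best list is assumed to arrive as (word sequence, acoustic score) pairs from a first-pass decoder. Each hypothesis receives a latent word sequence via Gibbs sampling, and the joint latent/observed score is then used for rescoring.

```python
import math
import random

def gibbs_sample_latent_words(words, vocab, transition_prob, emission_prob,
                              n_iters=50, rng=random):
    """Estimate a latent word sequence h for an observed hypothesis `words`.

    Each observed word w_t is assumed to be emitted from a latent word h_t,
    and latent words follow a bigram-style transition model. One Gibbs sweep
    resamples each h_t in proportion to
    P(h_t | h_{t-1}) * P(h_{t+1} | h_t) * P(w_t | h_t).
    """
    h = list(words)  # initialize latent words with the observed words
    T = len(words)
    for _ in range(n_iters):
        for t in range(T):
            prev_h = h[t - 1] if t > 0 else "<s>"
            next_h = h[t + 1] if t + 1 < T else "</s>"
            # Unnormalized conditional for every candidate latent word.
            weights = [transition_prob(cand, prev_h)
                       * transition_prob(next_h, cand)
                       * emission_prob(words[t], cand)
                       for cand in vocab]
            total = sum(weights)
            if total <= 0.0:
                continue  # keep current assignment if all mass vanished
            # Draw one candidate from the categorical distribution.
            r = rng.random() * total
            acc = 0.0
            for cand, w in zip(vocab, weights):
                acc += w
                if r <= acc:
                    h[t] = cand
                    break
    return h

def rescore_nbest(nbest, vocab, transition_prob, emission_prob, lm_weight=10.0):
    """Rerank first-pass hypotheses: decode a latent word sequence per
    hypothesis, score the joint sequence under the (placeholder) LWLM,
    and combine with the acoustic score."""
    rescored = []
    for words, am_score in nbest:
        h = gibbs_sample_latent_words(words, vocab,
                                      transition_prob, emission_prob)
        lm_score, prev = 0.0, "<s>"
        for w, lat in zip(words, h):
            lm_score += math.log(transition_prob(lat, prev) + 1e-30)
            lm_score += math.log(emission_prob(w, lat) + 1e-30)
            prev = lat
        rescored.append((am_score + lm_weight * lm_score, words, h))
    rescored.sort(key=lambda x: x[0], reverse=True)
    return rescored
```

In this sketch a single Gibbs chain per hypothesis stands in for the joint decoding of hypothesis and latent sequence; in practice one could average the LWLM score over several retained samples rather than keeping only the final state, at proportionally higher cost.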

© 2019 by the Information Processing Society of Japan