Journal of Information Processing
Online ISSN : 1882-6652
ISSN-L : 1882-6652
Viterbi Approximation of Latent Words Language Models for Automatic Speech Recognition
Ryo Masumura, Taichi Asami, Takanobu Oba, Hirokazu Masataki, Sumitaka Sakauchi
JOURNAL FREE ACCESS

2019 Volume 27 Pages 168-176

Abstract

This paper presents a Viterbi approximation of latent words language models (LWLMs) for automatic speech recognition (ASR). LWLMs are effective against data sparseness because of their soft-decision clustering structure and Bayesian modeling, so they perform robustly across multiple ASR tasks. Unfortunately, applying an LWLM to ASR is difficult because of its computational complexity. In our previous work, we implemented an n-gram approximation of the LWLM for ASR by sampling words according to a stochastic process and training word n-gram LMs on the sampled text. However, that approach cannot take into account the latent word sequence behind a recognition hypothesis. Our solution is a Viterbi approximation that simultaneously decodes both the recognition hypothesis and the latent word sequence. The Viterbi approximation is implemented as two-pass ASR decoding in which the latent word sequence is estimated from a decoded recognition hypothesis using Gibbs sampling. Experiments show the effectiveness of the Viterbi approximation in an n-best rescoring framework. In addition, we investigate the relationship between the n-gram approximation and the Viterbi approximation.
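
The second pass described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a bigram model over latent words, a latent vocabulary equal to the word vocabulary, and tiny hand-made probability tables. The names `toy_tables`, `joint_logprob`, `gibbs_best_latents`, `rescore_nbest` and `lm_weight` are hypothetical. For each n-best hypothesis, Gibbs sampling searches for a high-probability latent word sequence h, and the joint score log P(w, h) stands in for the full sum over all latent sequences.

```python
import math
import random


def toy_tables(vocab, bos="<s>"):
    """Build small, strictly positive tables (illustrative values only)."""
    n = len(vocab)
    # Emission P(w | h): a latent word mostly emits itself.
    emit = {h: {w: (0.7 if h == w else 0.3 / (n - 1)) for w in vocab}
            for h in vocab}
    # Transition P(h_k | h_{k-1}): uniform, to keep the toy model simple.
    trans = {p: {h: 1.0 / n for h in vocab} for p in [bos] + vocab}
    return trans, emit


def joint_logprob(words, latents, trans, emit, bos="<s>"):
    """log P(w, h) = sum_k [log P(h_k | h_{k-1}) + log P(w_k | h_k)]."""
    total, prev = 0.0, bos
    for w, h in zip(words, latents):
        total += math.log(trans[prev][h]) + math.log(emit[h][w])
        prev = h
    return total


def gibbs_best_latents(words, trans, emit, vocab, iters=50, seed=0, bos="<s>"):
    """Gibbs-sample latent words and keep the best-scoring sequence seen."""
    rng = random.Random(seed)
    latents = list(words)  # start with each latent word equal to the observed word
    best = list(latents)
    best_lp = joint_logprob(words, latents, trans, emit, bos)
    for _ in range(iters):
        for k, w in enumerate(words):
            prev = latents[k - 1] if k > 0 else bos
            nxt = latents[k + 1] if k + 1 < len(words) else None
            # Conditional of h_k given its neighbours and the observed word.
            weights = []
            for h in vocab:
                p = trans[prev][h] * emit[h][w]
                if nxt is not None:
                    p *= trans[h][nxt]
                weights.append(p)
            latents[k] = rng.choices(vocab, weights=weights)[0]
        lp = joint_logprob(words, latents, trans, emit, bos)
        if lp > best_lp:
            best, best_lp = list(latents), lp
    return best, best_lp


def rescore_nbest(nbest, trans, emit, vocab, lm_weight=1.0):
    """Add the approximate LWLM score to each first-pass score; return the best."""
    scored = []
    for words, first_pass_score in nbest:
        latents, lm_lp = gibbs_best_latents(words, trans, emit, vocab)
        scored.append((first_pass_score + lm_weight * lm_lp, words, latents))
    return max(scored, key=lambda item: item[0])


if __name__ == "__main__":
    vocab = ["recognize", "speech", "wreck", "beach"]
    trans, emit = toy_tables(vocab)
    # Each entry: (hypothesis words, first-pass log score); values are made up.
    nbest = [(["recognize", "speech"], -1.0), (["wreck", "beach"], -5.0)]
    score, words, latents = rescore_nbest(nbest, trans, emit, vocab)
    print(score, words, latents)
```

Keeping the best sample seen, rather than the last one, is one simple way to turn a sampler into an approximate maximizer; a real system would use trained LWLM parameters and longer latent-word contexts.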

© 2019 by the Information Processing Society of Japan