A new speech recognition system that uses the neurally inspired Learning Vector Quantization (LVQ) algorithm to train HMM codebooks is described. Both LVQ and HMMs are stochastic algorithms holding considerable promise for speech recognition. In particular, LVQ is a vector quantizer with very powerful classification ability. HMMs, on the other hand, have the advantage that phone models can easily be concatenated to produce longer utterance models, such as word or sentence models. The new algorithm described here combines the advantages inherent in each of these two algorithms. Instead of using a conventional, K-means-generated codebook in the HMMs, the new system uses LVQ to adapt the codebook reference vectors so as to minimize the number of errors these reference vectors make when used for nearest-neighbor classification of training vectors. The LVQ codebook can then provide the HMMs with high classification power at the phonemic level. Phoneme recognition experiments on a large-vocabulary database of 5,240 common Japanese words uttered in isolation by a male speaker confirmed that the high discriminant ability of LVQ can be integrated into an HMM architecture easily extendible to longer utterance models, such as word or sentence models.
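The codebook adaptation described above can be illustrated with a minimal sketch of the basic LVQ1 update rule: the reference vector nearest to a labeled training vector is pulled toward it when their class labels agree and pushed away when they disagree. The function name, the learning rate value, and the toy data below are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def lvq1_step(codebook, labels, x, x_label, alpha=0.03):
    """One LVQ1 update (illustrative sketch, not the paper's exact procedure).

    codebook : (K, D) array of reference vectors, modified in place
    labels   : length-K array of class labels, one per reference vector
    x        : (D,) training vector with class label x_label
    alpha    : learning rate (hypothetical value for illustration)
    """
    # Nearest-neighbor reference vector under Euclidean distance.
    i = int(np.argmin(np.linalg.norm(codebook - x, axis=1)))
    # Attract on a correct classification, repel on an error.
    sign = 1.0 if labels[i] == x_label else -1.0
    codebook[i] += sign * alpha * (x - codebook[i])
    return i

# Toy example: two reference vectors, one per class.
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
labels = np.array([0, 1])
winner = lvq1_step(codebook, labels, x=np.array([1.0, 1.0]), x_label=0)
```

In the toy example the nearest reference vector (index 0) carries the correct class, so it moves a small step toward the training vector; iterating such updates over a labeled training set is what tunes the codebook for nearest-neighbor classification.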