Journal of Signal Processing
Online ISSN : 1880-1013
Print ISSN : 1342-6230
ISSN-L : 1342-6230
Robust Speech Recognition with Dynamic Time Warping and Nonlinear Median Filter
Yuxin Zhang, Yoshikazu Miyanaga, Constantin Siriteanu
JOURNAL FREE ACCESS

2012 Volume 16 Issue 2 Pages 147-157

Abstract
In this paper, we propose a robust automatic speech recognition (ASR) method that combines dynamic time warping (DTW) with a nonlinear median filter (NMF). Conventional DTW is fast and requires no training, but its recognition accuracy is limited: under all noisy conditions, it is lower than that of algorithms based on the hidden Markov model (HMM). To improve ASR accuracy, we first apply the short-time energy method to remove nonspeech segments and then apply a noise-reduction method. Finally, unlike conventional DTW algorithms, which search for the reference word with the minimum distance from the unknown speech waveform, we use an NMF and search for the reference word with the minimum median distance from the unknown speech waveform. We find that the NMF substantially improves the recognition accuracy of conventional DTW implementations. In the presence of 10 dB and 20 dB white noise, our approach yields DTW recognition accuracy similar to that of HMM techniques, while the proposed DTW with the NMF requires no complicated training.
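The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm: sequences are plain 1-D lists of numbers rather than real speech features, each reference word has several stored templates, the "median distance" is the median of the DTW distances from the unknown input to those templates, and the frame length and energy threshold are arbitrary. The noise-reduction step is omitted because the abstract does not specify it. All function names are hypothetical.

```python
from statistics import median


def short_time_energy_trim(samples, frame_len=4, threshold=0.01):
    """Drop frames whose mean energy is at or below `threshold`.

    Assumed rule for illustration; the paper's exact criterion is not
    given in the abstract.
    """
    kept = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        if energy > threshold:
            kept.extend(frame)
    return kept


def dtw_distance(a, b):
    """Classic DTW between two 1-D sequences, absolute-difference local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # step in a only
                cost[i][j - 1],      # step in b only
                cost[i - 1][j - 1],  # step in both
            )
    return cost[n][m]


def recognize(unknown, references, reduce=median):
    """Return (best_word, scores).

    `references` maps each word to a list of templates (an assumption).
    `reduce` collapses the per-template DTW distances into one score:
    `median` for the median-distance rule, `min` for the conventional rule.
    """
    scores = {
        word: reduce([dtw_distance(unknown, t) for t in templates])
        for word, templates in references.items()
    }
    return min(scores, key=scores.get), scores


# Toy data: "no" contains one template that happens to match the input
# exactly, which misleads the minimum-distance rule but not the median.
unknown = [0, 1, 2, 1, 0]
references = {
    "yes": [[0, 1, 3, 1, 0], [0, 1, 2, 1, 1], [0, 2, 2, 1, 0]],
    "no": [[0, 1, 2, 1, 0], [2, 1, 0, 1, 2], [3, 3, 0, 3, 3]],
}
median_word, median_scores = recognize(unknown, references)
min_word, min_scores = recognize(unknown, references, reduce=min)
```

On this toy data, the minimum-distance rule selects "no" because of the single coincidental template, whereas the median rule selects "yes", whose templates are all consistently close. This is only one plausible reading of why a median-based decision might be more robust to outliers; the paper itself should be consulted for the actual definition of the NMF.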
© 2012 Research Institute of Signal Processing, Japan