Journal of Signal Processing
Online ISSN : 1880-1013
Print ISSN : 1342-6230
ISSN-L : 1342-6230
Robust Speech Recognition with Dynamic Time Warping and Nonlinear Median Filter
Yuxin Zhang, Yoshikazu Miyanaga, Constantin Siriteanu
Journal, free access

2012, Volume 16, Issue 2, pp. 147-157

Abstract
We propose a robust automatic speech recognition (ASR) method that combines dynamic time warping (DTW) with a nonlinear median filter (NMF). Conventional DTW is fast and requires no training, but its recognition accuracy is limited: under all noisy conditions it is lower than that of algorithms based on the hidden Markov model (HMM). To improve ASR accuracy, we first use the short-time energy method to remove nonspeech segments and then apply a noise-reduction method. Finally, whereas conventional DTW algorithms search for the reference word with the minimum distance from the unknown speech waveform, we use an NMF and search for the reference word with the minimum median distance from the unknown speech waveform. We find that the NMF substantially improves the recognition accuracy of conventional DTW implementations. In the presence of 10 dB and 20 dB white noise, our approach yields DTW recognition accuracy similar to that of the HMM techniques, while requiring no complicated training.
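The pipeline described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not confirm: each word is assumed to have several reference templates, and the "minimum median distance" rule is read as taking the median of the DTW distances to those templates; the features are plain one-dimensional sequences rather than real acoustic feature vectors; the noise-reduction step is omitted. All function names, the frame length, and the energy threshold are hypothetical choices made for illustration.

```python
import math
from statistics import median


def frame_energy(signal, frame_len):
    """Short-time energy of consecutive non-overlapping frames."""
    return [sum(x * x for x in signal[i:i + frame_len])
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def trim_nonspeech(signal, frame_len=4, ratio=0.1):
    """Keep the span from the first to the last frame whose energy
    exceeds ratio * (maximum frame energy). The threshold is an assumption."""
    energy = frame_energy(signal, frame_len)
    if not energy or max(energy) == 0:
        return list(signal)
    threshold = ratio * max(energy)
    active = [i for i, e in enumerate(energy) if e > threshold]
    return list(signal[active[0] * frame_len:(active[-1] + 1) * frame_len])


def dtw_distance(a, b):
    """Classic DTW between two 1-D sequences, absolute difference as local cost."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(unknown, references, aggregate=median):
    """Return (best_word, scores). `references` maps each word to a list of
    templates; `aggregate` collapses the per-template DTW distances into one
    score per word (median here, min for the conventional rule)."""
    scores = {word: aggregate([dtw_distance(unknown, t) for t in templates])
              for word, templates in references.items()}
    return min(scores, key=scores.get), scores


# Toy data: "no" contains one outlier template that happens to match the input.
unknown = [0, 2, 4, 2, 0]
references = {
    "yes": [[0, 2, 5, 2, 0], [0, 3, 4, 2, 0], [0, 2, 4, 3, 0]],
    "no": [[0, 2, 4, 2, 0], [4, 0, 4, 0, 4], [4, 4, 0, 4, 4]],
}

word_min, _ = recognize(unknown, references, aggregate=min)
word_median, _ = recognize(unknown, references)
print("minimum-distance rule:", word_min)
print("minimum-median rule:  ", word_median)
```

In this toy data a single outlier template decides the minimum-distance rule, whereas the median over a word's templates is not affected by one unusually close match. Whether this is the mechanism behind the reported gains is not stated in the abstract.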
© 2012 Research Institute of Signal Processing, Japan