Dynamic Time Warping for Speech Recognition with Training Part to Reduce the Computation

Xihao Sun; Yoshikazu Miyanaga; Baiko Sai

doi:10.2299/jsp.18.89

Abstract

In this paper, we propose a dynamic time warping (DTW) method with a training part. DTW is a popular automatic speech recognition (ASR) method based on template matching. Conventional DTW is fast and has low complexity, but its recognition accuracy is limited. Recently, a DTW algorithm with multiple references (mDTW) has been developed, which improves the recognition accuracy to a level comparable to that of the hidden Markov model (HMM) algorithm under noisy conditions. However, the mDTW algorithm increases the calculation cost. Therefore, to reduce the calculation cost, this paper adds a training part to the DTW-based ASR system. Unlike mDTW, the training part tries to find appropriate reference utterances to replace the growing number of reference utterances. The results show that the average recognition accuracy of the proposed method is similar to that of mDTW, while the calculation cost is reduced by 41.6%.
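As a rough illustration of the template matching described in the abstract, the following sketch computes a textbook DTW cost and picks the label of the closest reference utterance. It is not the authors' implementation: the function names `dtw_distance` and `recognize`, the Euclidean local distance, the unconstrained step pattern, and the toy one-dimensional features are all assumptions made for this example. It also counts DTW evaluations, which shows why keeping several references per word, as in mDTW, raises the calculation cost.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW cost between two sequences of feature vectors.

    Assumption: Euclidean local distance and the basic three-way step
    pattern; the paper may use different features and constraints.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats a frame
                cost[i][j - 1],      # b advances, a repeats a frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def recognize(utterance, references):
    """Return (best_label, number_of_dtw_evaluations).

    `references` is a hypothetical mapping: label -> list of templates.
    One template per label resembles conventional DTW; several
    templates per label resembles a multi-reference setup.
    """
    best_label, best_cost, calls = None, math.inf, 0
    for label, templates in references.items():
        for template in templates:
            calls += 1
            c = dtw_distance(utterance, template)
            if c < best_cost:
                best_label, best_cost = label, c
    return best_label, calls


# Toy one-dimensional "features" (assumed data, for illustration only).
rising = [(0.0,), (1.0,), (2.0,)]
rising_slow = [(0.0,), (0.0,), (1.0,), (2.0,)]
falling = [(2.0,), (1.0,), (0.0,)]

single = {"up": [rising], "down": [falling]}
multi = {"up": [rising, rising_slow], "down": [falling, falling]}

label_single, calls_single = recognize(rising_slow, single)
label_multi, calls_multi = recognize(rising_slow, multi)
```

In this toy setup the number of DTW evaluations grows linearly with the number of stored references, so any step that selects a smaller set of representative references reduces the matching cost in proportion. How the paper's training part actually selects those references is not specified in the abstract and is not modeled here.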
