Journal of the Acoustical Society of Japan (E)
Online ISSN : 2185-3509
Print ISSN : 0388-2861
ISSN-L : 0388-2861
Speaker normalized spectral subband parameters for noise robust speech recognition
Satoru Tsuge, Toshiaki Fukada, Harald Singer, Kuldip K. Paliwal
Journal, free access

1999, Volume 20, Issue 6, pp. 425-431


This paper proposes speaker normalized spectral subband centroids (SSCs) as supplementary features for speech recognition in noisy environments. SSCs are computed as the frequency centroid of each subband of the speech signal's power spectrum. These features can be obtained reliably even under noisy conditions because SSCs are determined mainly by spectral peaks such as formants, whose positions are almost unchanged in a noisy environment. Since conventional SSCs depend on the formant frequencies of the speaker, the distributions of SSCs computed from a large number of speakers overlap heavily between different phones. Therefore, we introduce a speaker normalization technique into the SSC computation to reduce speaker variability. Experimental results on spontaneous speech recognition show that speaker normalized SSCs are more useful than conventional SSCs as supplementary features for improving recognition performance. At SNR = 15 dB, we observed significant error rate improvements of 20.3% from adding speaker normalized SSCs to the conventional features, and of 14.3% from incorporating the speaker normalization technique into the conventional SSCs.
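
The abstract describes conventional SSCs only in words, so the following is a minimal sketch of a power-weighted frequency centroid per subband for a single frame. It is not the authors' implementation. The function name `spectral_subband_centroids`, the equal-width subbands, the Hamming window, the default of four bands, and the exponent `gamma` are all assumptions made for illustration. The speaker normalization step is not shown because the abstract does not specify it.

```python
import numpy as np


def spectral_subband_centroids(frame, sample_rate, num_bands=4, gamma=1.0):
    """Return one frequency centroid (in Hz) per subband for a single frame.

    Assumptions (not from the paper): equal-width subbands from 0 Hz to the
    Nyquist frequency, a Hamming window, and power raised to `gamma` as weight.
    """
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hamming(len(frame))

    # Power spectrum of the frame and the frequency of each FFT bin.
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    # Hypothetical band layout: num_bands equal-width bands up to Nyquist.
    edges = np.linspace(0.0, sample_rate / 2.0, num_bands + 1)

    centroids = np.empty(num_bands)
    for m in range(num_bands):
        lo, hi = edges[m], edges[m + 1]
        if m == num_bands - 1:
            mask = (freqs >= lo) & (freqs <= hi)  # last band includes Nyquist
        else:
            mask = (freqs >= lo) & (freqs < hi)
        weights = power[mask] ** gamma
        total = weights.sum()
        if total > 0.0:
            # Centroid: weighted mean of the bin frequencies in this band.
            centroids[m] = (freqs[mask] * weights).sum() / total
        else:
            # No energy in the band: fall back to the band midpoint.
            centroids[m] = 0.5 * (lo + hi)
    return centroids


# Usage: a 1500 Hz tone sampled at 8 kHz, one 256-sample frame.
sample_rate = 8000
t = np.arange(256) / sample_rate
tone = np.sin(2.0 * np.pi * 1500.0 * t)
centroids = spectral_subband_centroids(tone, sample_rate)
print(centroids)
```

Under these assumptions, the centroid of the subband that contains the tone sits close to the tone frequency, which illustrates why a feature driven by spectral peaks would shift little when broadband noise is added.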

© The Acoustical Society of Japan