At present, extracting the acoustic features of voiceless stop consonants remains one of the most difficult unsolved problems in the field of automatic speech recognition. This paper describes a study of the characteristic features of the Japanese voiceless stop consonants /p/, /t/ and /k/. A multi-dimensional statistical analysis method is applied to their spectra. The analysis reveals that the principal characteristics discriminating between them are reflected in the accumulated power from about 1 kHz up to 5 kHz and in the existence of a spectral peak. It is also found that these feature parameters make it possible to separate the voiceless stop consonants from each other. Automatic recognition experiments based on the maximum likelihood method are also performed, on the condition that the following vowel is correctly known beforehand and that there are no errors in detecting the noise onset. A recognition rate as high as about 97% can be attained for the training data set of known utterances if a recognition algorithm designed to make the best use of the transition patterns of the feature parameters is adopted. It is also shown that accurate detection of the moment of the noise burst is essential to attaining a high recognition rate.
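As a rough illustration of the two ideas named above (accumulated power in the band from about 1 kHz to 5 kHz, and maximum likelihood classification), a minimal sketch follows. It is not the authors' algorithm. The function names, the one-sided power spectrum input, and the per-class Gaussian model with diagonal covariance are all assumptions made for illustration, since the abstract does not specify the feature computation or the likelihood model.

```python
import math


def band_power(power_spectrum, sample_rate_hz, lo_hz=1000.0, hi_hz=5000.0):
    """Sum the power of bins whose centre frequency lies in [lo_hz, hi_hz].

    Assumption: power_spectrum holds one-sided bin powers spaced evenly
    from 0 Hz to the Nyquist frequency, both ends included.
    """
    n_bins = len(power_spectrum)
    bin_width = (sample_rate_hz / 2.0) / (n_bins - 1)
    return sum(p for i, p in enumerate(power_spectrum)
               if lo_hz <= i * bin_width <= hi_hz)


def fit_gaussian(samples):
    """Estimate a per-dimension mean and variance for one consonant class.

    Assumption: features are modelled as independent Gaussians
    (diagonal covariance); the variance is floored to avoid division by zero.
    """
    n = len(samples)
    dims = len(samples[0])
    means = [sum(s[d] for s in samples) / n for d in range(dims)]
    variances = [
        max(sum((s[d] - means[d]) ** 2 for s in samples) / n, 1e-9)
        for d in range(dims)
    ]
    return means, variances


def log_likelihood(x, model):
    """Log-likelihood of feature vector x under a diagonal Gaussian model."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, means, variances)
    )


def classify(x, models):
    """Return the class label whose model gives x the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(x, models[label]))
```

In use, one model would be fitted per consonant from labelled feature vectors (for example, band power and a spectral-peak measure) and passed to `classify` as a dictionary keyed by `"p"`, `"t"` and `"k"`. A recognizer that exploits the transition patterns of the features, as the abstract describes, would need a sequence of such vectors over time rather than a single one.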