1998, Volume 13, Issue 4, pp. 619-630
In this paper, we propose a method for automatically acquiring speech processing units that are suited to automatic processing, rather than relying on predefined units such as syllables or phonemes. Although the units are acquired automatically, we take into account how humans acquire the concept of a speech unit, and therefore add the restriction that "utterances of the same word are represented by the same unit sequence". We first acquired speech units by template matching and applied them to speech recognition, and found that the variation of speech patterns needs to be taken into account. We therefore modelled the speech patterns corresponding to the acquired units with HMMs; training HMMs on these units gave higher recognition rates than the original template units. Given this result, we also investigated a method that acquires units based on ergodic HMMs from the initial step. In word recognition experiments, these units achieved a high recognition rate of 99.5% for 216 words when the above restriction was applied at both the unit acquisition step and the word dictionary registration step (with units of phoneme-like duration and 64 units). Finally, we compared the acquired units with phonetic units and found that the acquired units perform better in spoken word recognition.
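As a rough illustration of the template-matching step, the sketch below assigns a speech segment to the closest unit template. It assumes dynamic time warping (DTW) with a Euclidean frame distance as the matching procedure; the abstract does not state which procedure was used, and the names `dtw_distance`, `nearest_unit`, and the toy templates are hypothetical.

```python
import math


def dtw_distance(a, b):
    """DTW distance between two sequences of feature vectors.

    Assumption: DTW with Euclidean frame distance stands in for the
    template-matching procedure, which the abstract does not specify.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b's frame
                                 cost[i][j - 1],      # stretch a's frame
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def nearest_unit(segment, templates):
    """Return the label of the template closest to `segment` under DTW."""
    return min(templates, key=lambda label: dtw_distance(segment, templates[label]))


# Hypothetical one-dimensional templates for two units.
templates = {
    "u0": [[0.0], [0.0], [1.0]],
    "u1": [[5.0], [5.0]],
}
label = nearest_unit([[0.0], [1.0], [1.0]], templates)
```

A fixed template of this kind cannot absorb much variation between utterances, which is the limitation that motivates modelling each unit with an HMM.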