Abstract
We have developed a real-time speech visualization system called “KanNon” [1,2], which supports speech communication for hearing-impaired people. The KanNon system presents information about speech, such as loudness, pitch, the sound spectrogram, and characters produced by a speech recognition system, in real time. The present KanNon system adopts a word-unit speech recognition system with a large-scale dictionary. However, smooth communication requires the KanNon system to display speech contents quickly and simply. For this purpose, we applied a phonemic speech recognition system. We have also already proposed recognition methods for the five Japanese vowels (/a/, /i/, /u/, /e/, /o/) that apply a “Time-Delay Neural Network (TDNN)” [3] and statistical pattern recognition [4]. However, the correct recognition rate, shown in Tables 1 and 2, is about 85 percent, which is not very high. In this paper, therefore, to obtain better spectral features for phonemic recognition, we apply a novel spectral estimation method called the Burg-MCE method [5], which combines the Burg method and the Minimum Cross Entropy method. We apply human auditory properties to the power spectrum estimated by the Burg-MCE method and carry out phonemic recognition using statistical pattern recognition.
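As a point of reference for the spectral estimation step, the following is a minimal sketch of the classical Burg method only. It is not the Burg-MCE method of [5], and it omits the Minimum Cross Entropy step and the auditory weighting. The function names `burg_ar` and `burg_psd`, the parameter `n_fft`, and the use of NumPy are illustrative assumptions, not part of the KanNon system.

```python
import numpy as np


def burg_ar(x, order):
    """Estimate AR coefficients with the classical Burg recursion.

    Returns (a, e), where a[0] == 1 and the model is
    x[n] + a[1]*x[n-1] + ... + a[order]*x[n-order] = w[n],
    and e is the estimated variance of the driving noise w.
    Hypothetical helper: plain Burg, not Burg-MCE.
    """
    x = np.asarray(x, dtype=float)
    f = x[1:].copy()    # forward prediction errors f(n)
    b = x[:-1].copy()   # backward prediction errors b(n-1)
    a = np.array([1.0])
    e = np.dot(x, x) / len(x)  # zeroth-order error power
    for _ in range(order):
        # Reflection coefficient minimizing forward + backward error power
        k = -2.0 * np.dot(f, b) / (np.dot(f, f) + np.dot(b, b))
        # Levinson-style update of the AR polynomial
        a = np.concatenate([a, [0.0]])
        a = a + k * a[::-1]
        e *= 1.0 - k * k
        # Update the error sequences and shrink them by one sample
        f_new = f + k * b
        b_new = b + k * f
        f, b = f_new[1:], b_new[:-1]
    return a, e


def burg_psd(x, order, n_fft=512):
    """Power spectrum e / |A(e^{jw})|^2 on a grid of n_fft // 2 + 1 points.

    Frequencies are in cycles per sample (0 to 0.5).
    """
    a, e = burg_ar(x, order)
    A = np.fft.rfft(a, n_fft)  # zero-padded transform of the AR polynomial
    psd = e / np.abs(A) ** 2
    freqs = np.fft.rfftfreq(n_fft)
    return freqs, psd


if __name__ == "__main__":
    # Synthetic test signal (an assumption for illustration, not speech data)
    rng = np.random.default_rng(0)
    n = np.arange(256)
    x = np.sin(2 * np.pi * 0.1 * n) + 0.1 * rng.standard_normal(256)
    freqs, psd = burg_psd(x, order=4)
    print("peak frequency (cycles/sample):", freqs[np.argmax(psd)])
```

In a vowel recognizer, the spectrum returned by such a routine would be computed per analysis frame and passed on to the feature extraction stage; the model order and frame length here are arbitrary choices for the sketch.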