The Journal of the Acoustical Society of Japan (日本音響学会誌)
Online ISSN : 2432-2040
Print ISSN : 0369-4232
An Evaluation Method for Optimal Parameters in a Monosyllabic Speech Recognition System (単音節音声認識装置における最適パラメータの一評価法)
似鳥 寧信, 伊福部 達

1984, Volume 40, Issue 2, pp. 63-70

Abstract

A statistical method for evaluating the monosyllabic voice identification rate is developed in order to investigate the characteristics of consonant identification and to find the optimal parameters for our voice recognition system. In our system, every monosyllable is converted into a time-spectral pattern with 16 components in frequency and 16 components in time after envelope matching, consonant extraction, time smoothing, and level normalization. Each input pattern is compared, by minimum squared distance classification, with the standard patterns of monosyllables whose vowel is the same as that of the input. Every input pattern is represented as a matrix X, an element of the population G_i (i≤15) of a monosyllable, and each population is assumed to follow a multidimensional normal distribution. The error rate is estimated from the probability P_b(j/i) that X in G_i falls in the space of G_j, beyond the discriminating plane between the two populations G_i and G_j. In experiments with 30x15 monosyllables with the vowel /a/ pronounced by a 25-year-old male speaker, three results are obtained. First, the characteristics of the estimated error rate agree with those of the experimental data. Second, our envelope matching method proves to have almost the same effect as shift matching. Third, by calculating the error rates for various parameter settings, the optimal values and methods are obtained for the extracted consonant segment, the window length of the time smoothing, and the level normalization method. Furthermore, the unvoiced plosives /k/, /t/, and /p/ are found to have optimal parameters different from those of the other consonants in these evaluations, so a different pre-processing method will be needed for the unvoiced plosives in order to increase the identification rate.
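The two statistical steps in the abstract, minimum squared distance classification and the pairwise error probability P_b(j/i), can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on assumptions the abstract does not confirm: each 16x16 pattern is flattened to a vector, each standard pattern is taken to be the class mean, and the discriminating plane between G_i and G_j is taken to be the perpendicular bisector of the two means. Under those assumptions, with w = m_j - m_i and X in G_i distributed as N(m_i, Σ_i), the error probability is Q(|w|² / (2·sqrt(wᵀ Σ_i w))), where Q is the standard normal upper-tail probability. The labels "ka" and "ta", the two-component patterns, and the identity covariance are hypothetical toy values, not data from the paper.

```python
import math


def squared_distance(x, m):
    """Squared Euclidean distance between two flattened patterns."""
    return sum((a - b) ** 2 for a, b in zip(x, m))


def classify(x, standards):
    """Return the label of the standard pattern closest to x."""
    return min(standards, key=lambda label: squared_distance(x, standards[label]))


def pairwise_error(mean_i, mean_j, cov_i):
    """Estimate P_b(j/i) for X ~ N(mean_i, cov_i).

    Assumption: the discriminating plane is the perpendicular bisector
    of mean_i and mean_j, as implied by minimum-distance classification.
    """
    w = [b - a for a, b in zip(mean_i, mean_j)]  # direction from G_i to G_j
    w_norm_sq = sum(c * c for c in w)
    n = len(w)
    # Variance of the projection w^T X, i.e. w^T cov_i w.
    var = sum(w[r] * cov_i[r][c] * w[c] for r in range(n) for c in range(n))
    if var <= 0.0:
        raise ValueError("projected variance must be positive")
    # X crosses the plane when w.(X - mean_i) exceeds |w|^2 / 2.
    z = w_norm_sq / (2.0 * math.sqrt(var))
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # upper tail Q(z)


# Hypothetical toy example: two classes, two-component patterns.
standards = {"ka": [0.0, 0.0], "ta": [2.0, 0.0]}
identity = [[1.0, 0.0], [0.0, 1.0]]

label = classify([0.4, 0.1], standards)
p_ta_given_ka = pairwise_error(standards["ka"], standards["ta"], identity)
print(label, round(p_ta_given_ka, 4))  # prints: ka 0.1587
```

In this toy case the two means are two standard deviations apart, so the bisecting plane lies one standard deviation from the mean of the first class and the estimate reduces to Q(1). Repeating the calculation for every pair (i, j) under each candidate pre-processing setting would give one way of comparing parameter choices by estimated error rate, which is the kind of evaluation the abstract describes.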

© 1984 Acoustical Society of Japan (一般社団法人 日本音響学会)