IEEJ Transactions on Electronics, Information and Systems (電気学会論文誌C)
Online ISSN : 1348-8155
Print ISSN : 0385-4221
ISSN-L : 0385-4221
<Speech and Image Processing, Recognition>
Experimental Evaluation of the Noise Robustness of Multi-Condition Word Models Extended with Non-Speech Segments
Noboru Hayasaka, Yoshikazu Miyanaga
Journal, free access

2012, Vol. 132, No. 10, pp. 1667-1674

Abstract
Voice activity detection (VAD) is an essential technique for developing a sophisticated voice interface. However, no VAD method with sufficient detection capability has yet been presented. In particular, it is difficult to accurately detect the beginning and end of a word in noisy environments. In this paper, we describe extended models with multi-condition training (extended MC-models) that handle such misdetection, and we evaluate their noise robustness through a large number of word recognition simulations. The simulations show that recognition performance with simple whole-word models degraded when the input speech signal was accompanied by non-speech segments, whereas the extended MC-models maintained their performance. Furthermore, with practical applications in mind, we carried out simulations combining the CENSREC-1-C baseline VAD with the extended MC-models. These results also showed the usefulness of the extended MC-models under signal-to-noise ratio conditions of 20 and 10 dB.
© 2012 The Institute of Electrical Engineers of Japan