Journal of the Robotics Society of Japan
Online ISSN : 1884-7145
Print ISSN : 0289-1824
ISSN-L : 0289-1824
Speech Recognition During Robot Motion Using MFT
Yoshitaka Nishimura, Mitsuru Ishizuka, Kazuhiro Nakadai, Mikio Nakano, Hiroshi Tsujino

2007, Vol. 25, No. 8, pp. 1189-1198

Abstract
Automatic speech recognition (ASR) is essential for human-humanoid communication. One of the main problems with ASR on a humanoid is that the humanoid inevitably generates motor noises. These noises are easily captured by the humanoid's microphones because the noise sources are closer to the microphones than the target speech source, so the signal-to-noise ratio (SNR) of the input speech becomes quite low (sometimes less than 0 dB). However, these noises can be estimated by using information on the humanoid's motions and gestures. This paper proposes a method to improve ASR for a humanoid with motor noises by utilizing its motion/gesture information. The method consists of noise suppression and missing-feature-theory-based ASR (MFT-ASR). The proposed noise suppression technique is based on spectral subtraction, and white noise is added to blur the distortion introduced by the suppression. MFT-ASR improves recognition by masking unreliable acoustic features in the input sound, and the motion/gesture information is used to identify those unreliable features. Furthermore, we also evaluated the method in combination with an acoustic model adaptation technique, MLLR (Maximum Likelihood Linear Regression); unsupervised MLLR was used for the adaptation. We evaluated the proposed method through recognition of speech recorded using Honda ASIMO in a reverberant room. The noise data contained 34 kinds of noises: motor noises without motions, gesture noises, walking noises, and other kinds of noises. The experimental results show that the proposed method outperforms the conventional multi-condition training technique.
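The two processing stages named in the abstract — spectral subtraction with white-noise addition, and MFT-style masking of unreliable time-frequency bins — can be illustrated with a minimal NumPy sketch. This is an assumption-laden toy, not the paper's implementation: the function names, the floor level, the SNR threshold, and the use of a precomputed motor-noise power template (standing in for the motion/gesture-derived noise estimate) are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def spectral_subtract_with_floor(noisy_power, noise_power, alpha=1.0, floor=1e-3):
    """Subtract an estimated motor-noise power spectrum from the noisy
    power spectrum, then add a small random white-noise floor to blur the
    'musical noise' distortion that plain subtraction leaves behind.
    (alpha and floor are illustrative values, not from the paper.)"""
    cleaned = np.maximum(noisy_power - alpha * noise_power, 0.0)
    return cleaned + floor * rng.random(cleaned.shape)

def mft_mask(noisy_power, noise_power, snr_threshold_db=0.0):
    """Build a hard missing-feature mask: a bin is reliable (1.0) when its
    local SNR exceeds a threshold, unreliable (0.0) otherwise. In MFT-ASR
    the unreliable bins are down-weighted or skipped when computing acoustic
    likelihoods; the threshold criterion here is a simplification."""
    snr_db = 10.0 * np.log10((noisy_power + 1e-12) / (noise_power + 1e-12))
    return (snr_db > snr_threshold_db).astype(float)

# Toy example: two frequency bins, one dominated by speech, one by noise.
noisy = np.array([[4.0, 0.5]])   # observed power spectrum (1 frame x 2 bins)
noise = np.array([[1.0, 1.0]])   # motor-noise template for the current gesture
suppressed = spectral_subtract_with_floor(noisy, noise)
mask = mft_mask(noisy, noise)    # -> bin 0 reliable, bin 1 masked out
```

In the paper's setting, the noise template would be selected per motion/gesture from the humanoid's 34-entry noise database rather than assumed fixed as above.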
© The Robotics Society of Japan