Host: The Japan Society of Mechanical Engineers
Name : [in Japanese]
Date : June 05, 2019 - June 08, 2019
This paper presents a monaural speech enhancement method, based on a deep speech prior, for a hose-shaped rescue robot. Speech enhancement is crucial for helping the robot operator detect human voices, because audio signals captured by a microphone on the robot are contaminated by ego-noise. We have previously developed three enhancement methods: 1) a blind speech enhancement method called robust nonnegative matrix factorization (RNMF), 2) an extension of RNMF with a pre-trained noise model, and 3) another extension of RNMF with a deep speech prior, i.e., a pre-trained speech model based on deep learning. In this paper, we develop a new extension of RNMF that combines the pre-trained noise and speech models into a unified model, and we evaluate these methods on a hose-shaped rescue robot whose ego-noise consists of vibration-motor and air-jet noise. Experimental results show that the new method outperforms the three existing RNMF methods when the signal-to-noise ratio is +5 dB or lower.
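To make the blind RNMF baseline concrete, the following is a minimal sketch and not the authors' formulation. It assumes a generic variant that approximates a nonnegative magnitude spectrogram X as a low-rank term W H (intended to absorb repetitive ego-noise) plus a sparse nonnegative term S (intended to absorb speech), fitted with multiplicative updates under a squared-error cost with an L1 penalty on S. The function name `rnmf`, its parameters, and the choice of cost are all illustrative assumptions; the abstract does not specify them, and the pre-trained noise and speech models are not represented here.

```python
import numpy as np


def rnmf(X, rank=4, sparsity=0.5, n_iter=200, seed=0, eps=1e-12):
    """Illustrative RNMF: X ~= W @ H + S with W, H, S >= 0 and S sparse.

    Assumed objective (not taken from the paper):
        ||X - W @ H - S||_F^2 + sparsity * sum(S)
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n_freq, n_frames = X.shape

    # Random nonnegative initialisation of the low-rank factors.
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    # Start the sparse term from a flat nonnegative matrix.
    S = np.full(X.shape, X.mean() + eps)

    for _ in range(n_iter):
        # Each update multiplies by (negative gradient part) / (positive part),
        # which keeps every entry nonnegative.
        Y = W @ H + S
        W *= (X @ H.T) / (Y @ H.T + eps)

        Y = W @ H + S
        H *= (W.T @ X) / (W.T @ Y + eps)

        Y = W @ H + S
        # The L1 penalty adds sparsity / 2 to the denominator, shrinking S.
        S *= X / (Y + 0.5 * sparsity + eps)

    return W, H, S


if __name__ == "__main__":
    # Synthetic stand-in for a spectrogram: low-rank "noise" plus sparse "speech".
    rng = np.random.default_rng(1)
    noise = rng.random((64, 3)) @ rng.random((3, 100))
    speech = (rng.random((64, 100)) > 0.95) * 5.0
    W, H, S = rnmf(noise + speech, rank=3)
    print(W.shape, H.shape, S.shape)
```

In this sketch, a larger `sparsity` pushes more energy into the low-rank term and a smaller one lets S absorb more of the input. A real pipeline would apply it to a short-time Fourier transform magnitude and resynthesise the enhanced signal from S, which is outside the scope of this example.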