Improvement in Bone-Conducted Speech Restoration Using Linear Prediction and Long Short-Term Memory Model

Huy Quoc Nguyen; Masashi Unoki

doi:10.2299/jsp.24.175

Abstract

Bone-conducted (BC) speech is a promising solution for speech communication in extremely noisy environments because it is robust against surrounding noise. However, the quality and intelligibility of BC speech are degraded, and restoring BC speech is difficult. To solve this problem, we propose a method for restoring BC speech that combines a linear prediction (LP) model using line spectral frequencies (LSFs) with a long short-term memory (LSTM) model. We evaluated the method using three objective measures: log-spectrum distortion, LP coefficient distance, and perceptual evaluation of speech quality (PESQ). On all three measures, our method outperformed the previous method, which used a simple recurrent network. The results also show that the model yields higher-quality speech when the LP gain is estimated more accurately.
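
As an illustration of the first objective measure named in the abstract, the following is a minimal sketch of one common definition of log-spectrum distortion: the root-mean-square difference between the log power spectra of a reference and a restored signal, averaged over frames. The abstract does not give the paper's exact formulation, so the function name, frame length, hop size, and window here are assumptions chosen for illustration only.

```python
import numpy as np


def log_spectral_distortion(ref, est, frame_len=512, hop=256, eps=1e-10):
    """Frame-averaged RMS difference (in dB) between two log power spectra.

    Hypothetical helper: frame_len, hop, and the Hann window are assumed
    values, not settings taken from the paper.
    """
    ref = np.asarray(ref, dtype=float)
    est = np.asarray(est, dtype=float)
    n = min(len(ref), len(est))  # compare only the overlapping part
    if n < frame_len:
        raise ValueError("signals must be at least one frame long")

    window = np.hanning(frame_len)
    distances = []
    for start in range(0, n - frame_len + 1, hop):
        frame_ref = ref[start:start + frame_len] * window
        frame_est = est[start:start + frame_len] * window
        # Power spectra of the two windowed frames
        p_ref = np.abs(np.fft.rfft(frame_ref)) ** 2
        p_est = np.abs(np.fft.rfft(frame_est)) ** 2
        # Difference of log power spectra in dB; eps avoids log(0)
        diff_db = 10.0 * np.log10(p_ref + eps) - 10.0 * np.log10(p_est + eps)
        distances.append(np.sqrt(np.mean(diff_db ** 2)))
    return float(np.mean(distances))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.standard_normal(16000)  # stand-in for a reference signal
    restored = clean + 0.1 * rng.standard_normal(16000)  # stand-in for a restored signal
    print(log_spectral_distortion(clean, restored))
```

A lower value means the restored spectrum is closer to the reference; identical signals give a distortion of zero.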
