Joint Optimization of Perceptual Gain Function and Deep Neural Networks for Single-Channel Speech Enhancement

Wei HAN; Xiongwei ZHANG; Gang MIN; Xingyu ZHOU; Meng SUN

doi:10.1587/transfun.E100.A.714

Regular Section


Keywords: speech enhancement, deep neural networks, perceptual gain function, joint optimization


2017, Volume E100.A, Issue 2, pp. 714-717

DOI https://doi.org/10.1587/transfun.E100.A.714


Abstract

In this letter, we explore the joint optimization of a perceptual gain function and deep neural networks (DNNs) for single-channel speech enhancement. We propose a DNN architecture that incorporates the masking properties of the human auditory system so that the residual noise becomes inaudible. The architecture directly trains a perceptual gain function, which is then used to estimate the magnitude spectrum of clean speech from noisy speech features. Experiments on TIMIT sentences corrupted by various types of noise show that the proposed approach achieves significant improvements over the baselines, regardless of whether the noise conditions are included in the training set.
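The step of estimating a clean magnitude spectrum from a gain function can be illustrated with a minimal sketch. It shows only the generic operation of scaling each time-frequency bin of a noisy magnitude spectrum by a gain value. The function name `apply_gain`, the toy arrays, and the clipping of the gain to [0, 1] are assumptions made for illustration; the abstract does not specify the form of the perceptual gain function, and in the letter the gain comes from the trained DNN rather than from fixed numbers.

```python
import numpy as np


def apply_gain(noisy_magnitude, gain):
    """Scale each time-frequency bin of a noisy magnitude spectrum by a gain.

    Assumption: the gain is clipped to [0, 1] so that it can only attenuate.
    The abstract does not state this range.
    """
    noisy_magnitude = np.asarray(noisy_magnitude, dtype=float)
    gain = np.clip(np.asarray(gain, dtype=float), 0.0, 1.0)
    if gain.shape != noisy_magnitude.shape:
        raise ValueError("gain and magnitude must have the same shape")
    return gain * noisy_magnitude


# Hypothetical toy data: 2 frames x 3 frequency bins.
noisy = np.array([[1.0, 2.0, 4.0],
                  [0.5, 0.0, 3.0]])
# Placeholder gain values standing in for a network output.
gain = np.array([[1.0, 0.5, 0.25],
                 [0.0, 1.0, 1.5]])

enhanced = apply_gain(noisy, gain)
print(enhanced)
```

In this sketch the out-of-range gain of 1.5 is clipped to 1.0, so the corresponding bin passes through unchanged.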
