2023, Vol. 27, No. 4, pp. 87-91
We propose a beamforming framework for multichannel target speech separation based on a deep complex-valued neural network. The network predicts steering vectors and complex ratio masks for the speaker signals, and the masked signals are then used to compute the spatial covariance matrices required for minimum variance distortionless response (MVDR) beamforming. For mask estimation, we propose triple-path modeling, which takes both intrachannel and interchannel features into consideration. Our experiments show that the proposed framework achieves better target speech separation performance than the baseline methods.
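The mask-based MVDR pipeline described above (masked signals feeding spatial covariance estimates, which in turn yield beamformer weights) can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the abstract does not specify the MVDR variant, so this sketch assumes the widely used mask-weighted covariance estimate and the Souden-style MVDR solution with a reference channel; all function names and the regularization constants are illustrative.

```python
import numpy as np

def spatial_covariance(X, mask):
    """Mask-weighted spatial covariance per frequency.

    X:    (T, F, C) multichannel STFT (time, frequency, channel)
    mask: (T, F) real-valued weights, e.g. |complex ratio mask|
    Returns (F, C, C) covariance matrices.
    """
    # Time-average of mask-weighted outer products x x^H.
    num = np.einsum('tf,tfc,tfd->fcd', mask, X, X.conj())
    return num / np.maximum(mask.sum(axis=0), 1e-8)[:, None, None]

def mvdr_weights(Phi_s, Phi_n, ref=0):
    """Souden-style MVDR: w = Phi_n^{-1} Phi_s u_ref / tr(Phi_n^{-1} Phi_s).

    Phi_s, Phi_n: (F, C, C) speech / interference covariance matrices
    ref: reference channel index (assumed, not from the abstract)
    """
    F, C, _ = Phi_s.shape
    w = np.empty((F, C), dtype=complex)
    for f in range(F):
        # Diagonal loading keeps the noise covariance invertible.
        A = np.linalg.solve(Phi_n[f] + 1e-6 * np.eye(C), Phi_s[f])
        w[f] = A[:, ref] / (np.trace(A) + 1e-8)
    return w

def apply_beamformer(w, X):
    """Single-channel output Y[t, f] = w[f]^H x[t, f]."""
    return np.einsum('fc,tfc->tf', w.conj(), X)
```

In a full system, the masks passed to `spatial_covariance` would come from the network's predicted complex ratio masks (for the target) and their complement (for interference); the beamformed output `Y` is then inverse-STFTed back to a waveform.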