BP-CRN: A Lightweight Two-Stage Convolutional Recurrent Network For Multi-channel Speech Enhancement

Cong PANG; Ye NI; Jia Ming CHENG; Lin ZHOU; Li ZHAO

doi:10.1587/transinf.2024EDL8042

This article has now been updated. Please use the final version.

Keywords: multichannel speech enhancement, lightweight, neural beamforming, convolutional recurrent network, complex network

JOURNAL FREE ACCESS Advance online publication

Article ID: 2024EDL8042

DOI https://doi.org/10.1587/transinf.2024EDL8042

The final version of this article is now available: Vol. E108.D (2025), No. 2 pp. 161-164

Abstract

We propose a lightweight two-stage convolutional recurrent network (BP-CRN) for multichannel speech enhancement (MCSE) that consists of a beamforming stage followed by a post-filtering stage. Drawing inspiration from traditional methods, we design two core modules, BM and PF, for spatial filtering and for post-filtering with compensation, respectively. Both modules employ a convolutional encoder-decoder structure with complex frequency-time long short-term memory (CFT-LSTM) blocks in the middle. Furthermore, an inter-module mask module is introduced to estimate and convey implicit spatial information, assisting the post-filtering module in refining the spatial filtering and suppressing residual noise. The proposed method contains only 1.27M parameters, and experimental results demonstrate that it outperforms three other MCSE methods in terms of the PESQ and STOI metrics.
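
The two-stage idea that the abstract borrows from traditional methods (spatial filtering followed by a post-filter) can be illustrated with a minimal NumPy sketch. This is not the BP-CRN network itself: the weights and mask below are hand-set or random stand-ins for quantities that, in the paper, would be estimated by the BM, inter-module mask, and PF modules. The function names, array shapes, and the choice of a real-valued mask are all assumptions made for illustration only.

```python
import numpy as np

def filter_and_sum(X, W):
    """Stage 1 (spatial filtering): classic filter-and-sum beamforming.

    X: complex multichannel STFT, shape (channels, freq, time).
    W: complex beamforming weights, same shape as X (assumed given here).
    Returns a single-channel complex spectrum of shape (freq, time).
    """
    return np.sum(np.conj(W) * X, axis=0)

def post_filter(Y, mask):
    """Stage 2 (post-filtering): scale each time-frequency bin by a gain.

    Y: single-channel complex spectrum, shape (freq, time).
    mask: real-valued gains, clipped to [0, 1] so bins are only attenuated.
    """
    return np.clip(mask, 0.0, 1.0) * Y

# Hypothetical sizes: 4 microphones, 129 frequency bins, 50 frames.
rng = np.random.default_rng(0)
C, F, T = 4, 129, 50
X = rng.standard_normal((C, F, T)) + 1j * rng.standard_normal((C, F, T))

# Stand-in weights: equal weighting, which reduces to a channel average.
W = np.full((C, F, T), 1.0 / C, dtype=complex)

# Stand-in mask: random gains in place of a learned estimate.
mask = rng.uniform(0.0, 1.0, size=(F, T))

Y = filter_and_sum(X, W)      # beamformed spectrum
S_hat = post_filter(Y, mask)  # enhanced spectrum after post-filtering
```

With equal weights the first stage is simply the mean over microphones; a learned model would replace both `W` and `mask` with time-frequency-dependent estimates.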
