Frequency domain binaural model based on interaural phase and level differences

Hidetoshi Nakashima; Yoshifumi Chisaki; Tsuyoshi Usagawa; Masanao Ebata

doi:10.1250/ast.24.172

Abstract

We can communicate with others even in a noisy environment. This phenomenon is known as the “cocktail party effect” and is one of the most important binaural functions. This paper presents a frequency domain binaural model that performs this binaural function using interaural phase and level differences. The proposed model is evaluated both as a front end for a speech recognition system and as a speech enhancer. In the evaluation, when the directions of arrival of the target signal and the noise differ by 10°, recognition rates improve over the previous time domain binaural model (TDBM) in all cases. Furthermore, recognition rates exceed 90% when the signal-to-noise ratio (SNR) is higher than approximately 5 dB. As a speech enhancer, the frequency domain binaural model also outperforms the TDBM in both SNR and coherence.
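The abstract does not specify how the interaural cues are computed, so the following is only a minimal sketch of the general idea, not the authors' model. It assumes one windowed frame per ear, estimates the interaural phase difference (IPD) and interaural level difference (ILD) in each FFT bin, and keeps the bins whose cues are consistent with a frontal source. The function names, the frame length `n_fft`, and the tolerances `ipd_tol` and `ild_tol` are all hypothetical choices made for illustration.

```python
import numpy as np

def ipd_ild(left, right, n_fft=512):
    """Per-bin IPD (radians) and ILD (dB) for one frame of a two-ear signal.

    Illustrative only: frame length and window are assumptions,
    not parameters taken from the paper.
    """
    window = np.hanning(n_fft)
    spec_l = np.fft.rfft(left[:n_fft] * window)
    spec_r = np.fft.rfft(right[:n_fft] * window)
    eps = 1e-12  # avoids division by zero and log of zero in silent bins
    # Phase of the cross-spectrum is the left-minus-right phase difference.
    ipd = np.angle(spec_l * np.conj(spec_r))
    # Magnitude ratio in decibels is the level difference.
    ild = 20.0 * np.log10((np.abs(spec_l) + eps) / (np.abs(spec_r) + eps))
    return ipd, ild

def frontal_mask(ipd, ild, ipd_tol=0.3, ild_tol=3.0):
    """Boolean mask of bins whose cues are near zero (hypothetical tolerances)."""
    return (np.abs(ipd) < ipd_tol) & (np.abs(ild) < ild_tol)
```

A frame that is identical at both ears gives IPD and ILD of zero in every bin, so the mask keeps all bins. Attenuating one ear by half shifts the ILD by about 6 dB, which falls outside the assumed 3 dB tolerance and rejects every bin.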
