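Effectiveness and Challenges of Target-Speech Extraction in Multi-Speaker Environments: An Analysis Based on Subjective Evaluations by Deaf and Hard-of-Hearing Listeners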

小林 彰夫; 藤江 匠汰; 安 啓一

doi:10.11184/his.28.2_165

Abstract

This study quantitatively investigates the effects of a neural network-based target-speech extraction method on listening effort among deaf and hard-of-hearing (DHH) participants in Japanese multispeaker environments. A five-point listening-effort test was conducted with 22 participants (11 DHH, 11 normal-hearing), who rated both mixtures and extracted speech samples across three signal-to-noise ratios (SNRs; 0, 10, and 20 dB) and two speakers. To appropriately handle ordinal-scale ratings, we employed ordinal logistic mixed models and their hierarchical Bayesian extensions, jointly examining the interactions among SNR, speaker, and participant group. The models showed consistent trends: target-speech extraction benefits were most pronounced at low to mid SNRs, whereas group differences between DHH and normal-hearing participants were observed at high SNR (20 dB). These findings clarify when target-speech extraction most effectively reduces listening effort for DHH participants, providing engineering insights into speech-enhancement technologies for inclusive and accessible real-world speech communication.
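The ordinal logistic mixed model named in the abstract can be illustrated with a minimal sketch of a cumulative-logit (proportional-odds) model with a per-participant random intercept. This is not the authors' code or their fitted model: the cutpoints, the extraction coefficient `beta_extraction`, and the random intercept `u_participant` are all hypothetical values chosen only to show how such a model maps a linear predictor to probabilities over five rating categories.

```python
import math


def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def category_probs(eta, cutpoints):
    """Category probabilities under a cumulative-logit model.

    The model assumes P(Y <= k) = sigmoid(c_k - eta) for ordered
    cutpoints c_1 < ... < c_{K-1}; P(Y = k) is the difference of
    adjacent cumulative probabilities.
    """
    cumulative = [sigmoid(c - eta) for c in cutpoints] + [1.0]
    probs = [cumulative[0]]
    for k in range(1, len(cumulative)):
        probs.append(cumulative[k] - cumulative[k - 1])
    return probs


def expected_rating(probs):
    """Mean rating on a 1..K scale (for illustration only)."""
    return sum((k + 1) * p for k, p in enumerate(probs))


# Hypothetical values, not estimates from the study.
cutpoints = [-2.0, -0.5, 0.5, 2.0]  # four cutpoints -> five categories
beta_extraction = 1.0               # assumed fixed effect of extraction
u_participant = 0.3                 # assumed random intercept for one participant

# Linear predictor for the same participant under each condition.
eta_mixture = u_participant
eta_extracted = u_participant + beta_extraction

p_mix = category_probs(eta_mixture, cutpoints)
p_ext = category_probs(eta_extracted, cutpoints)

print("mixture:  ", [round(p, 3) for p in p_mix])
print("extracted:", [round(p, 3) for p in p_ext])
```

In this sketch a positive coefficient shifts probability mass toward higher categories while keeping the ratings ordinal, which is the reason such models are preferred over treating the five-point scale as interval data. Interactions among SNR, speaker, and participant group would enter as additional terms in the linear predictor.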
