2023, Vol. 143, No. 8, pp. 830-841
In this paper, we improve and evaluate our previously proposed model for non-reference speech intelligibility estimation on reverberant speech, with the aim of significantly improving estimation accuracy. The proposed method consists of a DNN for speech enhancement and a separate DNN for intelligibility estimation. The latter estimates intelligibility from features computed from both the enhanced and the degraded speech. Although previous studies have used similar models to estimate intelligibility effectively for speech degraded by additive noise, they did not consider the distortion caused by reverberation. Nor did they quantify how estimation accuracy is affected by the choice of speech enhancement DNN model, the structure of the intelligibility prediction DNN, or the parameters used in feature calculation. Accordingly, we compared two state-of-the-art speech enhancement DNN models and used their outputs to train intelligibility prediction DNNs for reverberant speech, while also varying the parameters used in the feature calculation. With the best combination, the linear correlation coefficient between subjective and estimated intelligibility reached 0.801.
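The evaluation metric quoted above, the linear (Pearson) correlation coefficient between subjective and estimated intelligibility, can be illustrated with a minimal sketch. The function name `pearson_r` and the score lists are hypothetical and chosen only for illustration; they are not data or code from the paper.

```python
import math


def pearson_r(x, y):
    """Return the Pearson linear correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical intelligibility scores in [0, 1]; NOT results from the paper.
subjective = [0.20, 0.45, 0.60, 0.85, 0.95]  # e.g. listening-test scores
estimated = [0.30, 0.40, 0.65, 0.80, 0.90]   # e.g. model predictions

r = pearson_r(subjective, estimated)
print(f"linear correlation coefficient: {r:.3f}")
```

A value close to 1 means the estimated scores follow the subjective scores almost linearly; the abstract reports 0.801 for the best configuration.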