2019 Volume 40 Issue 2 Pages 84-92
The speech-based envelope power spectrum model (sEPSM) was developed to predict the intelligibility of speech processed by nonlinear speech enhancement algorithms such as spectral subtraction. It is a linear model whose front end is a linear, level-independent gammatone (GT) filterbank. It therefore seems difficult for the model to evaluate speech sounds at low and high sound pressure levels (SPLs) consistently, because speech intelligibility depends on the SPL as well as on the signal-to-noise ratio. In this study, the sEPSM was extended with the dynamic compressive gammachirp (dcGC) auditory filterbank and a "common" normalization factor for the modulation power spectrum components to improve the predictive accuracy of the model. To evaluate the proposed model, we performed subjective experiments on the intelligibility of speech sounds enhanced by spectral subtraction and by a Wiener filter algorithm. We compared the subjective speech intelligibility scores with the objective scores predicted by the proposed dcGC-sEPSM, the original GT-sEPSM, and other well-known conventional methods: the short-time objective intelligibility measure (STOI), the coherence speech intelligibility index (CSII), and the hearing aid speech perception index (HASPI). The results show that the proposed dcGC-sEPSM predicted the subjective results better than the other methods did.
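For readers unfamiliar with the envelope power domain, the following minimal sketch illustrates the kind of quantity on which sEPSM-type models are built: an envelope-domain signal-to-noise ratio computed from the normalized envelope power of the noisy speech and of the noise alone. It is a single-band simplification under stated assumptions, not the model evaluated here: it has no GT or dcGC filterbank and no modulation filterbank, and the function names and the `floor` parameter are hypothetical choices made for illustration.

```python
import numpy as np

def envelope(x):
    """Hilbert envelope of a real signal, computed with an FFT-based analytic signal."""
    n = len(x)
    spec = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spec * h))

def envelope_power(x):
    """AC envelope power normalized by the squared mean (DC) of the envelope."""
    env = envelope(x)
    return np.var(env) / np.mean(env) ** 2

def snr_env(noisy, noise, floor=1e-3):
    """Envelope-domain SNR (linear ratio) for one band.

    `floor` is an assumed lower limit that keeps the ratio finite;
    its value here is illustrative only.
    """
    p_mix = envelope_power(noisy)
    p_noise = max(envelope_power(noise), floor)
    # Estimated speech envelope power: mixture minus noise, limited from below.
    p_speech = max(p_mix - p_noise, floor)
    return p_speech / p_noise
```

In a full model, a quantity of this kind would be computed per auditory channel and per modulation channel and then combined across channels; how the components are normalized is exactly where a "common" normalization factor would enter.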