Long reverberation degrades the intelligibility of speech. Previous studies have reported that non-native listeners have more difficulty understanding speech in reverberation than native listeners. In those studies, however, the lower identification scores of non-native listeners may have been attributable to non-native phonemes that did not exist in the listeners' native languages. The current study investigated the identification of Japanese consonant-vowel (CV) syllables in reverberation by native English listeners, whose native language has counterparts to most or all Japanese consonants. Sixty-two CV syllables were used as stimuli. The reverberation time in the reverberant condition was 2.7 s. The results showed that the correct answer rate declined more in the reverberant condition for non-native listeners than for native listeners. There were significant differences between native and non-native listeners in the correct answer rates for /m, r, k, d, s/ in reverberation. The results suggest that non-native listeners are at a disadvantage when listening to non-native consonants even if their native language has counterparts to those consonants. In addition, the results suggest that native English listeners might have had an advantage in finding acoustic cues to place of articulation in adverse environments because of the inventory of English.
Previous studies on noise-vocoded speech showed that the temporal modulation cues provided by the temporal envelope play an important role in the perception of vocal emotion. However, the exact role that the temporal envelope and its modulation components play in the perceptual processing of vocal emotion is still unknown. To clarify the exact features that the temporal envelope contributes to the perception of vocal emotion, a method based on the mechanism of modulation frequency analysis in the auditory system is necessary. In this study, auditory-based modulation spectral features were used to account for the perceptual data collected from vocal-emotion recognition experiments using noise-vocoded speech. An auditory-based modulation filterbank was used to calculate the modulation spectrogram of noise-vocoded speech stimuli, and ten types of modulation spectral features were then extracted from the modulation spectrograms. The results showed that there were high similarities between modulation spectral features and the perceptual data of vocal-emotion recognition experiments. It was shown that the modulation spectral features are useful for accounting for the perceptual processing of vocal emotion with noise-vocoded speech.
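As a rough illustration of what a modulation spectrum of a temporal envelope captures, the sketch below extracts a crude envelope from a single amplitude-modulated tone and evaluates its energy at a few modulation frequencies. It is a simplified stand-in, not the auditory-based modulation filterbank used in the study: the envelope extractor (rectification plus moving average), the sampling rate, the carrier, the 4 Hz modulation, and the function names `band_envelope` and `modulation_energy` are all assumptions made for illustration.

```python
import math

def band_envelope(signal, fs, smooth_hz=50.0):
    """Crude envelope: full-wave rectification followed by a moving average."""
    win = max(1, int(fs / smooth_hz))          # averaging window in samples
    rectified = [abs(s) for s in signal]
    env, acc = [], 0.0
    for i, v in enumerate(rectified):
        acc += v
        if i >= win:
            acc -= rectified[i - win]          # drop the sample leaving the window
        env.append(acc / min(i + 1, win))
    return env

def modulation_energy(env, fs, mod_freqs):
    """Amplitude of the mean-removed envelope at each chosen modulation frequency (direct DFT)."""
    n = len(env)
    mean = sum(env) / n
    centered = [e - mean for e in env]
    out = {}
    for fm in mod_freqs:
        re = sum(c * math.cos(2 * math.pi * fm * i / fs) for i, c in enumerate(centered))
        im = sum(c * math.sin(2 * math.pi * fm * i / fs) for i, c in enumerate(centered))
        out[fm] = 2.0 * math.hypot(re, im) / n
    return out

# Hypothetical test signal: 500 Hz carrier, amplitude-modulated at 4 Hz.
fs = 2000                 # sampling rate in Hz (assumed)
n_samples = 2 * fs        # 2 s of signal
x = [(1.0 + 0.8 * math.sin(2 * math.pi * 4.0 * i / fs))
     * math.sin(2 * math.pi * 500.0 * i / fs) for i in range(n_samples)]

env = band_envelope(x, fs)
spectrum = modulation_energy(env, fs, [1, 2, 4, 8, 16, 32])
peak = max(spectrum, key=spectrum.get)
print("modulation frequency with the most energy:", peak, "Hz")
```

A full modulation spectrogram would repeat this kind of analysis for every acoustic frequency band and use band-pass modulation filters rather than single DFT bins; spectral features would then be computed from the resulting two-dimensional representation.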
Modeling of elastic boundary support is crucial for simulating the realistic vibro-acoustic behavior of plate-like structures. In this paper, the mechanical and moment impedances of an elastic support material are derived in closed form under several assumptions, and three basic studies are conducted on a vibration system consisting of a thin plate supported by an elastic material. First, bending wave reflection from the impedance boundary is theoretically analyzed to clarify the incidence-angle dependence of the vibration energy absorption coefficient. Second, the proposed impedance model is validated by comparison with a detailed finite element model of the elastic support material. Finally, as an application of the impedance model, loss factor measurement is numerically modeled, which reveals that the calculated loss factors are generally greater than the theoretical values for a diffuse vibration field.
Wind noise annoys hearing-aid users, and for cosmetic reasons it is hard to attach a windscreen to a hearing-aid microphone. Some hearing aids suppress wind noise by using high-pass filters to reduce the low-frequency components of the input signals. Although this approach attenuates wind noise, it also degrades the perceived binaural information of the desired signals, resulting in partial information loss. We previously proposed a short-time Fourier transform-based (STFT-based) binaural wind noise cancellation algorithm that preserves binaural cues. That algorithm required a frame length of 32 ms to maintain a high frequency resolution. However, it is known that the tolerable group delay for mild hearing loss is less than approximately 5 ms in the high-frequency region. In this paper, we propose a low-delay binaural wind noise cancellation algorithm that uses a frequency-warping filter. The processing latency of this algorithm is shorter than the tolerable delay. The objective evaluation results (signal-to-noise ratios and perceptual evaluation of speech quality (PESQ) scores) improved while a low latency was maintained. Subjective experiments demonstrated that the proposed method produced almost the same scores as our previous STFT-based method in terms of the directionality of the output signals.
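To illustrate why frequency warping can keep the delay small at high frequencies, the sketch below evaluates the group delay of a chain of first-order all-pass sections, which is one common way to build a frequency-warped delay line. The abstract does not describe the filter structure, so the all-pass chain itself, the sampling rate, the warping coefficient `a`, and the number of sections are assumptions chosen only for illustration; the numbers are not the latency of the proposed algorithm.

```python
import math

def allpass_group_delay(omega, a):
    """Group delay in samples of A(z) = (z^-1 - a) / (1 - a z^-1) at omega (rad/sample)."""
    return (1.0 - a * a) / (1.0 + a * a - 2.0 * a * math.cos(omega))

def chain_delay_ms(freq_hz, fs, a, n_sections):
    """Group delay in milliseconds of n_sections identical all-pass sections in cascade."""
    omega = 2.0 * math.pi * freq_hz / fs
    return 1000.0 * n_sections * allpass_group_delay(omega, a) / fs

fs = 16000        # sampling rate in Hz (assumed)
a = 0.5           # warping coefficient (assumed)
n_sections = 16   # length of the warped delay line (assumed)

for f in (250, 1000, 4000, 8000):
    print(f"{f:5d} Hz: {chain_delay_ms(f, fs, a, n_sections):.3f} ms")
```

With a positive coefficient, the delay is concentrated at low frequencies, where the warped analysis has finer resolution, and it shrinks toward the Nyquist frequency. Setting `a = 0` reduces each section to a plain one-sample delay.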
In this study, we examine the changes in loudness due to the arrival direction and distribution width of a sound. Loudness is conventionally measured with an omnidirectional microphone. However, such a measurement does not take into account the diffraction caused by a listener's head and auricles; that is, it contains no information on the arrival direction or distribution width of the sound. Loudness was therefore measured with a dummy head so that the direction and width were taken into account, and subjective tests were conducted to compare their results with the dummy-head measurements. The results show that loudness varies with the arrival direction and distribution width of a sound source and differs from the loudness measured with an omnidirectional microphone. Additionally, the variation in loudness (expressed as a sound pressure level) can be determined quantitatively from the arrival direction and distribution width of a sound source for noise evaluation.
To measure the normal-incidence sound absorption coefficient and transmission loss at frequencies around 10 kHz, a small impedance tube with an inner diameter of 15 mm was developed. It was found that two influences were non-negligible: the sound attenuation caused by the viscosity of air near the inner tube wall, and the surface roughness of the tube wall at the microphone positions. To address these issues, we propose a correction method for the sound attenuation and an improved microphone holder that smooths the inner wall surface. With these measures, the sound absorption coefficient and transmission loss could be measured with an acceptable level of accuracy.
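The following sketch gives a feel for the numbers involved: it estimates the frequency at which the first cross mode of a rigid circular tube cuts on, and the classical Kirchhoff wide-tube estimate of visco-thermal wall attenuation. This is textbook theory used for illustration, not the correction method proposed in the paper, and it does not model surface roughness. The air properties (speed of sound, viscosity, density, Prandtl number) and the function names are assumed values and hypothetical names.

```python
import math

def plane_wave_cutoff_hz(diameter_m, c=343.0):
    """Cut-on frequency of the first cross mode in a rigid circular tube: 1.8412 c / (pi d)."""
    return 1.8412 * c / (math.pi * diameter_m)

def kirchhoff_attenuation_np_per_m(freq_hz, diameter_m, c=343.0, mu=1.81e-5,
                                   rho=1.2, gamma=1.4, prandtl=0.71):
    """Kirchhoff wide-tube visco-thermal attenuation of plane waves, in nepers per metre."""
    radius = diameter_m / 2.0
    omega = 2.0 * math.pi * freq_hz
    viscous = math.sqrt(omega * mu / (2.0 * rho)) / (radius * c)
    thermal_factor = 1.0 + (gamma - 1.0) / math.sqrt(prandtl)
    return viscous * thermal_factor

d = 0.015                                         # inner diameter from the abstract (15 mm)
cutoff = plane_wave_cutoff_hz(d)
alpha = kirchhoff_attenuation_np_per_m(10000.0, d)
alpha_db = 20.0 * math.log10(math.e) * alpha      # nepers to decibels

print(f"first cross-mode cut-on: {cutoff:.0f} Hz")
print(f"wall attenuation at 10 kHz: {alpha:.3f} Np/m ({alpha_db:.2f} dB/m)")
```

Under these assumptions the plane-wave range of a 15 mm tube extends past 10 kHz, which is consistent with the choice of diameter, and the wall attenuation grows with the square root of frequency and inversely with the tube diameter, so it matters more in a narrow tube at high frequencies than in conventional larger tubes.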