We present a new method for implementing transaural sound reproduction systems using feedback control theory. H∞ control theory is employed to synthesize the feedback controller. The structure of the sound reproduction system is formulated such that the H∞ norm of the system transfer function, which is to be minimized by feedback control, expresses the difference between the desired signals and the acoustic signals reproduced at the ears of a listener via loudspeakers. Modeling errors and plant perturbations resulting from movement of the listener's head are also taken into account to ensure robust stability. Computer simulations indicate that the equalization and cross-talk cancellation achieved by the proposed method are better preserved under deviations in the listener's position than those achieved by the conventional inverse filtering method.
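As a point of reference, the conventional inverse filtering baseline mentioned above can be sketched at a single frequency. This is not the proposed H∞ design. It is a minimal illustration, assuming a free-field point-source model for the loudspeaker-to-ear paths and a hypothetical geometry (loudspeaker and ear coordinates, 1 kHz, a 5 cm head shift), none of which come from the abstract. It shows why an exact inverse cancels cross-talk only at the nominal head position.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def free_field(src, rcv, freq):
    """Free-field point-source transfer function exp(-jkr)/r (assumed model)."""
    r = math.dist(src, rcv)
    k = 2.0 * math.pi * freq / SPEED_OF_SOUND
    return cmath.exp(-1j * k * r) / r


def plant(speakers, ears, freq):
    """2x2 matrix C with C[i][j] = path from loudspeaker j to ear i."""
    return [[free_field(s, e, freq) for s in speakers] for e in ears]


def inverse_2x2(m):
    """Exact inverse of a 2x2 complex matrix (the 'inverse filter' at one frequency)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul_2x2(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def crosstalk(speakers, ears, filters, freq):
    """Largest off-diagonal magnitude of C @ H; 0 means perfect cancellation."""
    total = matmul_2x2(plant(speakers, ears, freq), filters)
    return max(abs(total[0][1]), abs(total[1][0]))


# Hypothetical geometry in metres: two loudspeakers in front of the listener.
speakers = [(-0.5, 1.5), (0.5, 1.5)]
ears = [(-0.09, 0.0), (0.09, 0.0)]
freq = 1000.0  # Hz

# Inverse filters designed for the nominal head position only.
filters = inverse_2x2(plant(speakers, ears, freq))

# Same filters evaluated after the head moves 5 cm sideways.
moved_ears = [(x + 0.05, y) for x, y in ears]

nominal = crosstalk(speakers, ears, filters, freq)
displaced = crosstalk(speakers, moved_ears, filters, freq)
print(f"cross-talk at nominal position:   {nominal:.2e}")
print(f"cross-talk after 5 cm head shift: {displaced:.2e}")
```

The residual cross-talk is at rounding-error level for the design position and becomes clearly nonzero once the plant changes, which is the sensitivity that a robust design aims to reduce.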
We examined the relationship between summing-localization behavior and the perceived width of a sound image via two hearing experiments based on relative comparisons of the perceived directions of two sound stimuli. Two types of sound stimuli were employed. One stimulus, the “composite stimulus,” was generated by two sound sources located in the horizontal plane with a time lag (−3.0 to 3.0 ms) between them. The composite stimulus produced either summing localization or the precedence effect. The other stimulus was generated by a single sound source and served as the reference for the perceived sound image. The first experiment (Exp. 1) investigated the relationship between the perceived direction of the composite stimulus and the time lag between the sound sources. The second experiment (Exp. 2) compared the perceived width of the composite stimulus with that of the reference stimulus. The results of Exp. 1 demonstrated that the perceived direction of the composite stimulus shifted smoothly from the midpoint between the sound sources toward the source generating the preceding sound; these findings roughly corresponded to the results of past studies. However, the tendency of the shift differed among the subjects and could be classified into two types according to whether the shift reached the direction of the preceding source. The results of Exp. 2 showed that the “included range,” defined as the range over which the direction of the reference stimulus is perceived as included within the fused image of the composite stimulus, differed for each subject. The maximum width of the included range was over 30 degrees. Taken together, the results of Exps. 1 and 2 show that there are individual differences in how the direction of a stimulus is perceived within the included range of the fused sound image. This suggests that these differences arise from differences in each subject's response strategy.
In this paper, we propose a method of automatically measuring the segmental duration characteristics of a second-language learner's speech as a means of evaluating language proficiency in terms of speech production. We propose the use of duration differences from native speakers' speech as an objective score for evaluating the learner's English segmental duration characteristics. To provide flexible evaluation without the need to collect any additional native-English reference speech, we employed normalized segmental durations predicted by a statistical duration model instead of measured raw durations of native reference speech. The proposed evaluation method was tested using English speech data uttered by multiple groups of native Thai learners with different amounts of experience studying English in countries where English is an official language. An evaluation experiment showed that the proposed measure based on duration differences is strongly correlated with the amount of English study. Moreover, segmental duration differences revealed characteristics of the Thai learners' speech control, such as stress assignment on word-final syllables. These results support the effectiveness of the proposed model-based objective evaluation.
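The idea of scoring a learner by duration differences from model-predicted native durations can be sketched as follows. The abstract does not specify the normalization or the distance measure, so both are assumptions here: durations are divided by the utterance mean to factor out overall speaking rate, and the score is the root-mean-square difference. The function names and the example durations are hypothetical.

```python
import math


def normalize(durations):
    """Divide by the utterance mean to factor out speaking rate (assumed scheme)."""
    mean = sum(durations) / len(durations)
    return [d / mean for d in durations]


def duration_difference_score(learner_ms, predicted_native_ms):
    """RMS difference between normalized learner and predicted native durations.

    Both lists must hold one duration per segment, aligned one-to-one.
    A lower score means the learner's timing pattern is closer to the prediction.
    """
    if len(learner_ms) != len(predicted_native_ms):
        raise ValueError("segment sequences must be aligned one-to-one")
    learner = normalize(learner_ms)
    native = normalize(predicted_native_ms)
    squared = [(l - n) ** 2 for l, n in zip(learner, native)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical durations (ms) that a duration model might predict for four segments.
predicted = [80.0, 120.0, 60.0, 140.0]

# A learner who is uniformly slower keeps the same relative timing pattern.
uniformly_slow = [d * 1.5 for d in predicted]

# A learner who lengthens only the final segment, e.g. word-final stress.
final_lengthened = [80.0, 120.0, 60.0, 220.0]

slow_score = duration_difference_score(uniformly_slow, predicted)
final_score = duration_difference_score(final_lengthened, predicted)
print(f"uniformly slow:   {slow_score:.3f}")
print(f"final lengthened: {final_score:.3f}")
```

Under this normalization a uniformly slower speaker is not penalized, while a change in the relative timing pattern, such as extra length on the final segment, raises the score.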
Most multimedia applications involving auditory displays for digital video content rely on audio signal processing and conventional stereo loudspeakers to create sound images on the screen. Although a number of examples of these techniques have been reported in the literature showing successful results for narrow listening areas, the configuration of the loudspeakers appears to play a significant role when correct sound image localization is desired over wider listening areas. This paper presents a new loudspeaker design that enhances the localization of sound images on the surface of a flat screen for users distributed over a wide area. A preliminary validation of the design was performed numerically using boundary element methods and experimentally using a prototype model. The results show that, in spite of its simplicity, the design effectively alters the radiated sound field so as to expand the listening area.
The thermal diffusivities of transparent polymer films were measured using a laser-induced thermal wave generated on a carbon substrate. This method readily takes into account the difference in thermal impedance between a transparent polymer film and the carbon substrate. The dependence of the phase delay of the thermal waves on the square root of the modulation frequency was investigated at positions with and without a transparent polymer film. In this paper, we describe the principle of the measurement method and several equations used in the evaluation. Finally, the thermal diffusivities of transparent polymer films, such as poly(vinylidene fluoride), poly(ethylene terephthalate), and polyimide, were determined by this method.
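The link between phase delay and the square root of the modulation frequency can be illustrated with a short sketch. The abstract does not give the paper's evaluation equations, so this assumes the textbook one-dimensional relation for a thermally thick film of thickness d, in which the phase delay is -d·sqrt(πf/α) plus a constant offset. The slope m of phase delay against sqrt(f) then gives α = πd²/m². The film thickness, frequencies, and diffusivity below are hypothetical, and the sketch ignores the thermal impedance correction that the method is said to include.

```python
import math


def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def diffusivity_from_phase(freqs_hz, phase_delay_rad, thickness_m):
    """Estimate thermal diffusivity (m^2/s) from phase delay versus sqrt(frequency).

    Assumes phase = -d * sqrt(pi * f / alpha) + const, so alpha = pi * d^2 / slope^2.
    A constant phase offset does not affect the fitted slope.
    """
    roots = [math.sqrt(f) for f in freqs_hz]
    slope = fit_slope(roots, phase_delay_rad)
    return math.pi * thickness_m ** 2 / slope ** 2


# Synthetic data generated from the assumed relation (all values hypothetical).
true_alpha = 1.0e-7   # m^2/s, a plausible order of magnitude for a polymer
thickness = 50e-6     # m
freqs = [25.0, 50.0, 100.0, 200.0, 400.0]  # Hz
phases = [-thickness * math.sqrt(math.pi * f / true_alpha) - 0.3 for f in freqs]

estimated = diffusivity_from_phase(freqs, phases, thickness)
print(f"estimated diffusivity: {estimated:.3e} m^2/s")
```

Because only the slope enters the estimate, a frequency-independent phase offset drops out of the fit.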