Humans and other mammals use three cues to localise sound sources. Two of these are binaural and involve a comparison of the level and/or timing of the sound at each ear. At high frequencies, level differences result from shadowing by the head. At low frequencies, localisation relies on the time differences between the signals at the two ears that result from the different sound paths to each ear. The third cue depends on sensitivity to the elevation-dependent pattern of spectral peaks and troughs that results from multiple sound waves interfering at the tympanic membrane. Different physiological mechanisms process these different localisation cues. Neurons in the dorsal cochlear nucleus are selectively sensitive to the spectral notches that result from interference between sound waves at the ear. Interaural level differences are initially processed in the lateral superior olive by neurons receiving inhibition from one ear and excitation from the other. Interaural time differences are converted into discharge rate by neurons in the medial superior olive that receive excitatory inputs from both ears and fire only when those inputs are coincident. The contribution of such coincidence detectors to sound-source localisation is discussed in the light of recent observations.
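As a rough illustration of coincidence detection, the sketch below treats each candidate lag as a hypothetical detector with a fixed internal delay and selects the lag at which the summed product of the two ear signals is largest. This is a signal-processing analogy (a cross-correlation), not a physiological model. The function name `estimate_itd`, the 48 kHz sample rate, the 500 Hz tone burst and the 12-sample delay are all assumptions chosen for the example.

```python
import math


def estimate_itd(left, right, fs, max_lag):
    """Return the delay (seconds) of the right-ear signal relative to the left.

    Each lag stands in for one coincidence detector with a fixed internal
    delay; the detector with the largest summed product is taken as the
    estimate. A positive result means the right-ear signal arrives later.
    """
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / fs


fs = 48000  # assumed sample rate in Hz
# Assumed test signal: ten periods of a 500 Hz tone, zero-padded on both sides.
burst = [math.sin(2 * math.pi * 500 * t / fs) for t in range(960)]
left = [0.0] * 40 + burst + [0.0] * 40
right = [0.0] * 52 + burst + [0.0] * 28  # same burst, 12 samples later

itd = estimate_itd(left, right, fs, max_lag=30)
print(f"estimated ITD: {itd * 1e6:.0f} microseconds")
```

The search range is kept below half a period of the tone (48 samples here) so that the estimate is not ambiguous, which mirrors the restriction of timing cues to low frequencies described above.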
The feasibility of using a formant analysis-synthesis approach to replace the voicing source of esophageal speech was explored. Inverse-filtered signals extracted from normal speakers provided the voicing sources. Several pitch extraction methods were tested, and a computationally simple, band-limited autocorrelation method was chosen. To achieve stable and practical speech enhancement, the input signal was divided into low- and high-frequency channels, and only the low-frequency channel was processed by the formant analysis-synthesis method. A special-purpose DSP hardware unit was designed to perform the proposed analysis-synthesis process in real time. Subjective evaluation tests (rating scale method) were carried out with seven well-trained esophageal speakers and three speech therapists. The results showed that the synthesized speech was significantly improved, especially for the “loudness”, “sonority”, “strained”, “stoma noise”, “choppy”, “stability”, “intelligibility”, “recognizability”, and “duration” features.
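A band-limited autocorrelation pitch estimator of the general kind mentioned above might look like the following minimal sketch. The abstract does not give the filter or the search range, so the moving-average low-pass filter, the 60 to 400 Hz range, the 8 kHz sample rate and the names `moving_average` and `estimate_pitch` are assumptions made for illustration only.

```python
import math


def moving_average(x, width):
    """Crude low-pass filter: mean of each run of `width` consecutive samples."""
    return [sum(x[i:i + width]) / width for i in range(len(x) - width + 1)]


def estimate_pitch(frame, fs, fmin=60.0, fmax=400.0, width=8):
    """Estimate the fundamental frequency (Hz) of one voiced frame.

    The frame is band-limited first, then the lag with the largest mean
    autocorrelation product inside the allowed pitch range is selected.
    """
    y = moving_average(frame, width)
    n = len(y)
    lag_min = int(fs / fmax)
    lag_max = min(int(fs / fmin), n - 1)
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        count = n - lag
        score = sum(y[i] * y[i + lag] for i in range(count)) / count
        if score > best_score:
            best_lag, best_score = lag, score
    return fs / best_lag


fs = 8000  # assumed sample rate in Hz
# Assumed test frame: five harmonics of a 100 Hz fundamental, 100 ms long.
frame = [
    sum(math.sin(2 * math.pi * 100 * k * t / fs) / k for k in range(1, 6))
    for t in range(800)
]
f0 = estimate_pitch(frame, fs)
print(f"estimated pitch: {f0:.1f} Hz")
```

Limiting the bandwidth before the autocorrelation reduces the influence of higher harmonics and noise on the peak, at the cost of only a short averaging filter, which is consistent with the low computational load the abstract emphasises.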
People generally tend to think that absolute pitch possessors and musical experts are superior in all of their auditory and musical abilities. However, few data are available on whether the basic hearing abilities of absolute pitch possessors or musical experts differ from those of other listeners for general sounds outside the context of music. In this study, we conducted four experiments to investigate whether absolute pitch possessors or musical experts are superior in their basic hearing abilities with respect to frequency resolution, temporal resolution, and spatial resolution. In Experiments 1 and 2, we measured frequency discrimination thresholds and thresholds for detecting a tone in notched noise to examine the characteristics of frequency resolution. In Experiment 3, we conducted a gap detection task to measure temporal resolution. Finally, in Experiment 4, we measured interaural time difference discrimination thresholds to examine spatial resolution. The overall results show no significant differences in frequency, temporal, or spatial resolution among groups with different absolute pitch capability. This indicates that absolute pitch possessors do not have particularly ‘good ears’ in terms of resolution.
Adaptive algorithms such as LMS are often used in active noise control systems to update the output of secondary sources. If we apply the active scheme to traffic noise control, it is important to understand the behavior of the adaptive algorithm in response to significant changes of the primary noise due to movement of the noise source. This paper describes preliminary investigations of the tracking ability of adaptive algorithms whilst the noise source is moving. The well-known filtered-x LMS, NLMS and RLS algorithms are used in the simulations. In addition, experiments are conducted to verify the results of the simulations.
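The tracking problem can be illustrated with a small simulation. The sketch below is a plain NLMS noise canceller without a secondary-path model, so it is not the filtered-x variant named above and not the authors' simulation. The slowly drifting two-tap primary path standing in for a moving source, the step size, and the name `nlms_track` are all assumptions.

```python
import math
import random


def nlms_track(x, d, taps=2, mu=0.5, eps=1e-8):
    """Run NLMS on reference x and primary noise d; return the residual signal."""
    w = [0.0] * taps
    buf = [0.0] * taps  # most recent reference samples, newest first
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))  # anti-noise estimate
        e = dn - y                                   # residual at the error sensor
        norm = eps + sum(bi * bi for bi in buf)      # input power normalisation
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        errors.append(e)
    return errors


rng = random.Random(0)
n = 4000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]  # reference signal (white noise)

# Hypothetical primary path: two gains that drift slowly over time.
d = []
prev = 0.0
for i, xn in enumerate(x):
    g0 = 1.0 + 0.5 * math.sin(2 * math.pi * i / 2000)
    g1 = 0.3 * math.cos(2 * math.pi * i / 2000)
    d.append(g0 * xn + g1 * prev)
    prev = xn

e = nlms_track(x, d)
primary_power = sum(v * v for v in d[-1000:]) / 1000
residual_power = sum(v * v for v in e[-1000:]) / 1000
print(f"attenuation: {10 * math.log10(primary_power / residual_power):.1f} dB")
```

Because the path keeps changing, the residual does not decay to zero: the filter weights lag behind the true gains, and the size of that lag depends on the step size `mu` and on how quickly the path drifts.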
In the present study, a simple and efficient method for estimating frontal positions of the Kuroshio Extension (KE) is proposed, using the relationship between meridional shifts of the KE and the travel time of acoustic signals propagating through the SOFAR channel. The effectiveness of the proposed method is evaluated by sound propagation simulations using temperature and salinity fields from observations in the KE, and the simulations demonstrate the validity of the method. Applying the method to in situ ocean acoustic tomography data from the KE region in summer 1997 provides proper estimates of the KE positions. This simple method may be used to determine the “initial guess of the KE position” and the “reference sound speed field” that are necessary for ocean acoustic tomographic inversion, which is a much more complicated and time-consuming data analysis.
To simulate three-dimensional sound fields in laboratory experiments, a 6-channel recording/reproduction system has been developed. To record the sound in a real sound field, six uni-directional microphones are combined so that they point in directions 90 degrees apart. For reproduction, six loudspeakers are set up in an anechoic room and the signal recorded in each direction is reproduced. The advantages of this system are that the principle is quite simple and the listening area is not strictly limited. In this paper, the principle of the system and the reproduction accuracy are reported.
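The recording principle can be sketched numerically by assuming ideal cardioid microphones pointing along the positive and negative x, y and z axes. The abstract says only "uni-directional", so the cardioid pattern, the axis labels and the function name `cardioid_gains` are assumptions made for illustration.

```python
import math

# Assumed pointing directions of the six microphones (unit vectors).
AXES = {
    "front": (1, 0, 0), "back": (-1, 0, 0),
    "left": (0, 1, 0), "right": (0, -1, 0),
    "up": (0, 0, 1), "down": (0, 0, -1),
}


def cardioid_gains(azimuth_deg, elevation_deg):
    """Gain of each microphone for a plane wave arriving from the given direction.

    An ideal cardioid has gain 0.5 * (1 + cos(theta)), where theta is the
    angle between the microphone axis and the direction of the source.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    src = (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))
    return {
        name: 0.5 * (1.0 + sum(a * s for a, s in zip(axis, src)))
        for name, axis in AXES.items()
    }


gains = cardioid_gains(30.0, 0.0)  # source 30 degrees to the left of front
for name, gain in gains.items():
    print(f"{name:>5}: {gain:.3f}")
```

Under these assumptions, the microphones facing the source pick it up most strongly and the gains of each opposing pair sum to one, so each channel carries the sound arriving mainly from its own direction.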