Modern convolution technologies offer possibilities to overcome principal shortcomings of loudspeaker stereophony by exploiting the Wave Field Synthesis (WFS) concept for rendering virtual spatial characteristics of sound events. Based on the Huygens principle, loudspeaker arrays reproduce a synthetic sound field around the listener, whereby the dry audio signal is combined with measured or modelled information about the room and the source’s position to enable the accurate reproduction of the source within its acoustical environment. Not surprisingly, basic and practical constraints of WFS systems limit the rendering accuracy and the perceived spatial audio quality to a certain degree, depending on characteristic features and technical parameters of the sound field synthesis. However, recent developments have already shown that a number of applications could be possible in the near future. An attractive example is the synthesis of WFS and stereophony, offering enhanced freedom in sound design as well as improved quality and more flexibility in practical playback situations for multichannel sound mixes.
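As a rough illustration of the Huygens idea described above, the sketch below treats each loudspeaker of a straight array as a secondary source that re-emits the dry signal with a delay equal to its travel time from the virtual source. This is a minimal sketch and not the rendering method of the paper: the function name, the geometry (array on the line y = array_y, virtual source behind it), and the square-root distance weighting are all assumptions, and a real WFS driving function also includes a spectral pre-filter and array tapering that are omitted here.

```python
import math

def wfs_delays_and_gains(source_xy, speaker_xs, array_y=0.0, c=343.0):
    """Hypothetical helper: per-loudspeaker delay (s) and gain for a virtual point source.

    source_xy  -- (x, y) of the virtual source, assumed to lie off the array line
    speaker_xs -- x positions of the loudspeakers on the line y = array_y
    c          -- assumed speed of sound in m/s
    """
    sx, sy = source_xy
    # Distance from the virtual source to every loudspeaker.
    dists = [math.hypot(x - sx, array_y - sy) for x in speaker_xs]
    ref = min(dists)
    # Delays are taken relative to the nearest loudspeaker, so the smallest is zero.
    delays = [(d - ref) / c for d in dists]
    # Illustrative amplitude decay only; normalised so the nearest loudspeaker has gain 1.
    gains = [math.sqrt(ref / d) for d in dists]
    return delays, gains
```

For a source one metre behind the centre of a three-element array, the outer loudspeakers receive equal, positive delays and slightly lower gains than the centre one, which is what produces the curved synthetic wavefront.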
This paper presents an outline of the sound production mechanisms in wind instruments and reviews recent progress in the research on different types of wind instruments, i.e., reed woodwinds, brass, and air-jet driven instruments. Until recently, sound production has been explained by models composed of lumped elements, each of which is often assumed to have only a few degrees of freedom. Although these models have achieved great success in explaining the fundamental properties of the instruments, recent experiments using elaborate methods of measurement, such as visualization, have revealed phenomena that cannot be explained by such models. To advance our understanding, more detailed models with many degrees of freedom should be constructed as necessary. The following three different phenomena may be involved in sound production: mechanical oscillation of the reed, fluid dynamics of the airflow, and acoustic resonance of the instrument. Among them, our understanding of fluid dynamics is the most primitive, although it plays a crucial role in linking the sound generator with the acoustic resonator of the instrument. Recent research has also implied that a rigorous treatment of fluid dynamics is necessary for a thorough understanding of the principles of sound production in wind instruments.
Methods for studying the modes of vibration and sound radiation from percussion instruments are reviewed. Recent studies on the acoustics of marimbas, cymbals, gongs, tamtams, lithophones, steelpans, and bells are described. Vibrational modes and sound radiation from the HANG, a new steel percussion instrument, are presented.
Recent research on the acoustics of the piano is reviewed, focusing on the topics presented at the International Symposium on Musical Acoustics in Nara (ISMA2004) and the International Conference on Acoustics in Kyoto (ICA2004), which were held in Japan from late March to the beginning of April 2004. The topics include the secondary partials in piano tones, string excitation by the hammer, and the coupling between the strings, the bridge, and the soundboard. The existence of the secondary partials has been known since the late 1970s, and they were called ‘phantom partials’ in a paper published in 1997.
Absolute pitch (AP) is an ability based on a fixed association between musical pitch and its verbal label. Experiments on AP identification demonstrated the extreme accuracy of AP listeners in identifying pitch, the influences of timbre and pitch range, and a difference in accuracy between white-key notes and black-key notes. However, contrary to the common belief that AP is a component of musical ability, it was found that AP listeners have more difficulty than listeners without AP in perceiving pitch relations in different pitch contexts and in recognizing transposed melodies. These results suggest that AP is irrelevant and even disadvantageous to music. Systematic music training in early childhood seems effective for acquiring AP. Possible genetic contributions to AP are undeniable, but evidence for them is inconclusive. There are several AP-like phenomena that do not reach consciousness: absolute tonality, long-term memory of the pitch of repeatedly heard tunes, specific patterns of pitch comparison in the tritone paradox, and fixed pitch levels in speech. In contrast to true AP, observed as a pitch-naming ability, the implicit AP phenomena are widespread among the general population.
This review describes cross-cultural studies of pitch including intervals, scales, melody, and expectancy, and perception and production of timing and rhythm. Cross-cultural research represents only a small portion of music cognition research yet is essential to i) test the generality of contemporary theories of music cognition; ii) investigate different kinds of musical thought; and iii) increase understanding of the cultural conditions and contexts in which music is experienced. Converging operations from ethology and ethnography to rigorous experimental investigations are needed to record the diversity and richness of the musics, human responses, and contexts. Complementary trans-disciplinary approaches may also minimize bias from a particular ethnocentric view.
It is well known that the mean force due to the Langevin radiation pressure on a sphere freely placed in a plane progressive sound field is always a positive (i.e., repulsive) force. In the case of a spherical diverging field, however, the situation is quite different. At very large distances from the source, the radiation force obeys an inverse square law of repulsion. As the source of the field is approached, the repulsion decreases to zero and then becomes a force of attraction [T. F. W. Embleton, J. Acoust. Soc. Am., 26, 40–45 (1954)]. The present paper discusses the mechanisms for the attracting force acting on a rigid sphere placed freely in a spherical diverging sound field. The distribution of the three components of the radiation force on a sphere (i.e., kinetic energy density K, potential energy density U, and tensor term T that includes momentum flux density) is investigated and compared with the case of a plane progressive sound field. In the cases of grazing incidence of plane waves and spherical waves, it is shown theoretically that the contribution of the tensor term vanishes and the Lagrangian density L=K−U becomes the only cause of radiation force. In the case of incident plane waves, the effect of 〈L〉 vanishes because 〈K〉=〈U〉, where the symbol 〈〉 denotes the time-averaging operation, while an attracting force arises because 〈K〉>〈U〉 in the case of incident spherical waves.
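The relation between 〈K〉 and 〈U〉 stated above can be checked for the incident field alone with a short worked example. This is only a sketch under stated assumptions: it considers a free harmonic field without the scattering sphere, and the symbols ρ₀ (ambient density), c (sound speed), k (wavenumber), r (distance from the source), and A (source amplitude) are introduced here and do not appear in the abstract.

```latex
% Time-averaged energy densities for harmonic fields with complex amplitudes p and v
\begin{align}
  \langle K\rangle &= \tfrac{1}{4}\,\rho_0\,|v|^2, &
  \langle U\rangle &= \frac{|p|^2}{4\rho_0 c^2}, &
  \langle L\rangle &= \langle K\rangle - \langle U\rangle .
\end{align}

% Plane progressive wave: v = p / (\rho_0 c)
\begin{align}
  \langle K\rangle = \frac{|p|^2}{4\rho_0 c^2} = \langle U\rangle
  \quad\Longrightarrow\quad \langle L\rangle = 0 .
\end{align}

% Spherical diverging wave: p = (A/r)\,e^{i(\omega t - kr)},
% and Euler's equation gives v_r = \frac{p}{\rho_0 c}\left(1 + \frac{1}{ikr}\right)
\begin{align}
  \langle K\rangle &= \langle U\rangle\left(1 + \frac{1}{(kr)^2}\right) > \langle U\rangle, \\
  \langle L\rangle &= \frac{\langle U\rangle}{(kr)^2}
                    = \frac{|A|^2}{4\rho_0 c^2 k^2 r^4} .
\end{align}
```

In this free-field picture the excess of kinetic over potential energy density grows as r⁻⁴ toward the source, so it matters only in the near field (small kr) and becomes negligible at large distances, where the spherical wave locally resembles a plane wave.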
“Phase ambiguity” leads to confusion in computational source localization: at high frequencies, where the sound wavelength is shorter than the sensor interval, a cross-spectral phase value measured by two sensors is consistent with multiple source locations. In this paper, a frequency domain algorithm for broadband source localization by two sensors is proposed for solving “phase ambiguity” confusion under actual conditions. Using the overlapped and averaged phase differences of the cross-spectral phases measured over the audible frequency range, as many candidate source azimuths as possible are identified from each phase difference over the azimuth range of ±90°. The frequency-independent azimuths extracted from the multiple azimuths by Hough transformation provide the target source azimuths. The azimuths for two loudspeakers can be identified simultaneously in this way from the phase differences measured over the full audible frequency range within an error of approximately 6° under reverberant conditions. By removing the numerical noise during source azimuth identification, the estimated source distribution corresponds to the diameters of the loudspeakers. When it is necessary to distinguish between near and far sound sources around the microphones, the horizontal azimuths for the sources can be precisely identified from all directions except at approximately ±90° if a judgment of front or back is given.
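The ambiguity and its resolution across frequency can be sketched as follows. This is a simplified illustration, not the algorithm of the paper: it assumes a far-field source, a two-sensor model in which the phase difference equals 2πfd·sin(θ)/c, and it replaces the Hough transformation with a plain histogram vote over azimuth. The function names, the 2° bin width, and the sound speed of 343 m/s are all assumptions.

```python
import math

def wrap_phase(x):
    """Wrap an angle in radians to the interval (-pi, pi]."""
    return math.atan2(math.sin(x), math.cos(x))

def candidate_azimuths(phase, freq, spacing, c=343.0):
    """All azimuths (degrees) consistent with one wrapped phase difference.

    Every integer number n of added 2*pi cycles that keeps |sin(theta)| <= 1
    yields one candidate; above c / (2 * spacing) there can be several.
    """
    n_max = math.ceil(freq * spacing / c) + 1
    azimuths = []
    for n in range(-n_max, n_max + 1):
        s = c * (phase + 2.0 * math.pi * n) / (2.0 * math.pi * freq * spacing)
        if abs(s) <= 1.0:
            azimuths.append(math.degrees(math.asin(s)))
    return azimuths

def vote_azimuth(phases, freqs, spacing, c=343.0, bin_deg=2.0):
    """Histogram vote: the azimuth shared by all frequencies collects the most votes."""
    n_bins = int(round(180.0 / bin_deg))
    votes = [0] * n_bins
    for phase, freq in zip(phases, freqs):
        for az in candidate_azimuths(phase, freq, spacing, c):
            k = min(int((az + 90.0) / bin_deg), n_bins - 1)
            votes[k] += 1
    best = max(range(n_bins), key=votes.__getitem__)
    # Return the centre of the winning bin and the full vote histogram.
    return -90.0 + (best + 0.5) * bin_deg, votes
```

The spurious candidates move with frequency while the true azimuth stays fixed, so accumulating votes over many frequency bins singles out the frequency-independent direction. With two simultaneous sources, one would look for several peaks in the vote histogram instead of only the largest.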
The interference effect of nonspeech and speech in short-term auditory memory was investigated with an experimental paradigm proposed by Deutsch [Science, 168, 1604–1605 (1970)], in which test tones, separated by a 5-s retention interval, were interpolated with six other sounds. In Experiment 1, the test tones were pure tones. The interpolated sounds were pure tones and naturally spoken digits by a female and a male speaker. Nine participants were tested for (1) pure-test-tone pitch recognition, (2) serial recall of the interpolated spoken digits, and (3) both tasks (1) and (2). Pitch recognition errors were significantly increased in task (3) compared to task (1), and the digit recall errors were also significantly increased in task (3). In Experiment 2, the test tones were eight-component harmonic complex tones. The interpolated sounds were eight-component harmonic complex tones and naturally spoken digits by a male speaker. Twelve participants were tested under the same task conditions as in Experiment 1. Significant increases in the errors of pitch recognition and of digit recall were observed when both tasks were required. These results suggest that speech can interfere with tone pitch in short-term auditory memory, and that pitch salience plays a crucial role in the interference.
Measurement of vocal tract area functions from MRI data requires a technique for tooth visualization because the teeth are as transparent as air in the images. In this article, a new method is proposed to accurately superimpose the teeth onto MRI volume data. Upper and lower tooth images with the surrounding bony structure are obtained by scanning a subject holding a contrast medium in the oral cavity. They are superimposed onto the target volume data via a three-dimensional transformation using landmarks sampled from the tooth images and target MRI data. The accuracy of the dental image superimposition is ensured by the minimization of the error volume, which is the mismatch volume of the dental image overlapping the surrounding soft tissue. The method is evaluated using five operators sampling the landmarks. Results show that the error volume is significantly reduced to a nearly constant value regardless of the operator’s skill.
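The error-volume criterion described above can be illustrated with a toy voxel example. This is a heavily simplified sketch and not the procedure of the article: it assumes both the dental image and the soft tissue are given as sets of integer voxel coordinates, and it refines only an integer translation around a landmark-based starting offset, whereas the article uses a full three-dimensional transformation. All names and the voxel size are hypothetical.

```python
from itertools import product

def error_volume(tooth_voxels, tissue_voxels, offset=(0, 0, 0), voxel_mm3=1.0):
    """Volume of the shifted tooth mask that overlaps soft tissue (the mismatch)."""
    dx, dy, dz = offset
    shifted = {(x + dx, y + dy, z + dz) for (x, y, z) in tooth_voxels}
    return len(shifted & tissue_voxels) * voxel_mm3

def refine_offset(tooth_voxels, tissue_voxels, start=(0, 0, 0), radius=1, voxel_mm3=1.0):
    """Search integer shifts near `start` and keep the one with the smallest error volume.

    Steps are tried from the smallest displacement outwards, so among equally
    good offsets the one closest to the landmark-based start is kept.
    """
    steps = sorted(product(range(-radius, radius + 1), repeat=3),
                   key=lambda s: sum(v * v for v in s))
    best_offset = start
    best_volume = error_volume(tooth_voxels, tissue_voxels, start, voxel_mm3)
    for step in steps:
        offset = tuple(s + d for s, d in zip(start, step))
        volume = error_volume(tooth_voxels, tissue_voxels, offset, voxel_mm3)
        if volume < best_volume:
            best_offset, best_volume = offset, volume
    return best_offset, best_volume
```

Starting from the landmark-based estimate keeps the search local, which matters because the overlap alone would also be zero for a tooth image moved far away from the jaw.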