This paper derives a continuous-space model to describe variations in the magnitude of complex head-related transfer functions (HRTFs) along angles and radial distances throughout the horizontal plane. The radial part of this model defines a set of horizontal-plane distance-varying filters (HP-DVFs) that are used to synthesize the HRTFs for arbitrary sound source positions on the horizontal plane from initial HRTFs obtained for positions on a circular boundary at a single distance from the head of a listener. The HP-DVFs are formulated in terms of horizontal-plane solutions to the three-dimensional acoustic wave equation, which are derived by assuming invariance along elevation angles in spherical coordinates. This avoids the inaccurate free-field distance decay observed when assuming invariance along height in cylindrical coordinates. It also avoids the discontinuities along the axis connecting the ears that appear when assuming invariance along the polar angle in interaural coordinates. This paper also presents a magnitude-dependent band-limiting threshold (MBT) for restricting the action of the filters to a limited angular bandwidth, which is necessary in practice to enable discrete-space models that consider a finite number of sources distributed on the initial circle. Numerical experiments using a model of a human head show that the overall synthesis accuracy achieved with the proposed MBT is higher than that achieved with the existing frequency-dependent threshold, especially at low frequencies and close distances to the head.
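The difference between the two coordinate assumptions can be illustrated with textbook far-field amplitude laws: a free-field point source decays as 1/r, whereas a field that is invariant along height behaves like that of a line source and decays asymptotically as 1/sqrt(r). The Python sketch below compares the level drop per doubling of distance under each law. It is only an illustration: the function name `level_drop_db`, the chosen distances, and the use of asymptotic power laws instead of the HP-DVFs themselves are assumptions, not part of the paper.

```python
import math


def level_drop_db(r_near: float, r_far: float, exponent: float) -> float:
    """Level drop in dB from r_near to r_far for an amplitude law ~ r**(-exponent).

    exponent = 1.0 models spherical spreading (point source, 1/r).
    exponent = 0.5 models cylindrical spreading (line source, 1/sqrt(r)).
    """
    if r_near <= 0 or r_far <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * exponent * math.log10(r_far / r_near)


# Hypothetical distances: moving a source from 0.5 m to 1.0 m from the head.
spherical_drop = level_drop_db(0.5, 1.0, exponent=1.0)
cylindrical_drop = level_drop_db(0.5, 1.0, exponent=0.5)

print(f"spherical spreading:   {spherical_drop:.2f} dB per doubling")
print(f"cylindrical spreading: {cylindrical_drop:.2f} dB per doubling")
```

Under these assumptions the height-invariant law loses only half as many decibels per doubling of distance as the free-field point-source law, which is the kind of mismatch that the spherical-coordinate formulation is meant to avoid.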
A minimal model is presented that explains the intonation anomaly, or pitch sharpening, sometimes found in baroque flutes, recorders, shakuhachis, etc. played with cross-fingering. In this model, two bores above and below an open tone hole are coupled through the hole. This coupled system has two resonance frequencies ω±, which are respectively higher and lower than those of the upper and lower bores, ωU and ωL, excited independently. The ω± differ even if ωU = ωL. The normal effect of cross-fingering, i.e., pitch flattening, corresponds to excitation of the ω−-mode, which occurs when ωL ⪆ ωU and the admittance peak of the ω−-mode is higher than or as high as that of the ω+-mode. Excitation of the ω+-mode yields the intonation anomaly. This occurs when ωL ⪅ ωU and the peak of the ω+-mode becomes sufficiently high. With an extended model having three degrees of freedom, pitch bending of the recorder played with cross-fingering in the second register has been reasonably explained.
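The frequency splitting described above is the generic behavior of two linearly coupled resonators. As a rough sketch, assume the squared mode frequencies are the eigenvalues of the symmetric matrix [[ωU², κ], [κ, ωL²]], where κ is a hypothetical coupling constant standing in for the tone hole. This is a textbook coupled-oscillator form, not necessarily the formulation used in the paper, and the function name and numerical values are likewise illustrative.

```python
import math


def coupled_mode_frequencies(omega_u: float, omega_l: float, kappa: float):
    """Return (omega_minus, omega_plus) for two coupled resonators.

    Assumes the squared frequencies are the eigenvalues of
    [[omega_u**2, kappa], [kappa, omega_l**2]] (generic model, see lead-in).
    """
    a, b = omega_u ** 2, omega_l ** 2
    if kappa ** 2 >= a * b:
        raise ValueError("coupling too strong: lower mode would be non-oscillatory")
    mean = 0.5 * (a + b)
    # Half the distance between the two eigenvalues of the 2x2 matrix.
    split = math.hypot(0.5 * (a - b), kappa)
    return math.sqrt(mean - split), math.sqrt(mean + split)


# Hypothetical example: both bores tuned to 440 Hz, weak coupling.
omega = 2.0 * math.pi * 440.0
kappa = 0.05 * omega ** 2
omega_minus, omega_plus = coupled_mode_frequencies(omega, omega, kappa)

print(f"uncoupled: {omega / (2 * math.pi):.1f} Hz")
print(f"omega-:    {omega_minus / (2 * math.pi):.1f} Hz")
print(f"omega+:    {omega_plus / (2 * math.pi):.1f} Hz")
```

Even with identical bore frequencies, any nonzero κ pushes one mode below and the other above the uncoupled value, which matches the statement that the ω± differ when ωU = ωL. Which of the two modes actually sounds depends on the admittance peaks, which this sketch does not model.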
Binaural systems are a promising class of three-dimensional (3D) auditory displays for high-definition personal 3D audio devices. They synthesize the sound pressure signals at the ears of a listener, namely the binaural signals, by means of head-related transfer functions (HRTFs). Rigid spherical microphone arrays (RSMAs) are widely used to capture sound pressure fields for binaural presentation to multiple listeners. However, the spatial resolution that RSMAs need for accurate binaural reproduction has not been studied in detail. The aim of this paper is to address this issue objectively. We evaluate the spatial accuracy of binaural signals synthesized from the recordings of RSMAs with different numbers of microphones, using a model of a human head. We find that the synthesis of spectral cues is accurate up to a maximum frequency determined by the number of microphones. Nevertheless, we also identify a limit beyond which adding more microphones does not improve overall accuracy. This limit is higher for the interaural spectral cues than for the monaural ones.
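The link between microphone count and maximum frequency can be approximated with a rule of thumb that is common in spherical array processing: Q microphones support spherical harmonics up to the largest order N satisfying (N + 1)² ≤ Q, and the captured field is usually considered reliable for ka ≲ N, where k is the wavenumber and a the array radius. The Python sketch below applies that rule. The function names, the array radius, and the speed of sound are assumptions for illustration, and the rule itself is a general heuristic, not the accuracy criterion evaluated in the paper.

```python
import math


def max_order(num_mics: int) -> int:
    """Largest spherical harmonic order N with (N + 1)**2 <= num_mics."""
    if num_mics < 1:
        raise ValueError("at least one microphone is required")
    return math.isqrt(num_mics) - 1


def max_frequency_hz(num_mics: int, radius_m: float, c: float = 343.0) -> float:
    """Approximate upper frequency from the heuristic k * a <= N.

    radius_m is the array radius in meters; c is the speed of sound in m/s
    (343 m/s is an assumed value for air at room temperature).
    """
    if radius_m <= 0:
        raise ValueError("radius must be positive")
    return max_order(num_mics) * c / (2.0 * math.pi * radius_m)


# Hypothetical arrays of increasing size, all with an assumed 8.5 cm radius.
for q in (16, 64, 252):
    print(q, max_order(q), round(max_frequency_hz(q, 0.085)))
```

Under this heuristic the usable bandwidth grows only with the square root of the number of microphones, so large increases in channel count yield progressively smaller gains in maximum frequency.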