Acoustical Science and Technology
Online ISSN : 1347-5177
Print ISSN : 1346-3969
ISSN-L : 0369-4232
Volume 24, Issue 5
Special issue on Spatial hearing
FOREWORD
INVITED REVIEWS
  • Masanao Ebata
    Article type: Others
    Subject area: Others
    2003 Volume 24 Issue 5 Pages 208-219
    Published: 2003
    Released on J-STAGE: September 01, 2003
    JOURNAL FREE ACCESS
    After more than half a century of research, several factors have been identified as causing the “cocktail party effect,” but the major one is considered to be the spatial separation of the target signal and the interferer. This paper reviews work on the improvement in performance that results from the directional separation of the target signal from interferers when listening in a field or through headphones. The basic assumption concerning the cocktail party effect is that there are one or more interfering sound sources in addition to the target signal source. In this situation it is important to remember the selective attention effect, in which concentrating attention on a specific signal attenuates the interfering sound. Pitch is the simplest cue for selective attention; however, spatial information can also serve as one. The latter half of this review discusses the effect of spatial filtering and of an attention filter in the frequency domain.
  • William L. Martens
    Article type: Others
    Subject area: Others
    2003 Volume 24 Issue 5 Pages 220-232
    Published: 2003
    Released on J-STAGE: September 01, 2003
    JOURNAL FREE ACCESS
    This paper reviews development of spatial auditory display technology based upon 20 years of research evaluating digital filters designed to spatially position auditory images associated with sound sources presented via earphones. The motivation for this review was to attempt to provide clarification regarding some of the issues and assumptions that underlie such research-driven binaural technology development. After a general discussion of research goals and methods, instructive research results are presented to underscore the main points, especially with regard to questions that could serve to stimulate further useful work on directional filter design. These include questions of how best to customize filters for individual users, and, conversely, how to optimize filters for general use. Also considered are the related questions of how best to evaluate the performance of generic Head-Related Transfer Functions (HRTFs), in contrast to those that are measured for the use of a specific individual. At the heart of this review is a focus on the methods that are most appropriate for the evaluation of auditory imagery resulting from synthetic sound spatialization. While a primary goal for binaural synthesis is to spatially position an auditory image, methods typically employed to study the ability of human listeners to spatially localize an actual sound source address only a narrow subset of the issues that are important to the development of adequate spatial auditory display technology.
INVITED PAPER
  • Jonas Braasch, Jens Blauert, Thomas Djelani
    Article type: Others
    Subject area: Others
    2003 Volume 24 Issue 5 Pages 233-241
    Published: 2003
    Released on J-STAGE: September 01, 2003
    JOURNAL FREE ACCESS
    The human ability to localize a direct sound source in the presence of reflected sounds is well known as localization dominance due to the precedence effect, formerly also called “the law of the first wavefront.” The fact that the localization dominance partly fails for signals with a very narrow bandwidth raises the question of whether the localization dominance requires cross-frequency-band interaction. To investigate this, a psychoacoustic experiment was conducted in which the perceived lateralization of a noise burst (±300-μs ITD) in the presence of a reflection (∓300-μs ITD) was measured. The parameters in this experiment were the inter-stimulus interval (ISI) and the bandwidth. The first was varied among the following values: 0.0 ms and then from 1.0 ms in steps of 0.5 ms to 4.0 ms. The bandwidth was adjusted to be either 100 Hz, 400 Hz, or 800 Hz. The data of the listeners clearly show that localization dominance becomes more stable, especially with regard to the dependence on the ISI, when the bandwidth is increased. At the smallest bandwidth under examination (100 Hz), localization dominance cannot be observed for some of the listeners anymore, while others perceive at least the sound coming from the lateral side where the direct sound source is positioned; nevertheless, their auditory event varies strongly with the ISI. A signal analysis reveals that in the first case, the precedence effect does not seem to have any influence. Here, the listeners rather base their judgements on the ongoing part of the sound. The signal analysis also indicates that it is not necessary to assume that the signals in the different frequency bands interact directly with each other while being processed (e.g., cross-frequency coincidence units). It rather seems to be sufficient to average across those bands.
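The lead/lag stimulus described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' stimulus generator: the function names, the 100-kHz sampling rate used in the usage note, and the sign convention (lead arriving first at the left ear, reflection first at the right ear) are assumptions, and the band-limiting to 100, 400, or 800 Hz is left out.

```python
def delay(signal, n, length):
    """Return `signal` delayed by n samples, zero-padded to `length` samples."""
    out = [0.0] * length
    for i, x in enumerate(signal):
        if i + n < length:
            out[i + n] += x
    return out


def lead_lag_stimulus(burst, fs, itd_s=300e-6, isi_s=1.0e-3):
    """Build left/right signals for a lead with +ITD and a lag with -ITD.

    burst : list of samples (the noise burst, already band-limited if needed)
    fs    : sampling rate in Hz (hypothetical parameter)
    itd_s : interaural time difference in seconds
    isi_s : inter-stimulus interval between lead and lag in seconds
    """
    itd = round(itd_s * fs)
    isi = round(isi_s * fs)
    length = len(burst) + isi + itd
    # Lead: reaches the left ear first, the right ear one ITD later.
    lead_left = delay(burst, 0, length)
    lead_right = delay(burst, itd, length)
    # Lag (reflection): starts one ISI later with the opposite ITD.
    lag_left = delay(burst, isi + itd, length)
    lag_right = delay(burst, isi, length)
    left = [a + b for a, b in zip(lead_left, lag_left)]
    right = [a + b for a, b in zip(lead_right, lag_right)]
    return left, right
```

Calling `lead_lag_stimulus(burst, 100_000)` with the default arguments gives a 30-sample ITD and a 100-sample ISI; an ISI of 0.0 ms superimposes lead and lag.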
PAPER
  • Setsu Komiyama, Yasushige Nakayama, Kazuho Ono, Satoru Koizumi
    Article type: Others
    Subject area: Others
    2003 Volume 24 Issue 5 Pages 242-249
    Published: 2003
    Released on J-STAGE: September 01, 2003
    JOURNAL FREE ACCESS
    This paper describes the use of a loudspeaker array to control sound image distance. When every sound wave from the loudspeaker array focuses at a point in front of the listener, a very near sound image is perceived regardless of the playback level in normal rooms. The auditory distance depends on the arrangement and number of loudspeakers. The number of loudspeakers determines the level of direct sound, whereas the interval between loudspeakers affects the total radiation or power of reflections in the room. Because the auditory distance of a sound depends on the ratio between the direct and late parts of the sound, the interval between loudspeakers is, like the number of loudspeakers, a very significant factor in producing near sound images.
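One common way to make the waves from an array arrive at a focal point together is to delay each loudspeaker by its path-length difference to the farthest loudspeaker. The sketch below illustrates that generic delay-based focusing only; the abstract does not state how the authors drive their array, and the function name, coordinates, and the 343 m/s speed of sound are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def focusing_delays(speakers, focus, c=SPEED_OF_SOUND):
    """Return per-loudspeaker delays (s) so all wavefronts reach `focus` together.

    speakers : list of (x, y) positions in metres
    focus    : (x, y) position of the focal point in metres
    """
    distances = [math.dist(p, focus) for p in speakers]
    farthest = max(distances)
    # The farthest loudspeaker plays immediately; nearer ones wait.
    return [(farthest - d) / c for d in distances]
```

For three loudspeakers on a line at x = -1, 0, 1 m and a focus 1 m in front of the centre one, only the centre loudspeaker receives a non-zero delay.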
  • Jun Aoki, Haruhide Hokari, Shoji Shimada
    Article type: Others
    Subject area: Others
    2003 Volume 24 Issue 5 Pages 250-258
    Published: 2003
    Released on J-STAGE: September 01, 2003
    JOURNAL FREE ACCESS
    We examine multi-channel microphone arrangements to achieve precise and stable sound image localization in the horizontal plane when multiple loudspeakers are used. In this paper, six different coincident microphone arrays, each consisting of cardioid microphones pointing in different directions, are tested. We derive equations to model the system and define a system evaluation measure. The sound localization assessment shows that our equations approximately agree with the assessment results, and that the system evaluation measure must suit the microphone arrangement used. These results confirm that while the perception of lateral localization is difficult, three of the six arrays provide good sound localization. Last, we clarify that the coincident microphone array can also provide stable sound localization in multi-channel recording.
  • Makoto Otani, Shiro Ise
    Article type: Others
    Subject area: Others
    2003 Volume 24 Issue 5 Pages 259-266
    Published: 2003
    Released on J-STAGE: September 01, 2003
    JOURNAL FREE ACCESS
    In the binaural method, sound signals are reproduced in each of the ears of a listener by using headphones. Usually, a dummy head microphone is used to record the signal to be reproduced. However, if the sound field to be reproduced does not exist, the dummy head recording method is not usable. In this case, we can produce the virtual sound field by calculating the input signals of the headphones from the head-related transfer functions (HRTFs) and the virtual sound image. Although the HRTFs can be obtained by measuring an impulse response between a loudspeaker and a listener’s ear in an anechoic chamber, it would be very useful if they could be calculated. In many studies that focus on predicting HRTFs by calculation, the boundary element method (BEM), which calculates the HRTFs by solving the wave equation using a computer model of the head, is rather prominent. However, one of the drawbacks of the BEM in obtaining the HRTFs is the considerable amount of calculation time that is required. In this study, we suggest a new method for HRTF calculation which enables significantly enhanced speed by pre-processing an inverse of the coefficient matrix of the BEM.
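The idea of pre-processing an inverse can be illustrated with a toy linear system. If the coefficient matrix A depends only on the head geometry while the right-hand side b changes with the source position, A can be inverted once and each new source then costs a single matrix-vector product. The sketch below assumes that structure and uses a small real-valued matrix; the actual BEM matrices are complex-valued and far larger, and the abstract does not give the authors' formulation.

```python
def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Augment A with the identity matrix: [A | I].
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    # The right half of the augmented matrix now holds the inverse.
    return [row[n:] for row in aug]


def matvec(m, v):
    """Multiply matrix m by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


# Done once for a given (hypothetical) head model.
A = [[4.0, 1.0], [2.0, 3.0]]
A_inv = invert(A)

# Repeated for every source position: one right-hand side, one product.
solutions = [matvec(A_inv, b) for b in ([1.0, 2.0], [0.0, 1.0])]
```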
  • Masayuki Morimoto, Kazuhiro Iida, Motokuni Itoh
    Article type: Others
    Subject area: Others
    2003 Volume 24 Issue 5 Pages 267-275
    Published: 2003
    Released on J-STAGE: September 01, 2003
    JOURNAL FREE ACCESS
    Morimoto and Aokata [J. Acoust. Soc. Jpn. (E), 5, 165–173 (1984)] clarified that the same directional bands observed on the median plane by Blauert occur in any sagittal plane parallel to the median plane. Based upon this observation, they hypothesized that the spectral cues that help to determine the vertical angle of a sound image may function commonly in any sagittal plane. If this hypothesis is credible, sound localization in any direction might be simulated by using head-related transfer functions (HRTFs) measured on the median plane to determine the vertical angle, and by using frequency-independent interaural differences to determine the lateral angle. In this paper, a localization test was performed to evaluate the hypothesis, and to examine a simulation method based on the hypothesis. For this test, stimuli simulating HRTFs measured on the median sagittal plane combined with interaural differences measured on the frontal horizontal plane were presented to the subjects. The results supported the hypothesis and confirmed that the experimental simulation was not only possible, but also quite effective in controlling sound image location.
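The simulation method described above combines a median-plane spectral filter with frequency-independent interaural differences. A minimal sketch of that combination follows, assuming a single median-plane impulse response shared by both ears, an interaural time difference applied as an integer sample delay, and an interaural level difference applied entirely as attenuation of the far ear. These choices and all names are illustrative assumptions, not the authors' procedure.

```python
def convolve(x, h):
    """Direct-form convolution of two sample lists."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, a in enumerate(x):
        for j, b in enumerate(h):
            out[i + j] += a * b
    return out


def spatialize(signal, median_hrir, itd_samples, ild_db):
    """Return (near_ear, far_ear) signals.

    median_hrir : impulse response measured on the median plane (vertical-angle cue)
    itd_samples : frequency-independent delay of the far ear, in samples
    ild_db      : frequency-independent attenuation of the far ear, in dB
    """
    filtered = convolve(signal, median_hrir)
    pad = [0.0] * itd_samples
    far_gain = 10.0 ** (-ild_db / 20.0)
    near = filtered + pad                            # undelayed, full level
    far = pad + [far_gain * v for v in filtered]     # delayed and attenuated
    return near, far
```

Both outputs have the same length, so they can be written directly to a two-channel buffer, with the near ear assigned to the side of the intended lateral angle.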
  • Shouichi Takane, Yôiti Suzuki, Tohru Miyajima, Toshio Sone
    Article type: Others
    Subject area: Others
    2003 Volume 24 Issue 5 Pages 276-283
    Published: 2003
    Released on J-STAGE: September 01, 2003
    JOURNAL FREE ACCESS
    A new theory for high definition Virtual Acoustic Display (VAD) based on a model called the “Virtual Sphere (VS) model” is introduced in this paper. This method is named ADVISE (Acoustic Display based on the VIrtual SpherE model). In ADVISE, a sphere-shaped boundary is defined around a listener, and the sound transmission from the sound source in the original field to the entrances of the listener’s ears is divided into two parts. One consists of Head-Related Transfer Functions (HRTFs) corresponding to the points on the boundary of the virtual sphere, and the other consists of Room Transfer Functions (RTFs) from the sound source to the points on the boundary. Then, these two kinds of transfer functions are convolved in real-time, taking into account the dynamic changes in these functions due to the listener’s head movement. The Kirchhoff-Helmholtz boundary integral equation is the theoretical basis of this idea. This equation states that a sound field generated by sound sources outside a certain closed boundary can be synthesized by phantom monopole and dipole sources distributed on the boundary. In this paper, the theory of ADVISE is stated, then the features of ADVISE are described.
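The two-part transmission path in ADVISE can be pictured as a sum over boundary points: the source is filtered by the room response to each point, then by the head-related response from that point to the ear, and the contributions are added. The sketch below shows only that structure. It collapses the monopole and dipole terms and any surface weighting of the Kirchhoff-Helmholtz integral into a single impulse response per point, ignores head movement, and uses hypothetical names throughout.

```python
def convolve(x, h):
    """Direct-form convolution of two sample lists."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, a in enumerate(x):
        for j, b in enumerate(h):
            out[i + j] += a * b
    return out


def ear_signal(source, room_irs, head_irs):
    """Sum the source-to-boundary-to-ear paths over all boundary points.

    room_irs : one room impulse response per boundary point (source -> point)
    head_irs : one head-related impulse response per point (point -> ear)
    """
    paths = [convolve(convolve(source, room), head)
             for room, head in zip(room_irs, head_irs)]
    total = [0.0] * max(len(p) for p in paths)
    for path in paths:
        for i, v in enumerate(path):
            total[i] += v
    return total
```

Because the room part does not depend on head orientation, only `head_irs` would need to change when the listener turns, which is one way to read the separation the abstract describes.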
  • Mitsuo Matsumoto, Mikio Tohyama, Hirofumi Yanagawa
    Article type: Others
    Subject area: Others
    2003 Volume 24 Issue 5 Pages 284-292
    Published: 2003
    Released on J-STAGE: September 01, 2003
    JOURNAL FREE ACCESS
    A previously introduced method of interpolating binaural impulse responses and an algorithm to simulate a moving sound image were evaluated objectively. The method interpolates the responses taking into account the arrival time difference due to changes in the direction of a moving sound source. For an angular interval of 15°, the average SDR of our method (23 dB) was larger than that of the simple method (9.9 dB). The variances of the SDR values showed that our method interpolated the responses more independently of the azimuth of the sound source than the simple method did. The responses interpolated using our method changed smoothly as the source direction changed. We evaluated the algorithm by comparing a moving sound image simulated using the algorithm with an actual moving sound image recorded using a rotating dummy head and with a moving sound image simulated using a conventional method. The spectrogram of the binaural signal of the moving sound image was examined, and no ripples were seen.
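Why the arrival time matters for interpolation can be seen in a toy comparison. The sketch below aligns two impulse responses on their strongest sample before mixing them, then compares the result with plain sample-by-sample mixing using a signal-to-deviation ratio. It is an illustration under stated assumptions, not the authors' method: the peak-based arrival estimate, the integer shifts, and the SDR definition (reference energy over error energy, in dB) are all assumed here.

```python
import math


def shift(h, n):
    """Shift h by n samples (positive = later), keeping its length."""
    out = [0.0] * len(h)
    for i, x in enumerate(h):
        if 0 <= i + n < len(h):
            out[i + n] = x
    return out


def arrival(h):
    """Estimate the arrival time as the index of the largest magnitude."""
    return max(range(len(h)), key=lambda i: abs(h[i]))


def interp_simple(h1, h2, w):
    """Plain linear interpolation, sample by sample."""
    return [(1.0 - w) * a + w * b for a, b in zip(h1, h2)]


def interp_aligned(h1, h2, w):
    """Interpolate the arrival time first, then mix the time-aligned responses."""
    d1, d2 = arrival(h1), arrival(h2)
    d = round((1.0 - w) * d1 + w * d2)
    return interp_simple(shift(h1, d - d1), shift(h2, d - d2), w)


def sdr_db(reference, estimate):
    """Signal-to-deviation ratio in dB (infinite when the estimate is exact)."""
    signal = sum(r * r for r in reference)
    error = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return math.inf if error == 0.0 else 10.0 * math.log10(signal / error)


# Toy responses: the same pulse arriving at samples 2, 4 (true midpoint), and 6.
h_a = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
h_mid = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
h_b = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0]

sdr_simple = sdr_db(h_mid, interp_simple(h_a, h_b, 0.5))
sdr_aligned = sdr_db(h_mid, interp_aligned(h_a, h_b, 0.5))
```

In this toy case the plain mix produces two half-height pulses at the wrong times, while the aligned mix reproduces the midpoint response.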
  • Jonas Braasch, Jens Blauert
    Article type: Others
    Subject area: Others
    2003 Volume 24 Issue 5 Pages 293-303
    Published: 2003
    Released on J-STAGE: September 01, 2003
    JOURNAL FREE ACCESS
    In this investigation, different model algorithms were tested for their ability to simulate the precedence effect for ongoing (non-impulsive) noise-bursts of different bandwidths (100 Hz, 400 Hz and 800 Hz). The psychoacoustical reference data, in which the perceived lateral position of a noise burst (200-ms duration, 20-ms cos²-ramps, 500-Hz center frequency) in the presence of one reflection (inter-stimulus interval: 0.0 ms–0.4 ms) was determined, were taken from a preceding paper on this investigation. It is shown that models which simulate the precedence effect by using the special characteristics of the auditory periphery or by focusing on the spectral dominance region fail when stimuli of longer duration than clicks are used, while a modified Lindemann model still shows satisfactory results. Furthermore, it was found that the trading ratio of ITDs and ILDs can be assumed to be constant for the stimuli tested. A discounting of ITD cues, as was found by Rakerd and Hartmann (1985), was not observed for the type of stimuli tested here. Instead, a discounting of the precedence effect occurred for some of the listeners when the bandwidth of the signals was very narrow. In the model simulation, it was not necessary to consider cross-frequency-band interaction like the second coincidence weighting of Stern et al. (1988), and it was sufficient to estimate the average of the outputs of the involved frequency bands.
TECHNICAL REPORTS
  • Shouichi Takane, Shusuke Takahashi, Yôiti Suzuki, Tohru Miyajima
    Article type: Others
    Subject area: Others
    2003 Volume 24 Issue 5 Pages 304-310
    Published: 2003
    Released on J-STAGE: September 01, 2003
    JOURNAL FREE ACCESS
    A new theory for high definition Virtual Acoustic Display (VAD) based on a model called the “Virtual Sphere (VS) model” is introduced in this paper. This method is named ADVISE (Acoustic Display based on the VIrtual SpherE model). In ADVISE, a sphere-shaped boundary is defined around a listener, and the sound transmission from the sound source in the original field to the entrances of the listener’s ears is divided into two parts. One consists of Head-Related Transfer Functions (HRTFs) corresponding to the points on the boundary of the virtual sphere, and the other consists of Room Transfer Functions (RTFs) from the sound source to the points on the boundary. Then, these two kinds of transfer functions are convolved in real-time, taking into account the dynamic changes in these functions due to the listener’s head movement. The Kirchhoff-Helmholtz boundary integral equation is the theoretical basis of this idea. This equation states that a sound field generated by sound sources outside a certain closed boundary can be synthesized by phantom monopole and dipole sources distributed on the boundary. In this paper, the theory of ADVISE is stated, then the features of ADVISE are described.
ACOUSTICAL LETTERS