Acoustical Science and Technology

PAPERS

Dominant region for pitch at low fundamental frequencies (F0): The effect of fundamental frequency, phase and temporal structure

Hiromitsu Miyazono, Brian R. Glasberg, Brian C. J. Moore

2009 Volume 30 Issue 3 Pages 161-169
Published: May 01, 2009
Released on J-STAGE: May 01, 2009

DOI: https://doi.org/10.1250/ast.30.161

JOURNAL FREE ACCESS

When a complex tone contains many harmonics, its pitch is usually determined by harmonics in a restricted frequency region, called the “dominant region,” which for fundamental frequencies (F0s) ≥ 100 Hz corresponds to low, resolved harmonics. We estimated the dominant region for tones with low F0, by measuring thresholds, F0DLs, for detecting a change in F0 of a group of harmonics embedded within harmonics with fixed F0. The spectral position of the shifted group was systematically varied. Components were added in either cosine or random phase. For F0s of 35 and 50 Hz, the position of the dominant region depended strongly on the relative phases of the components. When the envelope had a low peak factor, with multiple peaks per period (random phase), the dominant region fell at low harmonic numbers (for F0=50 Hz), or was not well defined (for F0=35 Hz). When the envelope had a high peak factor, with one peak per period (cosine phase), the dominant region fell at high harmonic numbers, where harmonics were unresolved. Generally, performance was better for cosine than for random phase. The results indicate that harmonics in the dominant region are not always resolved.
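
The effect of component phase on peak factor described in this abstract can be illustrated with a minimal sketch. It assumes equal-amplitude harmonics 1 to 40, a 16 kHz sampling rate, and the waveform peak factor (peak divided by RMS) as a simple stand-in for the envelope peak factor. These values and the function names are illustrative assumptions, not taken from the paper.

```python
import math
import random


def harmonic_complex(f0, harmonics, phases, fs=16000):
    """One period of a sum of equal-amplitude cosines at multiples of f0."""
    n = int(round(fs / f0))  # samples in one period (assumes fs / f0 is an integer)
    return [
        sum(math.cos(2 * math.pi * h * f0 * t / fs + p)
            for h, p in zip(harmonics, phases))
        for t in range(n)
    ]


def peak_factor(x):
    """Peak absolute value divided by the RMS value of the waveform."""
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return max(abs(v) for v in x) / rms


f0 = 50.0                       # one of the F0s named in the abstract
harmonics = list(range(1, 41))  # assumed harmonic range, for illustration only

# Cosine phase: all components peak together once per period.
cosine_phases = [0.0] * len(harmonics)

# Random phase: starting phases drawn uniformly (seeded for repeatability).
rng = random.Random(0)
random_phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in harmonics]

pf_cos = peak_factor(harmonic_complex(f0, harmonics, cosine_phases))
pf_rand = peak_factor(harmonic_complex(f0, harmonics, random_phases))

print(f"cosine phase peak factor: {pf_cos:.2f}")
print(f"random phase peak factor: {pf_rand:.2f}")
```

For N equal-amplitude components in cosine phase, the peak is N and the RMS is sqrt(N/2), so the peak factor is sqrt(2N). Random phase leaves the RMS unchanged but lowers the peak.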

A flexible spectral modification method based on temporal decomposition and Gaussian mixture model

Binh Phu Nguyen, Masato Akagi

2009 Volume 30 Issue 3 Pages 170-179
Published: May 01, 2009
Released on J-STAGE: May 01, 2009

DOI: https://doi.org/10.1250/ast.30.170

JOURNAL FREE ACCESS

Manipulating spectral structure often degrades speech quality, mainly because of insufficient smoothness of the modified spectra between frames and ineffective spectral modification. This paper presents a new spectral modification method to improve the quality of modified speech. If frames are processed independently, discontinuous features may be generated. Therefore, a speech analysis technique called temporal decomposition (TD), which decomposes speech into event targets and event functions, is used to model the spectral evolution effectively. Instead of modifying the speech spectra frame by frame, we only need to modify event targets and event functions. This makes the speech spectra easy to modify, and the smoothness of the modified speech is ensured by the shape of the event functions. To improve spectral modification, we explore Gaussian mixture model parameters (spectral-GMM parameters) to model the spectral envelope of each event target, and develop a new algorithm for modifying spectral-GMM parameters in accordance with formant scaling factors. We first evaluate the effectiveness of the proposed method in spectral modeling, and then apply it to two areas that require different amounts of spectral modification: emotional speech synthesis and voice gender conversion. Experimental results verify the effectiveness of the proposed method for both spectral modeling and spectral modification.

High speed, high resolution ultrasonic linear motor using V-shape two bolt-clamped Langevin-type transducers

Kazumasa Asumi, Ryouichi Fukunaga, Takeshi Fujimura, Minoru Kuribayash ...

2009 Volume 30 Issue 3 Pages 180-186
Published: May 01, 2009
Released on J-STAGE: May 01, 2009

DOI: https://doi.org/10.1250/ast.30.180

JOURNAL FREE ACCESS

An ultrasonic motor using two bolt-clamped Langevin-type transducers is described. A rigorous optimization of the motor’s structure was conducted, and its results are reported with regard to various motor parameters. Based on FEM analysis and experimental results, it was established that the symmetric and anti-symmetric resonance frequencies could be matched by adjusting the mass of the tip of the motor’s head block. The driving voltage of the motor was reduced by using stacked multi-layered piezo-elements. The motor fabricated in this study achieved a velocity of more than 1.5 m/s and a force of 25 N under one condition. However, a velocity of less than 100 mm/s could not be achieved using conventional resonance driving. For velocities lower than 1 mm/s, driving was achieved by “inertial driving.” A resolution of 1.5 nm was observed using DC driving.

Efficient array design algorithm for wide-band application of the MUltiple SIgnal Classification algorithm

Ahmed Desoki, Jun-ichi Takada, Ichiro Hagiwara

2009 Volume 30 Issue 3 Pages 187-198
Published: May 01, 2009
Released on J-STAGE: May 01, 2009

DOI: https://doi.org/10.1250/ast.30.187

JOURNAL FREE ACCESS

This paper analyzes the error in MUSIC results caused by finite-precision arithmetic. The relation of this error to the source correlation level and to the array and source configuration parameters is clearly identified. As a result, an efficient array design algorithm suitable for acoustic environments is derived. The algorithm is efficient in the sense that it can determine the minimum number of sensors. It is also quite general, as it accounts for the effect of all parameters, such as the number of sources, source correlation level, maximum resolution, maximum source angle, number of sensors, sensor spacing, and arithmetic precision. The algorithm is further shown to be seamlessly applicable in realistic environments, where many additional effects and sources of error often exist. The paper shows that this algorithm is indispensable for DOA estimation in wide-band and reverberant environments.

Computational acoustic vision by solving phase ambiguity confusion

Ryuichi Shimoyama, Ken Yamazaki

2009 Volume 30 Issue 3 Pages 199-208
Published: May 01, 2009
Released on J-STAGE: May 01, 2009

DOI: https://doi.org/10.1250/ast.30.199

JOURNAL FREE ACCESS

Computational acoustic vision by solving phase ambiguity confusion (CAVSPAC) is proposed for two-dimensional color imaging, in the manner of pointillism, in a broadband sound environment. The 2D distributions of equivalent point sources were identified as an image from the cross-power spectral phases of sound pressure measured by two pairs of microphones. Each point source was assigned a color corresponding to its frequency. Multiple source locations are derived from one cross-spectral phase value because of “phase ambiguity” at high frequencies, when the microphone interval is wider than the sound wavelengths. The true source location was extracted from the multiple source locations as the one that is frequency independent. The broadband noise source was visualized with a single two-way loudspeaker set at various positions in a reverberant room. Using CAVSPAC, the 2D image of the broadband sound source could be identified from all directions spherically, except in the area just beside, above, and under the microphones. A microphone interval moderately wider than the sound wavelengths led to better resolution in the source image.
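
The phase ambiguity mentioned in this abstract, and the idea of keeping the frequency-independent solution, can be sketched for a single microphone pair. This is a simplified far-field, one-dimensional illustration and not the CAVSPAC method itself. The sound speed, microphone spacing, frequencies, and all function names are assumptions chosen for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def wrapped_phase(angle, freq, spacing, c=SPEED_OF_SOUND):
    """Cross-spectral phase, wrapped to (-pi, pi], for a far-field source."""
    tau = spacing * math.sin(angle) / c  # inter-microphone delay in seconds
    phi = 2.0 * math.pi * freq * tau
    return math.atan2(math.sin(phi), math.cos(phi))


def candidate_angles(phase, freq, spacing, c=SPEED_OF_SOUND):
    """All arrival angles (radians) consistent with one wrapped phase value."""
    k_max = int(math.floor(spacing * freq / c + 0.5)) + 1
    angles = []
    for k in range(-k_max, k_max + 1):
        # Each integer k adds one full cycle to the unwrapped phase.
        tau = (phase + 2.0 * math.pi * k) / (2.0 * math.pi * freq)
        s = c * tau / spacing
        if -1.0 <= s <= 1.0:  # keep only physically possible delays
            angles.append(math.asin(s))
    return angles


def consistent_angle(phases_by_freq, spacing, c=SPEED_OF_SOUND):
    """Pick the candidate that reappears at every frequency."""
    freqs = sorted(phases_by_freq)
    cands = {f: candidate_angles(phases_by_freq[f], f, spacing, c) for f in freqs}

    def mismatch(a):
        # Sum of distances to the nearest candidate at each other frequency.
        return sum(min(abs(a - b) for b in cands[f]) for f in freqs[1:])

    return min(cands[freqs[0]], key=mismatch)


true_angle = 0.4  # radians, hypothetical source direction
spacing = 0.3     # metres, wider than the wavelength at these frequencies
measured = {f: wrapped_phase(true_angle, f, spacing)
            for f in (2000.0, 3100.0, 4300.0)}

n_candidates = len(candidate_angles(measured[2000.0], 2000.0, spacing))
estimate = consistent_angle(measured, spacing)
print(n_candidates, round(estimate, 6))
```

The spurious candidates shift with frequency because their spacing in sin(angle) is c / (spacing × frequency), while the true direction stays fixed. That is why comparing candidates across frequencies singles it out.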

Temporal patterns of auditory signals for electric consumer products: Comparison of judgments by young and older adults in four countries

Kenji Kurakata, Tazu Mizunami, Daryle J. Gardner-Bonneau, Se Jin Park, ...

2009 Volume 30 Issue 3 Pages 209-215
Published: May 01, 2009
Released on J-STAGE: May 01, 2009

DOI: https://doi.org/10.1250/ast.30.209

JOURNAL FREE ACCESS

Auditory signals are often used in human-machine interfaces of electric consumer products to inform the user of the state of operation. The signals are expected to enhance the usability of products, especially for older adults who are not accustomed to using such products. Kurakata et al. [Acoust. Sci. & Tech., 29, 176–184 (2008)] reported experimental results related to temporal patterns of auditory signals for electric home appliances on which a Japanese Industrial Standard (JIS S 0013:2002) was based. However, all participants in their experiment were residents of Japan. Therefore, it remains unclear whether the information that the auditory-signal patterns convey can be understood unambiguously by people in other countries who have different cultural backgrounds and who use products that have different interface designs from those sold on the Japanese market. This paper presents results of an experiment in which American, German, and Korean listeners participated and evaluated auditory signals, employing a similar procedure to that of the study by Kurakata et al. By comparing their judgments to those by Japanese listeners, internationally acceptable temporal patterns of auditory signals are proposed.

ACOUSTICAL LETTERS

Comparison of sound localization performance between virtual and real three-dimensional immersive sound field

Dae-Gee Kang, Yukio Iwaya, Ryota Miyauchi, Yôiti Suzuki

2009 Volume 30 Issue 3 Pages 216-219
Published: May 01, 2009
Released on J-STAGE: May 01, 2009

DOI: https://doi.org/10.1250/ast.30.216

JOURNAL FREE ACCESS

Ultrasonic distance and velocity measurement by low-calculation-cost Doppler-shift compensation and high-resolution Doppler velocity estimation with wide measurement range

Shinnosuke Hirata, Minoru Kuribayashi Kurosawa, Takashi Katagiri

2009 Volume 30 Issue 3 Pages 220-223
Published: May 01, 2009
Released on J-STAGE: May 01, 2009

DOI: https://doi.org/10.1250/ast.30.220

JOURNAL FREE ACCESS
