Acoustical Science and Technology

PAPERS

Speech recognition based on statistical models including multiple phonetic decision trees

Sayaka Shiota, Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Akinobu Le ...

2011 Volume 32 Issue 6 Pages 236-243
Published: November 01, 2011
Released on J-STAGE: November 01, 2011

DOI: https://doi.org/10.1250/ast.32.236

JOURNAL FREE ACCESS

We propose a speech recognition technique using multiple model structures. When context-dependent models are used, decision-tree-based context clustering is applied to find an appropriate parameter tying structure. However, context clustering is usually performed on the basis of unreliable statistics of hidden Markov model (HMM) state sequences, because estimating reliable state sequences requires an appropriate model structure, which cannot be obtained prior to context clustering. Context clustering and the estimation of state sequences therefore cannot be performed independently. To overcome this problem, we propose a technique for optimizing state sequences based on an annealing process using multiple decision trees. In this technique, a new likelihood function is defined in order to treat multiple model structures, and the deterministic annealing expectation-maximization (DAEM) algorithm is used as the training algorithm. Experimental continuous phoneme recognition results show that the proposed method, using only two decision trees, achieved a relative error reduction of about 11.1% over the conventional method.
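
The annealing idea behind deterministic annealing EM can be illustrated on a toy problem. The sketch below is not the authors' HMM and decision-tree procedure: it estimates only the two means of a one-dimensional Gaussian mixture, assuming equal mixture weights and a fixed shared variance. The function name `daem_two_means`, the temperature schedule `betas`, and the iteration count are hypothetical choices made for illustration. The posterior in the E-step is raised to the power beta, which is increased step by step to 1, where the update becomes the ordinary EM update.

```python
import math


def daem_two_means(data, init_means, variance=1.0,
                   betas=(0.2, 0.4, 0.6, 0.8, 1.0), iters_per_beta=20):
    """Toy deterministic annealing EM for the means of a 1-D mixture.

    Assumptions (not from the paper): equal mixture weights and a fixed,
    shared variance, so only the component means are re-estimated.
    """
    means = list(init_means)
    for beta in betas:                      # anneal: beta rises toward 1
        for _ in range(iters_per_beta):
            sums = [0.0] * len(means)
            counts = [0.0] * len(means)
            for x in data:
                # Tempered E-step: posterior proportional to likelihood**beta.
                # Shared constants cancel in the normalization below.
                scores = [math.exp(-0.5 * beta * (x - m) ** 2 / variance)
                          for m in means]
                total = sum(scores)
                for j, score in enumerate(scores):
                    resp = score / total
                    counts[j] += resp
                    sums[j] += resp * x
            # M-step: responsibility-weighted means.
            means = [s / c for s, c in zip(sums, counts)]
    return means


# Two well-separated clusters around -2 and +2 (made-up data).
data = [-2.2, -2.1, -2.0, -1.9, -1.8, 1.8, 1.9, 2.0, 2.1, 2.2]
estimated = daem_two_means(data, init_means=[-0.5, 0.5])
```

At small beta the responsibilities are nearly uniform and the two means are pulled toward each other; as beta grows they separate and settle close to the cluster centers.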

Evaluating small end-fire loudspeaker array under various reverberations

Yoichi Haneda, Ken’ichi Furuya, Akitoshi Kataoka

2011 Volume 32 Issue 6 Pages 244-250
Published: November 01, 2011
Released on J-STAGE: November 01, 2011

DOI: https://doi.org/10.1250/ast.32.244

JOURNAL FREE ACCESS

A compact end-fire loudspeaker array based on the linearly constrained minimum variance (LCMV) beamformer was investigated in rooms with various amounts of reverberation. In the LCMV beamformer, the coherent matrix, which determines the directions in which radiated sound is suppressed, is important. We therefore evaluated the directivities of a loudspeaker array for four different coherent matrices, each calculated so that the array does not radiate sound (I) in any direction except the reproduction direction, (II) in the directions of several measured room transfer functions, (III) in the directions of several ideal (free-field) room transfer functions, or (IV) in the directions of the weighted ideal room transfer functions. To evaluate performance, a 4-element line array of 28-mm-diameter loudspeakers with an equal spacing of 48 mm was implemented. The array used loudspeaker units without enclosures to reduce the spacing between the units. The directivities of the prototype loudspeaker array were measured in an anechoic room and three reverberant rooms using the above four methods. For all methods, the prototype array achieved a directivity 6 to 15 dB higher than that of a single loudspeaker over a wide frequency band (300 Hz to 3.4 kHz). Moreover, the experimental results showed that the array with method (IV), which did not use previously measured room transfer functions, performed well in comparison with the array with method (II), especially when the reverberation time was 120 ms or less.
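
As a rough illustration of the LCMV principle, the sketch below computes weights for a 4-element line array with 48 mm spacing under strong simplifying assumptions that are not taken from the paper: ideal free-field point sources, a speed of sound of 343 m/s, an identity matrix in place of the coherent matrix, and only two constraints (unit response on the end-fire axis, a null in the opposite direction). With an identity matrix the LCMV solution reduces to w = C (C^H C)^{-1} f, which for two constraints needs only a 2x2 inverse. All function names are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0   # m/s, assumed
NUM_ELEMENTS = 4         # from the abstract
SPACING_M = 0.048        # 48 mm spacing, from the abstract


def steering_vector(angle_rad, freq_hz):
    """Far-field phase factors for each element; angle measured from the array axis."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    return [cmath.exp(1j * k * n * SPACING_M * math.cos(angle_rad))
            for n in range(NUM_ELEMENTS)]


def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def lcmv_weights_identity_cov(a_pass, a_null):
    """Minimum-norm weights with response 1 toward a_pass and 0 toward a_null."""
    g11 = inner(a_pass, a_pass)
    g12 = inner(a_pass, a_null)
    g21 = inner(a_null, a_pass)
    g22 = inner(a_null, a_null)
    det = g11 * g22 - g12 * g21
    # First column of the inverse of the 2x2 Gram matrix (constraint vector f = [1, 0]).
    c_pass = g22 / det
    c_null = -g21 / det
    return [c_pass * p + c_null * q for p, q in zip(a_pass, a_null)]


def response(weights, angle_rad, freq_hz):
    """Array response w^H a(angle) at the given frequency."""
    return inner(weights, steering_vector(angle_rad, freq_hz))


freq = 1000.0
weights = lcmv_weights_identity_cov(steering_vector(0.0, freq),
                                    steering_vector(math.pi, freq))
front = abs(response(weights, 0.0, freq))       # constrained to 1
back = abs(response(weights, math.pi, freq))    # constrained to 0
```

The four methods in the abstract differ in how the matrix that replaces the identity here is built; this sketch does not attempt to reproduce any of them.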

An active free-field equalizer for headphones used in functional magnetic resonance imaging

Daniel Menzel, Hugo Fastl, Thomas Brandt, Thomas Stephan

2011 Volume 32 Issue 6 Pages 251-254
Published: November 01, 2011
Released on J-STAGE: November 01, 2011

DOI: https://doi.org/10.1250/ast.32.251

JOURNAL FREE ACCESS

The free-field response of NordicNeuroLab AudioSystem electrostatic headphones for use in functional magnetic resonance imaging (fMRI) is determined by loudness comparisons with free-field equalized Beyerdynamic DT 48 headphones. Based on these measurements, an active equalizer with resonances at 0.55, 1.5, and 6.9 kHz is developed, realized, and tested. A free-field equivalent level independent of frequency within ±3 dB between 63 Hz and 10 kHz is obtained when using the AudioSystem headphones with the described free-field equalizer.

Preliminary evaluation of underwater sound detection by the cephalopod statocyst using a forced oscillation model

Kenzo Kaifu, Tomonari Akamatsu, Susumu Segawa

2011 Volume 32 Issue 6 Pages 255-260
Published: November 01, 2011
Released on J-STAGE: November 01, 2011

DOI: https://doi.org/10.1250/ast.32.255

JOURNAL FREE ACCESS

To understand the mechanism of the peripheral auditory system of the cephalopod statocyst, the frequency dependence of particle motion sensitivity in cephalopods was estimated using a physical model of the sensory system, which was assumed to behave as a forced oscillator. Reported perception thresholds of Sepia officinalis, Octopus vulgaris, and O. ocellatus fit the model well at low frequencies, whereas at frequencies above 150 Hz, the measured thresholds increased more steeply than the model predicted. These results indicate that the frequency response of the perception threshold of cephalopods to particle motion can be primarily understood using the forced oscillation model, while one or more unknown factors play a role in the higher frequency range. Cephalopods are thought to be sensitive to low-frequency particle motion rather than high-frequency motion. The evolutionary function of cephalopod acoustical perception is not clear; however, the data suggest that they recognize the low-frequency particle motion that may be generated by prey, predators, and conspecifics.
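
For readers unfamiliar with forced oscillation, the sketch below evaluates the textbook steady-state displacement gain of a damped mass-spring system driven by a sinusoidal acceleration. It is a generic model, not the specific formulation or parameter values of the paper: the function name `displacement_gain`, the natural frequency, and the damping ratio are hypothetical placeholders.

```python
import math


def displacement_gain(freq_hz, natural_freq_hz, damping_ratio):
    """Steady-state displacement per unit driving acceleration (s^2).

    Textbook damped forced oscillator:
        gain = 1 / sqrt((w0^2 - w^2)^2 + (2 * zeta * w0 * w)^2)
    """
    w = 2.0 * math.pi * freq_hz
    w0 = 2.0 * math.pi * natural_freq_hz
    return 1.0 / math.sqrt((w0 ** 2 - w ** 2) ** 2
                           + (2.0 * damping_ratio * w0 * w) ** 2)


# Placeholder parameters chosen only for illustration.
NATURAL_FREQ_HZ = 100.0
DAMPING_RATIO = 0.5

for f in (10.0, 50.0, 100.0, 200.0, 400.0):
    gain = displacement_gain(f, NATURAL_FREQ_HZ, DAMPING_RATIO)
    # A smaller gain means a stronger stimulus is needed for the same displacement,
    # i.e. a higher predicted threshold.
    print(f"{f:6.1f} Hz  gain = {gain:.3e}")
```

Well above the natural frequency the gain of this generic model falls off as the inverse square of frequency, which is the kind of predicted threshold increase that measured thresholds can be compared against.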

ACOUSTICAL LETTERS

Pulse compression of linear chirp signals using a quadrature detector

Nur Mariah Khairah binti Abdul Aziz, Hirokazu Kurabayashi, Naohiko Tan ...

2011 Volume 32 Issue 6 Pages 261-263
Published: November 01, 2011
Released on J-STAGE: November 01, 2011

DOI: https://doi.org/10.1250/ast.32.261

JOURNAL FREE ACCESS

Effects of pause duration and speech rate on sentence intelligibility in younger and older adult listeners

Akihiro Tanaka, Shuichi Sakamoto, Yôiti Suzuki

2011 Volume 32 Issue 6 Pages 264-267
Published: November 01, 2011
Released on J-STAGE: November 01, 2011

DOI: https://doi.org/10.1250/ast.32.264

JOURNAL FREE ACCESS

A study on the precedence effect under background sound

Takahiro Fujikawa, Shigeaki Aoki

2011 Volume 32 Issue 6 Pages 268-270
Published: November 01, 2011
Released on J-STAGE: November 01, 2011

DOI: https://doi.org/10.1250/ast.32.268

JOURNAL FREE ACCESS

Detection of the second harmonics of Lamb waves in fatigued magnesium plates

Makoto Fukuda, Kazuhiko Imano, Hideki Yamagishi, Katsuhiro Sasaki

2011 Volume 32 Issue 6 Pages 271-275
Published: November 01, 2011
Released on J-STAGE: November 01, 2011

DOI: https://doi.org/10.1250/ast.32.271

JOURNAL FREE ACCESS
