Acoustical Science and Technology
Online ISSN : 1347-5177
Print ISSN : 1346-3969
ISSN-L : 0369-4232
Volume 40, Issue 6
Displaying 1-8 of 8 articles from this issue
PAPERS
  • Satoshi Okazaki, Makoto Ichikawa, Minoru Tsuzaki
    2019 Volume 40 Issue 6 Pages 367-373
    Published: November 01, 2019
    Released on J-STAGE: November 01, 2019
    JOURNAL FREE ACCESS

    When two tones start with a small onset asynchrony, they may be perceived as starting simultaneously. The range of asynchronies over which this occurs is defined as the perceptual simultaneity range (PSR). Our earlier study found a V-shaped dependence of the PSR on the frequency separation (Δf) of the two tones. In this study, we investigated the PSR as a function of both frequency separation (Δf) and frequency range (F1), measuring it with a simultaneity judgment task. Results demonstrated that the PSR decreases steeply and then increases gradually along the Δf axis, with a breakpoint at approximately 0.5 critical bandwidth (CB) irrespective of F1. The PSR-Δf curves for different F1 were almost coincident for Δf ≲ 0.5 CB. This coincident decrease of the PSR at small Δf supports the notion that cochlear interference affects the perception of simultaneity.
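
    The abstract expresses the breakpoint in critical-bandwidth units. As a hedged illustration of that unit conversion: the paper does not state which CB formula it uses, so the Zwicker-Terhardt approximation below is an assumption, and the function names are hypothetical.

    ```python
    # Sketch of expressing a two-tone frequency separation in critical-bandwidth
    # (CB) units. The Zwicker & Terhardt (1980) approximation used here is an
    # assumption; the paper may use a different CB definition.

    def critical_bandwidth_hz(f_hz: float) -> float:
        """Approximate critical bandwidth (Hz) at centre frequency f_hz."""
        return 25.0 + 75.0 * (1.0 + 1.4 * (f_hz / 1000.0) ** 2) ** 0.69

    def delta_f_in_cb(f1_hz: float, f2_hz: float) -> float:
        """Separation of two tones in CB units, evaluated at their mean frequency."""
        centre = 0.5 * (f1_hz + f2_hz)
        return abs(f2_hz - f1_hz) / critical_bandwidth_hz(centre)
    ```

    Under this approximation the CB at 1 kHz is roughly 160 Hz, so a separation of about 80 Hz between two tones near 1 kHz would sit at the ~0.5 CB breakpoint reported above.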

    Download PDF (770K)
  • Arif Ahmad, Mohammad Reza Selim, Muhammed Zafar Iqbal, Mohammad Shahid ...
    2019 Volume 40 Issue 6 Pages 374-381
    Published: November 01, 2019
    Released on J-STAGE: November 01, 2019
    JOURNAL FREE ACCESS

    This paper proposes an encoder-decoder sequence-to-sequence model for grapheme-to-phoneme (G2P) conversion in Bangla (exonym: Bengali). G2P models are key components of speech recognition and speech synthesis systems, as they describe how words are pronounced. Traditional rule-based models do not perform well in unseen contexts. We propose adopting a neural machine translation (NMT) approach to the G2P problem, building our model from gated recurrent unit (GRU) recurrent neural networks (RNNs). In contrast to joint-sequence G2P models, our encoder-decoder model has the flexibility of not requiring explicit grapheme-to-phoneme alignment, which is not straightforward to perform. We trained the model on a pronunciation dictionary of approximately 135,000 entries and obtained a word error rate (WER) of 12.49%, a significant improvement over existing rule-based and machine-learning-based Bangla G2P models.
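
    The reported WER can be sketched as follows. This is not the authors' code; for G2P, WER is conventionally the fraction of words whose predicted phoneme sequence is not an exact match to the reference, which is what the hypothetical helper below computes. The example phoneme strings are illustrative, not real Bangla output.

    ```python
    # Minimal sketch (not the authors' code) of word error rate (WER) for G2P:
    # a word counts as an error whenever its predicted phoneme sequence
    # differs from the reference sequence.

    def g2p_wer(predicted: list, reference: list) -> float:
        """Fraction of words whose predicted phoneme sequence is not an exact match."""
        assert len(predicted) == len(reference)
        errors = sum(1 for p, r in zip(predicted, reference) if p != r)
        return errors / len(reference)

    # Hypothetical example: two words, one pronounced correctly, one not.
    ref = [["k", "a", "t", "a"], ["b", "o", "i"]]
    hyp = [["k", "a", "t", "a"], ["b", "o", "e"]]
    print(g2p_wer(hyp, ref))  # → 0.5
    ```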

    Download PDF (2052K)
TECHNICAL REPORTS
  • Pavlo Bazilinskyy, Pontus Larsson, Emma Johansson, Joost C. F. de Wint ...
    2019 Volume 40 Issue 6 Pages 382-390
    Published: November 01, 2019
    Released on J-STAGE: November 01, 2019
    JOURNAL FREE ACCESS

    The number of trucks that are equipped with driver assistance systems is increasing. These driver assistance systems typically offer binary auditory warnings or notifications upon lane departure, close headway, or automation (de)activation. Such binary sounds may annoy the driver if presented frequently. Truck drivers are well accustomed to the sound of the engine and wind in the cabin. Based on the premise that continuous sounds are more natural than binary warnings, we propose continuous auditory feedback on the status of adaptive cruise control, lane offset, and headway, which blends with the engine and wind sounds that are already present in the cabin. An on-road study with 23 truck drivers was performed, where participants were presented with the additional sounds in isolation from each other and in combination. Results showed that the sounds were easy to understand and that the lane-offset sound was regarded as somewhat useful. Systems with feedback on the status of adaptive cruise control and headway were seen as not useful. Participants overall preferred a silent cabin and expressed displeasure with the idea of being presented with extra sounds on a continuous basis. Suggestions are provided for designing less intrusive continuous auditory feedback.

    Download PDF (1232K)
  • Makoto Morinaga, Junichi Mori, Takanori Matsui, Yasuaki Kawase, Kazuyu ...
    2019 Volume 40 Issue 6 Pages 391-398
    Published: November 01, 2019
    Released on J-STAGE: November 01, 2019
    JOURNAL FREE ACCESS

    We have been developing an aircraft model identification system that uses a convolutional neural network (CNN). The assumption is that this identification system would be used to estimate the number of flights when creating noise maps. In our previous study, we used the CNN model to classify five aircraft (three rotorcraft, one turboprop, and one jet aircraft), and the accuracy reached 99%. In the present study, to examine whether this method is also effective for identifying jet-aircraft sound sources, we conducted two case studies using the frequency characteristics of aircraft noise obtained from field measurements around Osaka International Airport and Narita International Airport. Targeting 7 and 18 types of sound source at Osaka and Narita, respectively, an identification rate of 98% was obtained in both cases. This suggests that the present system can estimate the number of jet aircraft flights for each engine type or each aircraft model with very high accuracy.
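
    The core idea, classifying a source from its measured frequency characteristics, can be illustrated with a toy example. This is not the paper's CNN: a nearest-centroid classifier over band levels stands in for it here, and the class names and band values are made up for illustration.

    ```python
    # Toy stand-in (NOT the paper's CNN) for classifying an aircraft sound
    # source from its band spectrum: pick the class whose mean spectrum is
    # closest in Euclidean distance.

    def classify(spectrum: list, centroids: dict) -> str:
        """Return the label whose mean band spectrum is nearest to `spectrum`."""
        def dist(a, b):
            return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
        return min(centroids, key=lambda label: dist(spectrum, centroids[label]))

    # Hypothetical per-class mean levels (dB) in four frequency bands.
    centroids = {
        "turbofan_A": [70.0, 68.0, 60.0, 50.0],
        "turbofan_B": [65.0, 70.0, 66.0, 55.0],
    }
    print(classify([69.0, 67.5, 61.0, 51.0], centroids))  # → turbofan_A
    ```

    A CNN replaces the fixed distance-to-centroid rule with learned filters over the spectrum, which is what lets the actual system separate 18 source types at 98% accuracy.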

    Download PDF (337K)
ACOUSTICAL LETTERS