Acoustical Science and Technology
Online ISSN : 1347-5177
Print ISSN : 1346-3969
ISSN-L : 0369-4232
Volume 40, Issue 6
Displaying 1-8 of 8 articles from this issue
PAPERS
  • Satoshi Okazaki, Makoto Ichikawa, Minoru Tsuzaki
    2019 Volume 40 Issue 6 Pages 367-373
    Published: November 01, 2019
    Released on J-STAGE: November 01, 2019
    JOURNAL FREE ACCESS

    When two tones start with a small onset asynchrony, they may be perceived as starting simultaneously. The range of asynchronies over which this occurs is defined as the perceptual simultaneity range (PSR). Our earlier study found a V-shaped dependence of the PSR on the frequency separation (Δf) of the two tones. In this study, we investigated the PSR as a function of both frequency separation (Δf) and frequency range (F1), measuring it with a simultaneity judgment task. Results demonstrated that the PSR decreases steeply and then increases gradually along the Δf axis, with a breakpoint at approximately 0.5 critical bandwidth (CB) irrespective of F1. The PSR-Δf curves for different F1 were almost coincident for Δf ≲ 0.5 CB. This coincident decrease of the PSR at small Δf supports the notion that cochlear interference affects the perception of simultaneity.
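
    The abstract expresses the breakpoint in critical-bandwidth units. As a hedged illustration of that unit conversion: the paper does not state which CB formula it uses, so the Zwicker-Terhardt approximation below is an assumption, and the function names are hypothetical.

    ```python
    # Sketch of expressing a two-tone frequency separation in critical-bandwidth
    # (CB) units. The Zwicker & Terhardt (1980) approximation used here is an
    # assumption; the paper may use a different CB definition.

    def critical_bandwidth_hz(f_hz: float) -> float:
        """Approximate critical bandwidth (Hz) at centre frequency f_hz."""
        return 25.0 + 75.0 * (1.0 + 1.4 * (f_hz / 1000.0) ** 2) ** 0.69

    def delta_f_in_cb(f1_hz: float, f2_hz: float) -> float:
        """Separation of two tones in CB units, evaluated at their mean frequency."""
        centre = 0.5 * (f1_hz + f2_hz)
        return abs(f2_hz - f1_hz) / critical_bandwidth_hz(centre)
    ```

    Under this approximation the CB at 1 kHz is roughly 160 Hz, so a separation of about 80 Hz between two tones near 1 kHz would sit at the ~0.5 CB breakpoint reported above.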

    Download PDF (770K)
  • Arif Ahmad, Mohammad Reza Selim, Muhammed Zafar Iqbal, Mohammad Shahid ...
    2019 Volume 40 Issue 6 Pages 374-381
    Published: November 01, 2019
    Released on J-STAGE: November 01, 2019
    JOURNAL FREE ACCESS

    This paper proposes an encoder-decoder sequence-to-sequence model for grapheme-to-phoneme (G2P) conversion in Bangla (exonym: Bengali). G2P models are key components of speech recognition and speech synthesis systems, as they describe how words are pronounced. Traditional rule-based models do not perform well in unseen contexts. We propose adopting a neural machine translation (NMT) approach to the G2P problem, building our model from gated recurrent unit (GRU) recurrent neural networks (RNNs). In contrast to joint-sequence G2P models, our encoder-decoder model has the flexibility of not requiring explicit grapheme-to-phoneme alignment, which is not straightforward to perform. We trained the model on a pronunciation dictionary of approximately 135,000 entries and obtained a word error rate (WER) of 12.49%, a significant improvement over existing rule-based and machine-learning-based Bangla G2P models.
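
    The reported WER can be sketched as follows. This is not the authors' code; for G2P, WER is conventionally the fraction of words whose predicted phoneme sequence is not an exact match to the reference, which is what the hypothetical helper below computes. The example phoneme strings are illustrative, not real Bangla output.

    ```python
    # Minimal sketch (not the authors' code) of word error rate (WER) for G2P:
    # a word counts as an error whenever its predicted phoneme sequence
    # differs from the reference sequence.

    def g2p_wer(predicted: list, reference: list) -> float:
        """Fraction of words whose predicted phoneme sequence is not an exact match."""
        assert len(predicted) == len(reference)
        errors = sum(1 for p, r in zip(predicted, reference) if p != r)
        return errors / len(reference)

    # Hypothetical example: two words, one pronounced correctly, one not.
    ref = [["k", "a", "t", "a"], ["b", "o", "i"]]
    hyp = [["k", "a", "t", "a"], ["b", "o", "e"]]
    print(g2p_wer(hyp, ref))  # → 0.5
    ```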

    Download PDF (2052K)
TECHNICAL REPORTS
  • Pavlo Bazilinskyy, Pontus Larsson, Emma Johansson, Joost C. F. de Wint ...
    2019 Volume 40 Issue 6 Pages 382-390
    Published: November 01, 2019
    Released on J-STAGE: November 01, 2019
    JOURNAL FREE ACCESS

    The number of trucks that are equipped with driver assistance systems is increasing. These driver assistance systems typically offer binary auditory warnings or notifications upon lane departure, close headway, or automation (de)activation. Such binary sounds may annoy the driver if presented frequently. Truck drivers are well accustomed to the sound of the engine and wind in the cabin. Based on the premise that continuous sounds are more natural than binary warnings, we propose continuous auditory feedback on the status of adaptive cruise control, lane offset, and headway, which blends with the engine and wind sounds that are already present in the cabin. An on-road study with 23 truck drivers was performed, where participants were presented with the additional sounds in isolation from each other and in combination. Results showed that the sounds were easy to understand and that the lane-offset sound was regarded as somewhat useful. Systems with feedback on the status of adaptive cruise control and headway were seen as not useful. Participants overall preferred a silent cabin and expressed displeasure with the idea of being presented with extra sounds on a continuous basis. Suggestions are provided for designing less intrusive continuous auditory feedback.

    Download PDF (1232K)
  • Makoto Morinaga, Junichi Mori, Takanori Matsui, Yasuaki Kawase, Kazuyu ...
    2019 Volume 40 Issue 6 Pages 391-398
    Published: November 01, 2019
    Released on J-STAGE: November 01, 2019
    JOURNAL FREE ACCESS

    We have been developing an aircraft model identification system that uses a convolutional neural network (CNN). The assumption is that this identification system would be used to estimate the number of flights when creating noise maps. In our previous study, we used the CNN model to classify five aircraft (three rotorcraft, one turboprop, and one jet aircraft), and the accuracy reached 99%. In the present study, to examine whether this method is also effective for identifying jet-aircraft sound sources, we conducted two case studies using the frequency characteristics of aircraft noise obtained from field measurements around Osaka International Airport and Narita International Airport. Targeting 7 and 18 types of sound source at Osaka and Narita, respectively, an identification rate of 98% was obtained in both cases. This suggests that the present system can estimate the number of jet aircraft flights for each engine type or each aircraft model with very high accuracy.
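
    The core idea, classifying a source from its measured frequency characteristics, can be illustrated with a toy example. This is not the paper's CNN: a nearest-centroid classifier over band levels stands in for it here, and the class names and band values are made up for illustration.

    ```python
    # Toy stand-in (NOT the paper's CNN) for classifying an aircraft sound
    # source from its band spectrum: pick the class whose mean spectrum is
    # closest in Euclidean distance.

    def classify(spectrum: list, centroids: dict) -> str:
        """Return the label whose mean band spectrum is nearest to `spectrum`."""
        def dist(a, b):
            return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
        return min(centroids, key=lambda label: dist(spectrum, centroids[label]))

    # Hypothetical per-class mean levels (dB) in four frequency bands.
    centroids = {
        "turbofan_A": [70.0, 68.0, 60.0, 50.0],
        "turbofan_B": [65.0, 70.0, 66.0, 55.0],
    }
    print(classify([69.0, 67.5, 61.0, 51.0], centroids))  # → turbofan_A
    ```

    A CNN replaces the fixed distance-to-centroid rule with learned filters over the spectrum, which is what lets the actual system separate 18 source types at 98% accuracy.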

    Download PDF (337K)
ACOUSTICAL LETTERS