Acoustical Science and Technology

PAPERS

Probabilistic spectral envelope modeling of musical instruments within the non-negative matrix factorization framework for mixed music analysis

Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki

2014, Volume 35, Issue 4, Pages 181-191
Published: April 01, 2014
Released on J-STAGE: July 01, 2014

DOI: https://doi.org/10.1250/ast.35.181

JOURNAL FREE ACCESS

Non-negative matrix factorization (NMF) has been one of the most useful techniques for musical signal analysis in recent years. In particular, supervised NMF, in which a large number of instrumental samples are used for the analysis, is garnering much attention with respect to analytical accuracy and speed. The accuracy, however, deteriorates if the system does not have enough samples. Therefore, in principle, such methods require as many samples as possible in order for the analysis to be accurate. In this paper, we propose an analysis method that 1) does not require the collection of a large number of training samples, and 2) combines the NMF and probabilistic approaches. In this approach, it is assumed that each instrumental category has a model-invariant feature, called a probabilistic spectral envelope (PSE). As an extension of a spectral envelope, this feature represents the probabilities of spectral envelopes belonging to the instrumental category in a two-dimensional (frequency-amplitude) space. The analysis of an input musical signal is carried out using a supervised NMF framework, where the basis matrix contains the optimum spectra that have been generated from pretrained PSEs.
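
The supervised NMF step described in this abstract, where the basis matrix is fixed in advance and only the activations are estimated, can be sketched as follows. This is a minimal illustration, not the authors' method: it assumes a Euclidean cost with the standard multiplicative update, and the basis `W` here is random toy data standing in for spectra generated from pretrained PSEs. The function name and all parameters are hypothetical.

```python
import numpy as np

def supervised_nmf_activations(V, W, n_iter=200, eps=1e-12, seed=0):
    """Estimate non-negative activations H so that V ~ W @ H, with W held fixed.

    V: (n_freq, n_frames) non-negative magnitude spectrogram.
    W: (n_freq, n_bases) non-negative basis spectra (assumed given).
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        # Multiplicative update for the Euclidean cost; H stays non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

# Toy data (assumption): 3 basis spectra, 64 frequency bins, 20 frames.
rng = np.random.default_rng(1)
W = rng.random((64, 3))
H_true = rng.random((3, 20))
V = W @ H_true

H_est = supervised_nmf_activations(V, W)
rel_err = np.linalg.norm(V - W @ H_est) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Because only `H` is updated, the quality of the result depends entirely on how well the fixed basis matches the instruments in the mixture, which is the dependence on training samples that the abstract highlights.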

Influence of flattening of head-related transfer functions in low-frequency region on sound localization

Kanji Watanabe, Ryosuke Kodama, Sojun Sato, Shouichi Takane, Koji Abe

2014, Volume 35, Issue 4, Pages 192-200
Published: April 01, 2014
Released on J-STAGE: July 01, 2014

DOI: https://doi.org/10.1250/ast.35.192

JOURNAL FREE ACCESS

It is clear that applications such as virtual auditory displays can be achieved by synthesizing head-related transfer functions with high accuracy. However, in practice, their detailed spectral shapes over the entire frequency range are not likely to be essential to sound localization. For example, low-frequency sound has a tendency to diffract around the head and torso of the listener, which leads to low-frequency characteristics of head-related transfer functions (HRTFs) being largely independent of the position of the sound source. This may imply that HRTFs involve few localization cues in the low-frequency region. We have sought to clarify whether removing spectral cues from HRTFs affects horizontal localization [K. Watanabe et al., Acoust. Sci. & Tech., 32(3), 121–124 (2011)]. In this paper, the low-frequency characteristics of the HRTFs of both ears were flattened below a certain frequency, termed the "boundary frequency," to investigate the influence of the low-frequency component of HRTFs. These flattening procedures were simultaneously applied to the HRTFs of both ears while the interaural level and time differences of the original HRTFs were retained. A localization test using such partially flattened HRTFs was carried out, and the results showed that the flattening of the HRTFs did not significantly affect sound localization at boundary frequencies of 0.5–2 kHz, except for the source direction of 60°. Those boundary frequencies differed depending on source direction. The low-frequency region below these boundary frequencies may be ignored, and some data reduction can be applied here without significant influence on sound localization.
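
One way to picture the flattening below a boundary frequency is the sketch below. The abstract does not specify the exact procedure, so several choices here are assumptions: the flat level is taken as the RMS magnitude of each ear's response over the flattened band, the original phase is kept unchanged, and the impulse responses are synthetic toy signals. The function name `flatten_below` is hypothetical.

```python
import numpy as np

def flatten_below(hrir, fs, boundary_hz):
    """Flatten the magnitude response below boundary_hz, keeping the original phase."""
    H = np.fft.rfft(hrir)
    freqs = np.fft.rfftfreq(len(hrir), d=1.0 / fs)
    low = freqs < boundary_hz
    mag = np.abs(H)
    # Assumption: the flat level is the RMS magnitude over the flattened band.
    flat_level = np.sqrt(np.mean(mag[low] ** 2))
    mag_flat = mag.copy()
    mag_flat[low] = flat_level
    # Recombine the modified magnitude with the untouched phase.
    H_flat = mag_flat * np.exp(1j * np.angle(H))
    return np.fft.irfft(H_flat, n=len(hrir))

# Synthetic stand-ins for measured head-related impulse responses (assumption).
fs, n, boundary_hz = 48000, 256, 1000.0
rng = np.random.default_rng(0)
left = rng.normal(size=n) * np.exp(-np.arange(n) / 20.0)
right = 0.5 * np.roll(left, 8)  # crude level and delay difference

left_flat = flatten_below(left, fs, boundary_hz)
right_flat = flatten_below(right, fs, boundary_hz)

low = np.fft.rfftfreq(n, d=1.0 / fs) < boundary_hz
print("flattened bins per ear:", int(low.sum()))
```

Applying the same operation to each ear separately, with the phase left alone, is one simple way to leave the time difference between the ears untouched; whether it matches the level handling used in the paper is not stated in the abstract.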

Iterative method to estimate muscle activation with a physiological articulatory model

Xiyu Wu, Jianwu Dang, Ian Stavness

2014, Volume 35, Issue 4, Pages 201-212
Published: April 01, 2014
Released on J-STAGE: July 01, 2014

DOI: https://doi.org/10.1250/ast.35.201

JOURNAL FREE ACCESS

Computational modeling of the speech organs is able to improve our understanding of human speech motor control. In order to investigate muscle activation in speech motor control, we have developed an automatic estimation method based on a 3D physiological articulatory model. In this method, the articulatory target was defined by the entire posture of the tongue and jaw in the midsagittal plane, which was reduced to a six-dimensional space by principal component analysis (PCA). In the PCA space, the distance between an articulatory target and the model was gradually minimized by automatically adjusting muscle activations. The adjustment of muscle activations was guided by a dynamic PCA workspace that was used to predict individual muscle functions in a given position. This dynamic PCA workspace was estimated on the basis of an interpolation of eight reference PCA workspaces. The proposed method was assessed by estimating muscle activations for five Japanese vowel postures that were extracted from magnetic resonance images. The results showed that the proposed method can generate muscle activation patterns that can control the model to realize given articulatory targets. In addition, the estimated muscle activation patterns were consistent with anatomical knowledge and previously reported measurement data.
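
The dimensionality reduction and distance measure mentioned in this abstract can be illustrated with a small sketch. It covers only the projection of posture vectors into a six-dimensional PCA space and the distance between a target and a current posture; the physiological model and the muscle-activation adjustment are not reproduced. The posture data are random placeholders, and the function names and array sizes are hypothetical.

```python
import numpy as np

def fit_pca(X, n_components=6):
    """Return the mean and the first n_components principal axes of X (rows are samples)."""
    mean = X.mean(axis=0)
    # Rows of Vt are orthonormal principal axes, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def project(x, mean, components):
    """Project a posture vector into the reduced PCA space."""
    return (x - mean) @ components.T

# Placeholder data (assumption): 200 postures, each 20 midsagittal (x, y) points.
rng = np.random.default_rng(0)
postures = rng.normal(size=(200, 40))

mean, comps = fit_pca(postures, n_components=6)
target = project(postures[0], mean, comps)
current = project(postures[1], mean, comps)

# Euclidean distance in PCA space: the quantity an iterative scheme would try to reduce.
distance = np.linalg.norm(target - current)
print(f"distance in PCA space: {distance:.3f}")
```

In an estimation loop of the kind the abstract describes, this distance would be recomputed after each adjustment of the muscle activations and used to decide whether the model posture has moved closer to the target.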

TECHNICAL REPORT

Experimental study on hearing thresholds for low-frequency pure tones

Shinichi Sakamoto, Sakae Yokoyama, Hiroo Yano, Hideki Tachibana

2014, Volume 35, Issue 4, Pages 213-218
Published: April 01, 2014
Released on J-STAGE: July 01, 2014

DOI: https://doi.org/10.1250/ast.35.213

JOURNAL FREE ACCESS

To investigate the effect of wind turbine noise, a three-year study project has been conducted from fiscal year 2010 under the sponsorship of the Ministry of the Environment, Japan. One of the key aims of this study was to examine the effects of low-frequency components contained in wind turbine noise, and a series of auditory experiments has been conducted using an experimental facility composed of two adjacent rooms, where low-frequency sounds down to the infrasound frequency range could be produced. As the first experiment using this facility, human hearing thresholds for pure tones were examined in the frequency range from 10 Hz to 200 Hz with 97 participants in a wide age range from 20 to 60 years.

SHORT NOTE

Reduction of ultrasound transmissivity through a bone-phantom plate resulting from shear wave excitation

Osamu Saito

2014, Volume 35, Issue 4, Pages 219-222
Published: April 01, 2014
Released on J-STAGE: July 01, 2014

DOI: https://doi.org/10.1250/ast.35.219

JOURNAL FREE ACCESS
