We present two methods for estimating the cross-sectional area function of the vocal tract from formant frequencies. Both extend the work of Story (J. Acoust. Soc. Am., 119, 715–718, 2006), which is based on a sensitivity function representing the change in a formant frequency due to a perturbation of the cross-sectional area. In Method I, the area function is estimated through an iterative procedure that uses the sensitivity functions as basis functions to adjust the area function until it produces the target formant frequencies. In Method II, the area function is expanded linearly in terms of mode functions, and the estimation is performed by optimizing the value of each mode coefficient, with the sensitivity function used as a constraint in the optimization. A distinctive feature of both methods is that the summing weights of the sensitivity functions in Method I and of the mode functions in Method II are determined by minimizing an objective function representing the frequency error of every formant. Using existing area function data for English vowels, we compare the two methods in terms of estimation accuracy and convergence speed. The results show that our methods effectively reduce the degrees of freedom of the area function and quickly obtain the optimal solution with fair accuracy.
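As a rough illustration of the quantities described above, the relations can be sketched as follows. The first expression is the usual first-order reading of a sensitivity function: a small relative change in the area of section $i$ shifts formant $n$ in proportion to $S_n(i)$. The multiplicative update and the exact form of the objective $E$ are assumptions made here for illustration, as are the symbols $w_n$, $N$, and $k$; the paper's own formulation may differ.

```latex
% First-order effect of an area perturbation on formant n (N tube sections)
\frac{\Delta F_n}{F_n} \approx \sum_{i=1}^{N} S_n(i)\,\frac{\Delta A(i)}{A(i)}

% Assumed Method I style update: sensitivity functions as basis functions,
% combined with summing weights w_n at iteration k
A^{(k+1)}(i) = A^{(k)}(i)\left[\,1 + \sum_{n} w_n\, S_n(i)\right]

% Assumed objective: squared relative frequency error summed over all formants
E(\mathbf{w}) = \sum_{n} \left(\frac{F_n^{\mathrm{target}} - F_n(\mathbf{w})}{F_n^{\mathrm{target}}}\right)^{2}
```

Under this reading, each iteration searches only over the few weights $w_n$ (one per formant) rather than over all $N$ section areas, which is one way to picture the reduction in degrees of freedom.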
In this study, we propose a method of classifying speech under stress using parameters extracted from a physical model that characterizes the behavior of the vocal folds. Many conventional methods have been proposed, but their feature parameters are extracted directly from the waveform or spectrum of the input speech. Parameters derived from a physical model can characterize stressed speech more precisely because they represent physical characteristics of the vocal folds. We therefore fit a two-mass model to real speech in order to estimate physical parameters that represent muscle tension in the vocal folds, viscous loss in the vocal folds, and subglottal pressure from the lungs. Furthermore, combinations of these physical parameters are proposed as features effective for classifying speech as either neutral or stressed. Experimental results show that the proposed features achieve better classification performance than conventional methods.
The simulation of acoustic streaming between a bending transducer and a reflector is discussed. Instead of a full fluid analysis, the streaming is calculated from second-order approximations of the acoustic streaming force and the static pressure that originate from the nonlinear sound field. The sound field and the fluid dynamics are simulated separately using finite-element harmonic and static analyses, respectively. The validity of the simulation method is verified through two examples: streaming excited between a disk vibrator and a reflector, and streaming in an ultrasonic air pump. Comparison of the calculated and measured distributions of sound pressure and streaming shows that the present method simulates the streaming in the air layer well.
Finite-Difference Time-Domain (FDTD) models are used to predict low-frequency sound fields in small volumes containing a limp panel formed from a porous material which partially or completely subdivides the volume. This porous panel is incorporated into FDTD using a Rayleigh model as proposed by Suzuki et al. However, to accurately reproduce the low-frequency sound field it is found necessary to introduce an additional Moving Frame Model (MFM) to account for motion of the porous panel. For spaces that are completely subdivided by a porous panel, the MFM accounts for a spring-mass-spring resonance that can occur below the lowest acoustic cavity mode. The MFM assumes lumped mass behavior of the porous panel which is coupled to the FDTD update equations that incorporate the Rayleigh model. FDTD is compared against measurements using transient excitation with a pulse input to a loudspeaker in a small reverberant room under three different conditions: (1) empty room, (2) with a mineral fibre panel partially dividing the room, and (3) with a mineral fibre panel completely dividing the room. Close agreement is obtained between experimental results and FDTD incorporating the MFM; this validates the models as well as implementation of the loudspeaker as a hard velocity source.
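The spring-mass-spring resonance mentioned above can be estimated with a simple lumped-parameter calculation. The sketch below is not the Moving Frame Model itself. It assumes that the panel moves as a rigid piston spanning the full cross-section, that the air on each side is compressed adiabatically, and that airflow through the porous panel and all damping are ignored. The function names, parameter names, and example values are hypothetical.

```python
import math


def air_spring_stiffness(volume_m3, panel_area_m2, rho0=1.21, c0=343.0):
    """Stiffness (N/m) of an enclosed air volume acting on a rigid piston.

    Assumes adiabatic compression: k = rho0 * c0^2 * S^2 / V.
    rho0 (kg/m^3) and c0 (m/s) are nominal values for air at room temperature.
    """
    return rho0 * c0 ** 2 * panel_area_m2 ** 2 / volume_m3


def spring_mass_spring_frequency(volume1_m3, volume2_m3, panel_area_m2,
                                 surface_density_kg_m2, rho0=1.21, c0=343.0):
    """Resonance frequency (Hz) of a lumped panel mass between two air springs."""
    k1 = air_spring_stiffness(volume1_m3, panel_area_m2, rho0, c0)
    k2 = air_spring_stiffness(volume2_m3, panel_area_m2, rho0, c0)
    # Both air springs act on the same mass, so their stiffnesses add.
    mass_kg = surface_density_kg_m2 * panel_area_m2
    return math.sqrt((k1 + k2) / mass_kg) / (2.0 * math.pi)


if __name__ == "__main__":
    # Illustrative values only, not taken from the paper:
    # two 10 m^3 sub-volumes, a 6 m^2 panel, 4 kg/m^2 surface density.
    f_res = spring_mass_spring_frequency(10.0, 10.0, 6.0, 4.0)
    print(f"Estimated spring-mass-spring resonance: {f_res:.1f} Hz")
```

Comparing this estimate with the lowest cavity mode of the undivided room gives a quick indication of whether panel motion is likely to matter in the frequency range of interest.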