We propose a speech recognition technique using multiple model structures. When context-dependent models are used, decision-tree-based context clustering is applied to find an appropriate parameter-tying structure. However, context clustering is usually performed on the basis of unreliable statistics of hidden Markov model (HMM) state sequences, because estimating reliable state sequences requires an appropriate model structure, which cannot be obtained prior to context clustering. Therefore, context clustering and the estimation of state sequences essentially cannot be performed independently. To overcome this problem, we propose a technique for optimizing state sequences based on an annealing process using multiple decision trees. In this technique, a new likelihood function is defined in order to treat multiple model structures, and the deterministic annealing expectation maximization (DAEM) algorithm is used as the training algorithm. Continuous phoneme recognition experiments show that the proposed method, using only two decision trees, achieved a relative error reduction of about 11.1% over the conventional method.
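As a rough illustration of the annealing idea only, the sketch below applies deterministic annealing EM to a toy one-dimensional Gaussian mixture rather than to HMMs with multiple decision trees. The function name `daem_means`, the temperature schedule `betas`, the fixed variance `sigma2`, and the small perturbation added at each temperature are all assumptions made for this example and are not taken from the abstract. The posterior is computed from the likelihood raised to the power beta, and beta is raised step by step to 1, at which point the update is ordinary EM.

```python
import numpy as np

def daem_means(x, k=2, sigma2=1.0, betas=(0.05, 0.1, 0.2, 0.5, 1.0),
               iters=50, seed=0):
    """Toy DAEM: estimate k component means of a 1-D mixture.

    Assumptions (hypothetical, for illustration): equal mixture weights,
    a known shared variance sigma2, and a hand-picked schedule `betas`.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    mu = np.full(k, x.mean())  # start from a single collapsed solution
    for beta in betas:
        # Tiny perturbation so that identical means can separate
        # once the temperature is low enough.
        mu = mu + 1e-3 * rng.standard_normal(k)
        for _ in range(iters):
            # E-step with tempered posterior: q(k|x) proportional to p(x|k)^beta
            log_p = -0.5 * (x[:, None] - mu[None, :]) ** 2 / sigma2
            t = beta * log_p
            t -= t.max(axis=1, keepdims=True)  # numerical stability
            r = np.exp(t)
            r /= r.sum(axis=1, keepdims=True)
            # M-step: responsibility-weighted means
            mu = (r * x[:, None]).sum(axis=0) / r.sum(axis=0)
    return np.sort(mu)
```

At small beta the posterior is nearly uniform, so the estimate depends little on the initial values; this is the property that motivates annealing when the initial state sequences are unreliable.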
A compact end-fire loudspeaker array based on the linearly constrained minimum variance (LCMV) beamformer has been investigated in various reverberant rooms. In the LCMV beamformer, the coherent matrix, which determines the directions in which radiated sound is suppressed, is important. Therefore, we evaluated the directivities of a loudspeaker array for four different coherent matrices, each calculated so that the array does not radiate sound (I) in any direction except the reproduction direction, (II) in the directions of several measured room transfer functions, (III) in the directions of several ideal (free-field) room transfer functions, and (IV) in the directions of the weighted ideal room transfer functions. To evaluate performance, a 4-element line array of 28-mm-diameter loudspeakers with equal spacing of 48 mm was implemented. The array used loudspeaker units without enclosures to shorten the spacing between the units. The directivities of the prototype loudspeaker array were measured in an anechoic room and three reverberant rooms using the above four methods. For all methods, the prototype array achieved a directivity 6 to 15 dB higher than that of a single loudspeaker over a wide frequency band (300 Hz to 3.4 kHz). Moreover, the experimental results clarified that the array with method (IV), which did not use previously measured room transfer functions, performed effectively compared with the array with method (II), especially when the reverberation time was 120 ms or less.
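The standard LCMV solution can be sketched as follows for a 4-element line array with 48 mm spacing. This is a minimal free-field sketch, not the authors' implementation: the plane-wave steering model, the speed of sound, the set of suppression angles, the diagonal loading value, and all function names are assumptions introduced here. The matrix `R` stands in for the matrix that sets the suppression directions, loosely in the spirit of method (I), and the weights minimize w^H R w subject to C^H w = f.

```python
import numpy as np

C_SOUND = 343.0  # assumed speed of sound in m/s

def steering(freq_hz, theta_rad, n=4, spacing_m=0.048):
    """Free-field plane-wave steering vector for a uniform line array.
    theta = 0 is the end-fire (reproduction) direction."""
    pos = np.arange(n) * spacing_m
    k = 2.0 * np.pi * freq_hz / C_SOUND
    return np.exp(-1j * k * pos * np.cos(theta_rad))

def suppression_matrix(freq_hz, thetas_rad, loading=1e-2):
    """Average outer product of steering vectors over the directions to be
    suppressed, plus diagonal loading (hypothetical value) for robustness."""
    A = np.stack([steering(freq_hz, t) for t in thetas_rad], axis=1)
    return A @ A.conj().T / A.shape[1] + loading * np.eye(A.shape[0])

def lcmv_weights(R, C, f):
    """w = R^{-1} C (C^H R^{-1} C)^{-1} f"""
    Rinv_C = np.linalg.solve(R, C)
    return Rinv_C @ np.linalg.solve(C.conj().T @ Rinv_C, f)

def endfire_weights(freq_hz, look_rad=0.0):
    # Suppress 30..180 degrees, keep unit response toward look_rad.
    thetas = np.deg2rad(np.arange(30, 181, 10))
    R = suppression_matrix(freq_hz, thetas)
    C = steering(freq_hz, look_rad)[:, None]
    f = np.array([1.0 + 0.0j])
    return lcmv_weights(R, C, f)
```

With this convention the response toward an angle theta is `steering(f, theta).conj() @ w`. Replacing the ideal steering vectors in `suppression_matrix` with measured room transfer functions would correspond, under the same assumptions, to a variant in the spirit of method (II).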
The free-field response of NordicNeuroLab AudioSystem electrostatic headphones for use in functional magnetic resonance imaging (fMRI) is determined by loudness comparisons with free-field equalized Beyerdynamic DT 48 headphones. Based on these measurements, an active equalizer with resonances at 0.55, 1.5, and 6.9 kHz is developed, realized, and tested. A free-field equivalent level independent of frequency within ±3 dB between 63 Hz and 10 kHz is obtained when using the AudioSystem headphones with the described free-field equalizer.
To understand the mechanism of the peripheral auditory system of the cephalopod statocyst, the frequency dependence of particle motion sensitivity in cephalopods was estimated using a physical model of the sensory system, which was assumed to behave as a forced oscillator. Reported perception thresholds of Sepia officinalis, Octopus vulgaris, and O. ocellatus fit the model well at low frequencies, whereas at frequencies above 150 Hz, the empirically measured thresholds increased more steeply than the model predicted. These results indicate that the frequency response of the perception threshold of cephalopods to particle motion can be primarily understood using the forced oscillation model, while one or more unknown factors play a role in the higher frequency range. Cephalopods are thought to be sensitive to low-frequency particle motion rather than high-frequency motion. The evolutionary function of cephalopod acoustical perception is not clear; however, the data suggest that they recognize the low-frequency particle motion that may be generated by prey, predators, and conspecifics.
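A generic forced (driven, damped) oscillator gives a feel for how such a model predicts a threshold curve. The sketch below is a textbook second-order system, not the specific model of the study: the function names, the resonance frequency `f0_hz`, the damping ratio `zeta`, and the minimum detectable displacement `x_min` are all hypothetical. It assumes that the receptor responds to the displacement of a mass relative to its surroundings and that the stimulus is expressed as particle acceleration.

```python
import numpy as np

def relative_displacement_per_accel(freq_hz, f0_hz, zeta):
    """Steady-state relative displacement amplitude per unit driving
    acceleration for a damped oscillator with natural frequency f0_hz
    and damping ratio zeta (both hypothetical parameters)."""
    w = 2.0 * np.pi * np.asarray(freq_hz, dtype=float)
    w0 = 2.0 * np.pi * f0_hz
    return 1.0 / np.sqrt((w0**2 - w**2) ** 2 + (2.0 * zeta * w0 * w) ** 2)

def predicted_threshold(freq_hz, f0_hz, zeta, x_min):
    """Acceleration needed to reach a minimum detectable relative
    displacement x_min (hypothetical sensory limit)."""
    return x_min / relative_displacement_per_accel(freq_hz, f0_hz, zeta)
```

Under these assumptions the predicted acceleration threshold is flat well below `f0_hz` and rises in proportion to the square of frequency well above it, so a measured threshold that rises faster than this at high frequencies would point to a factor outside the simple oscillator.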