The barn owl is a nocturnal predator with an excellent sound localization ability. Owing to the bird's asymmetric ears, the interaural time and level differences provide information about the horizontal and vertical directions of a sound source, respectively. Forty years of behavioral, anatomical, and physiological research on the owl's auditory system have revealed that these two acoustic cues are computed in parallel, hierarchical neural pathways, which converge at the midbrain to form an auditory space map. This neural representation of the acoustic world, calibrated against the visual system, underlies the highly precise sound localization behavior of the barn owl.
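The interaural-time-difference cue can be illustrated with a minimal cross-correlation sketch (in the spirit of the classic Jeffress coincidence model, not the owl's actual neural circuitry); the sampling rate, burst length, and delay below are arbitrary illustrative values.

```python
import numpy as np

def itd_seconds(left, right, fs):
    """Delay of the right-ear signal relative to the left, via the cross-correlation peak."""
    corr = np.correlate(right, left, mode="full")
    lag = int(np.argmax(corr)) - (len(left) - 1)
    return lag / fs

fs = 100_000                        # assumed sampling rate (Hz)
rng = np.random.default_rng(0)
burst = rng.standard_normal(1000)   # broadband noise burst
delay = 5                           # right ear lags by 5 samples (50 microseconds)
left = np.concatenate([burst, np.zeros(delay)])
right = np.concatenate([np.zeros(delay), burst])
itd = itd_seconds(left, right, fs)
```

With these toy signals the estimator recovers the imposed 50 µs lag; a broadband stimulus is used because a pure tone would make the correlation peak ambiguous.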
When a piano string is struck by a hammer, it begins to vibrate vertically, i.e., in a plane perpendicular to the soundboard. After the vertical vibration has begun, the string begins to rotate owing to a horizontal vibration component. The rotation direction changes several seconds later, which suggests that the frequencies of the vertical and horizontal components of the vibration are slightly different. In this article, we describe the modeling and theoretical analysis of the two-dimensional motion of a piano string. To this end, the string and soundboard are represented by an equivalent mechanical circuit. The string with two-dimensional movement is divided into two independent strings, each with one-dimensional movement. The vertical and horizontal motions are initialized to have the same frequency and are connected by a bridge that is represented as an ideal transformer. The soundboard is attached to the vertically vibrating string. Once the circuit is excited, the two strings vibrate at slightly different frequencies.
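The detuning mechanism can be illustrated with a toy simulation: two orthogonal sinusoidal components at slightly different frequencies trace an ellipse whose rotation direction reverses as their phase difference drifts through π. The frequencies, detuning, and initial phase below are illustrative assumptions, not values from the model.

```python
import numpy as np

fs = 44100                        # sampling rate (assumed)
f_v, f_h = 440.0, 440.3           # vertical/horizontal frequencies, 0.3 Hz detuning (assumed)
t = np.arange(0, 4.0, 1 / fs)
y = np.cos(2 * np.pi * f_v * t)         # vertical displacement
x = np.cos(2 * np.pi * f_h * t + 0.2)   # horizontal displacement, small initial phase offset

# The sign of the "angular momentum" x*dy/dt - y*dx/dt gives the instantaneous
# rotation direction; its short-time average flips sign as the phase difference
# between the two components drifts through pi.
dx = np.gradient(x, 1 / fs)
dy = np.gradient(y, 1 / fs)
L = x * dy - y * dx
early = L[(t >= 0.0) & (t < 0.5)].mean()   # one rotation direction at the start...
late = L[(t >= 2.0) & (t < 2.5)].mean()    # ...reversed a couple of seconds later
```

With a 0.3 Hz detuning the reversal occurs within a few seconds, consistent with the time scale described above.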
Acoustic features of the fricatives /s/ and /ɕ/ and the affricates /ʦ/ and /ʨ/ spoken by a female native speaker of Japanese were investigated. Discriminant analysis in the time domain revealed that the fricatives (/s/ and /ɕ/) and affricates (/ʦ/ and /ʨ/) are well separated, at a discriminant ratio of 98.0% (n = 508), when using the rise duration and the sum of the steady and decay durations of the consonants' intensity envelope as variables. Discriminant analysis in the spectral domain revealed that the alveolar consonants (/s/ and /ʦ/) and alveolo-palatal consonants (/ɕ/ and /ʨ/) are well separated, at a discriminant ratio of 99.2% (n = 508), when using the mean intensity of the output of a one-third-octave band-pass filter with a center frequency of 3,150 Hz as the variable. In addition, the four consonants were correctly identified at an accuracy rate of 97.2% (n = 508) when using a combination of the discriminant boundaries obtained in the above two analyses. These results suggest that the acoustic features of the four consonants can be represented by the time- and spectral-domain variables described above.
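As a rough illustration of such a two-class discriminant step, the following sketch fits a Fisher linear discriminant to synthetic two-feature data (rise duration versus the sum of steady and decay durations). The class means and spreads are invented for illustration and are not the measured values from this study.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical feature vectors: [rise duration, steady + decay duration] in ms.
# Fricatives are assumed to have a slower intensity rise than affricates.
fric = rng.normal([45, 120], [8, 15], size=(200, 2))
affr = rng.normal([12, 60], [4, 12], size=(200, 2))

def fisher_lda(a, b):
    """Fisher discriminant direction and midpoint threshold for two classes."""
    sw = np.cov(a.T) * (len(a) - 1) + np.cov(b.T) * (len(b) - 1)  # within-class scatter
    w = np.linalg.solve(sw, a.mean(0) - b.mean(0))                # discriminant direction
    thr = 0.5 * (a.mean(0) + b.mean(0)) @ w                       # midpoint threshold
    return w, thr

w, thr = fisher_lda(fric, affr)
correct = (fric @ w > thr).sum() + (affr @ w <= thr).sum()
ratio = correct / 400   # fraction correctly classified on the synthetic data
```

Projecting each sample onto the discriminant direction and thresholding at the midpoint plays the role of the discriminant boundary described in the abstract.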
We introduce a new optimized microphone-array processing method for a spoken-dialogue robot in noisy and reverberant environments. The method is based on frequency-domain blind signal extraction, a signal separation algorithm that exploits the sparseness of speech to separate the target speech and diffuse background noise from the mixture captured by a microphone array. This algorithm is combined with multichannel Wiener filtering so that it can effectively suppress both background noise and reverberation, given a priori knowledge of the room reverberation time. In this paper, we first develop an automatic optimization scheme based on the assessment of musical noise via higher-order statistics and on acoustic model likelihood. Next, to maintain the optimum performance of the system, we propose a multimodal switching scheme that uses the distance information provided by the robot's image sensor and an estimate of the SNR condition. Experimental evaluations have been conducted to confirm the efficacy of this method.
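The trade-off between noise suppression and musical-noise artifacts can be illustrated with a generic single-channel Wiener gain (a simplification, not the paper's multichannel formulation); flooring the gain is one common way to leave more residual noise in exchange for fewer of the isolated spectral peaks heard as musical noise.

```python
import numpy as np

def wiener_gain(noisy_psd, noise_psd, floor=1e-3):
    """Per-frequency Wiener gain computed from power spectral densities.

    The floor limits the attenuation: a higher floor leaves more residual
    noise but reduces the isolated residual peaks perceived as musical noise.
    """
    speech_psd = np.maximum(noisy_psd - noise_psd, 0.0)   # crude speech PSD estimate
    gain = speech_psd / (noisy_psd + 1e-12)               # Wiener gain S / (S + N)
    return np.maximum(gain, floor)

# Two illustrative frequency bins: one speech-dominated, one noise-only.
noisy = np.array([11.0, 1.0])
noise = np.array([1.0, 1.0])
g = wiener_gain(noisy, noise)   # near 10/11 in the first bin, floored in the second
```

In a real system the gains would be applied per time-frequency bin after the blind-extraction stage, with the floor among the parameters a scheme like the one above could optimize.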
Acoustical differences between normal and cross fingerings of the shakuhachi with five tone holes are investigated on the basis of the pressure standing wave along the bore and the input admittance. Cross fingerings on the shakuhachi often yield pitch sharpening in the second register, which is contrary to the conventional understanding that cross fingerings flatten the pitch, and is called the intonation anomaly. It is essential to identify and distinguish the input-admittance spectra associated with the upper and lower bores on the basis of the standing-wave patterns. Spectrum (or mode) switching between the two types of bores provides a clue to the cause of the intonation anomaly. This is illustrated by considering stepwise shifts of the tone holes while keeping the hole-to-hole distances fixed and by comparing the resulting switches in the input-admittance spectra. When spectrum switching occurs, docking of the upper and lower bores forms a higher resonance mode extending over the whole bore, which leads to the intonation anomaly. This spectrum switching under cross fingering can be generalized as a diabatic transition (the Landau–Zener effect) in physics.
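The Landau–Zener analogy can be made concrete with the textbook two-level avoided-crossing model: as a detuning parameter (standing in here for the tone-hole shift) is swept, the two resonance branches repel and exchange character, with a minimum gap set by the coupling. The coupling strength and sweep range below are arbitrary illustrative choices.

```python
import numpy as np

delta = 0.05                        # coupling between upper- and lower-bore modes (assumed)
eps = np.linspace(-1.0, 1.0, 201)   # detuning swept, e.g., by shifting a tone hole

# Eigenfrequencies of the coupled two-mode system [[eps, delta], [delta, -eps]]:
# the branches never cross; they repel with a minimum gap of 2*delta at eps = 0.
lower = -np.sqrt(eps**2 + delta**2)
upper = np.sqrt(eps**2 + delta**2)
gap = upper - lower
```

Far from the crossing each branch follows one bore's mode almost unchanged; near eps = 0 the modes mix, which is the analogue of the spectrum switching between the upper and lower bores.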
A previous study indicated that superdirectivity can be achieved with a microphone-array system consisting of seven microphones and a neural network. In this work, we aim to improve the directivity by optimizing the parameters of the neural network with a genetic algorithm. A computational simulation shows that the optimized system can produce superdirectivity with a half-width of five degrees. The optimized system improves the directivity in that its side lobes are suppressed by more than 10 dB compared with those of the previous system. This suppression is also observed in terms of harmonic distortion. Moreover, in examinations using AM and FM waves as input signals, the optimized system achieves higher performance than the previous one.
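The genetic-algorithm loop itself can be sketched in a few lines. The toy fitness function below merely rewards parameter vectors close to a fixed target and stands in for the directivity objective; the population size, selection rule, mutation scale, and generation count are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
TARGET = np.array([0.3, -0.1, 0.8, 0.0, -0.5])   # hypothetical optimum parameters

def fitness(w):
    """Toy stand-in objective: higher is better, maximized at TARGET."""
    return -np.sum((w - TARGET) ** 2)

pop = rng.uniform(-1, 1, size=(40, 5))   # population of candidate parameter vectors
for gen in range(200):
    scores = np.array([fitness(w) for w in pop])
    parents = pop[np.argsort(scores)[-20:]]              # truncation selection: keep top half
    children = parents[rng.integers(0, 20, 40)].copy()   # clone a first parent per child
    mask = rng.random(children.shape) < 0.5              # uniform crossover mask
    children[mask] = parents[rng.integers(0, 20, 40)][mask]  # mix in a second parent's genes
    children += rng.normal(0, 0.05, children.shape)      # Gaussian mutation
    pop = children
best = max(pop, key=fitness)
```

In the system described above, the chromosome would encode the neural-network parameters and the fitness would be computed from the simulated directivity pattern rather than from a closed-form target.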