Perception of size modulated vowel sequence: Can we normalize the size of continuously changing vocal tract?
Minoru Tsuzaki, Chihiro Takeshima, Toshio Irino
2009 Volume 30 Issue 2 Pages 83-88


Changes in vocal tract size shift the formant frequencies, even when the shape of the vocal tract is the same and the spoken vowels are assigned to the same category. Several studies have demonstrated that vocal tract size can be normalized in a bottom-up manner. To investigate how fast this process works, the identification of vowel sequences was examined under conditions where the size was sinusoidally modulated at several frequencies (0.24–62.50 Hz). Performance changed slightly but significantly with modulation frequency, and the dependence was not monotonic: performance dropped for modulation around 4 Hz. This nonmonotonic function could not be predicted by the simple assumption of a single size estimator that requires a certain processing time. The mismatch was most prominent at high frequencies, where a deterioration was predicted because of the limited processing time but the actual performance showed a recovery. This indicates that the processing mode for size modulation switches at around 4 Hz. Below 4 Hz, the auditory system can successfully normalize the size change. Above 4 Hz, the auditory system segregates the sounds using the size cue, and the recognition of each vowel is not critically affected.
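
As a rough illustration of what sinusoidal size modulation means for the formants, the following Python sketch computes a time-varying size factor and the resulting formant frequencies. It is not the stimulus-generation procedure of the study: the modulation depth, the use of a logarithmic size axis, the reference formant values, and the function names `size_factor` and `scaled_formants` are all assumptions made for this example. The only relation taken as given is that formant frequencies scale inversely with vocal tract size when the shape is fixed.

```python
import math

def size_factor(t, mod_freq_hz, depth=0.2):
    """Relative vocal tract size at time t (seconds); 1.0 is the reference size.

    Assumption: the size is modulated sinusoidally on a log axis, so that
    stretching and shrinking are symmetric. `depth` is a hypothetical value.
    """
    return math.exp(depth * math.sin(2.0 * math.pi * mod_freq_hz * t))

def scaled_formants(formants_hz, factor):
    """Scale formants for a tract `factor` times the reference size.

    With the shape fixed, a longer tract lowers every formant by the same ratio.
    """
    return [f / factor for f in formants_hz]

# Illustrative formant values for a reference-size vowel (not from the study).
reference_formants = [700.0, 1200.0, 2500.0]

# Slowest rate, the rate near the performance dip, and the fastest rate.
for mod_freq in (0.24, 4.0, 62.5):
    quarter_period = 1.0 / (4.0 * mod_freq)  # time of the first size maximum
    factor = size_factor(quarter_period, mod_freq)
    formants = scaled_formants(reference_formants, factor)
    print(mod_freq, round(quarter_period, 4), [round(f) for f in formants])
```

The loop shows that the extreme formant values are the same at every modulation frequency; only the time taken to reach them changes, from roughly one second at 0.24 Hz to a few milliseconds at 62.5 Hz.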

