The signal-processing characteristics of the auditory nervous system for monosyllabic vowels were investigated by means of an electronic model of the auditory system. The model consists of a pre-emphasis stage, a basilar membrane, hair cells, and primary and secondary neurons (Fig. 1). Each neuron has lateral connections so that it exhibits the response area and inhibitory areas observed in physiological experiments (Figs. 10, 18). (1) The response of the neurons to vowels forms a pattern with several peaks. The characteristic frequency (CF) of the neurons located at the peaks of the pattern corresponds approximately to the formant frequencies of the vowels (Figs. 2, 3, 4). When adjacent formant frequencies come closer together than the frequency resolution of the nervous system, the response changes from bimodal to unimodal (Figs. 4, 6, 7). (2) A rippled waveform pulsating at the pitch period is transmitted in neurons whose CF is much higher than the pitch frequency (Figs. 4, 5, 8, 9, 10, 11, 12, 13). This is because inhibition is not effective for the AC component of the response. Pitch information therefore appears prominently in the secondary neurons even though lateral inhibition narrows their frequency response band. This property differs markedly from that of an ordinary frequency analyzer (Fig. 14). (3) The response of the neurons results from mutual inhibition between formant components. Here the inhibition due to the lower formant frequency component is the more effective, owing to the asymmetry of the neurons' inhibitory areas and the decrease of the higher frequency components in speech sounds (Figs. 19, 20). (4) Two kinds of frequency emphasis were compared as preprocessing. Pre-emphasis (I) was designed to give approximately uniform hair-cell output for speech sounds, in order to compensate for the limited intensity range (Fig. 16, PE (I)); Pre-emphasis (II) was set to give equal output of the nervous system for an input signal whose frequency characteristic is similar to the 30-phon equal-loudness curve (Fig. 16, PE (II)). With Pre-emphasis (II), the inhibition due to the lower frequency component decreases (Fig. 17), and in the response to vowels the difference between /u/ and /o/ becomes clear (Fig. 20). In addition, the response to the pitch component decreases or disappears (Fig. 20). Taken together with (2), these results suggest that pitch information is processed as temporal rather than spatial information. (5) An investigation of the variation in the responses to ten speakers verified that the five vowel sounds are characterized by the peak positions of the response pattern.
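
The bimodal-to-unimodal transition described in (1) can be illustrated with a minimal numerical sketch. This is not the electronic model of the paper: the Gaussian excitation shape, the symmetric inhibition, and every name and parameter value (`excitation`, `lateral_inhibition`, `find_peaks`, `sigma`, `offset`, `weight`) are assumptions chosen only for illustration, and the paper's inhibitory areas are in fact asymmetric.

```python
import math


def excitation(formant_channels, n_channels=100, sigma=4.0):
    """Hypothetical excitation pattern along the CF axis.

    Each formant contributes one Gaussian bump centred on its channel;
    sigma stands in for the frequency resolution (an assumed value).
    """
    return [
        sum(math.exp(-((i - p) ** 2) / (2.0 * sigma ** 2)) for p in formant_channels)
        for i in range(n_channels)
    ]


def lateral_inhibition(pattern, offset=4, weight=0.3):
    """Subtract a weighted copy of the two flanking channels, then rectify.

    Symmetric inhibition is an assumption made for simplicity.
    Channels outside the array are treated as silent.
    """
    n = len(pattern)
    out = []
    for i in range(n):
        left = pattern[i - offset] if i - offset >= 0 else 0.0
        right = pattern[i + offset] if i + offset < n else 0.0
        out.append(max(0.0, pattern[i] - weight * (left + right)))
    return out


def find_peaks(pattern):
    """Indices of local maxima of the response pattern."""
    return [
        i
        for i in range(1, len(pattern) - 1)
        if pattern[i] > pattern[i - 1] and pattern[i] >= pattern[i + 1]
    ]


# Two well-separated "formants" versus two closer than the assumed resolution.
far = find_peaks(lateral_inhibition(excitation([30, 70])))
near = find_peaks(lateral_inhibition(excitation([48, 52])))
print("well separated:", far)
print("closely spaced:", near)
```

With these assumed values the well-separated pair yields two response peaks, while the closely spaced pair merges into a single peak midway between the two formant channels.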