Journal of the Robotics Society of Japan
Online ISSN : 1884-7145
Print ISSN : 0289-1824
ISSN-L : 0289-1824
Paper
Phoneme Acquisition based on Vowel Imitation Model using Recurrent Neural Network and Physical Vocal Tract Model
Hisashi Kanda, Tetsuya Ogata, Toru Takahashi, Kazunori Komatani, Hiroshi G. Okuno

2011 Volume 27 Issue 7 Pages 802-813

Abstract
This paper proposes a continuous vowel imitation system that explains the process of phoneme acquisition by infants from the dynamical systems perspective. Most existing models of this process deal with discrete phoneme sequences. Human infants, however, have no innate knowledge of phonemes; they perceive speech sounds as continuous acoustic signals. The imitation target of this study is therefore continuous acoustic signals containing an unknown number and variety of phonemes. The key ideas of the model are (1) the use of a physical vocal tract model, the Maeda model, to embody the motor theory of speech perception, (2) the use of a dynamical system, the Recurrent Neural Network with Parametric Bias (RNNPB), trained with both the dynamics of the acoustic signals and the articulatory movements of the Maeda model, and (3) a method for segmenting a temporal sequence using the prediction error of the RNNPB model. Experiments with our model demonstrated the following results: (a) the self-organization of the vowel structure into attractors of the RNNPB model, (b) improved vowel imitation using the movements of the Maeda model, and (c) the generation of clear vowels from a babbling process trained with only a few random utterances. These results suggest that our model reflects the process of phoneme acquisition.
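The third key idea, segmenting a continuous sequence by the prediction error of a learned forward model, can be sketched as follows. This is a minimal illustration only: the paper's actual model is an RNNPB trained on acoustic and articulatory dynamics, whereas here a trivial identity predictor and a hand-picked threshold stand in as assumptions, so that a boundary is hypothesized wherever the one-step prediction error spikes.

```python
import numpy as np

def prediction_errors(seq, predict):
    """Squared error of one-step-ahead predictions along the sequence."""
    errs = []
    for t in range(len(seq) - 1):
        pred = predict(seq[t])  # forward model predicts the next frame
        errs.append(float(np.sum((seq[t + 1] - pred) ** 2)))
    return errs

def segment(errs, threshold):
    """Hypothesize segment boundaries where the error exceeds threshold."""
    return [t + 1 for t, e in enumerate(errs) if e > threshold]

# Toy sequence with two constant regimes; an identity predictor fails
# only at the regime switch, so the single boundary is found there.
seq = np.array([[0.0]] * 5 + [[1.0]] * 5)
errs = prediction_errors(seq, predict=lambda x: x)
print(segment(errs, threshold=0.5))  # → [5]
```

In the paper's setting the predictor would be the trained RNNPB itself, and segments delimited this way become candidate phoneme-like units for imitation.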
© 2011 The Robotics Society of Japan