2P1-G03 Segmenting Sound Signals and Articulatory Movement using Recurrent Neural Network toward Phoneme Acquisition

Hisashi KANDA; Tetsuya OGATA; Kazunori KOMATANI; Hiroshi G. OKUNO

doi:10.1299/jsmermd.2008._2P1-G03_1

Abstract

This paper proposes a computational model for phoneme acquisition by infants. Infants perceive speech not as discrete phoneme sequences but as continuous acoustic signals. One of the critical problems in phoneme acquisition is how to segment this continuous speech. The key idea for solving this problem is that articulatory mechanisms such as the vocal tract help human beings perceive sound units corresponding to phonemes. To segment acoustic signals with articulatory movement, we implemented our system using a physical vocal tract model, called the Maeda model, and a segmentation method based on a Recurrent Neural Network with Parametric Bias (RNNPB). This method determines segmentation boundaries in a sequence using the prediction error of the RNNPB model, and the PB values obtained by the method can be regarded as encoding a kind of phoneme. Experimental results demonstrated that our system could self-organize the same phonemes in different continuous sounds. This suggests that our model reflects the process of phoneme acquisition.
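
The idea of placing segmentation boundaries where prediction error is high can be illustrated with a minimal sketch. This is not the RNNPB procedure from the paper: it assumes the per-step prediction errors have already been produced by some predictive model, and it swaps in a simple fixed threshold. The names `segment_by_prediction_error`, `split_segments`, `threshold`, and `min_length` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def segment_by_prediction_error(
    errors: Sequence[float],
    threshold: float,
    min_length: int = 1,
) -> List[int]:
    """Return the start index of each segment.

    Assumption (not from the paper): a boundary is placed at step t when
    errors[t] exceeds a fixed threshold and at least min_length steps have
    passed since the previous boundary.
    """
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    boundaries = [0]  # the first segment always starts at step 0
    for t, err in enumerate(errors):
        if t == 0:
            continue
        if err > threshold and t - boundaries[-1] >= min_length:
            boundaries.append(t)
    return boundaries


def split_segments(n_steps: int, boundaries: List[int]) -> List[Tuple[int, int]]:
    """Turn segment start indices into half-open (start, end) ranges."""
    ends = boundaries[1:] + [n_steps]
    return list(zip(boundaries, ends))


if __name__ == "__main__":
    # Made-up error values, used only to exercise the functions.
    errors = [0.1, 0.1, 0.9, 0.1, 0.1, 0.8, 0.85, 0.1]
    starts = segment_by_prediction_error(errors, threshold=0.5, min_length=2)
    print(starts)
    print(split_segments(len(errors), starts))
```

In a model of the kind the abstract describes, each resulting segment would then be associated with its own PB values; that step is outside the scope of this sketch.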
