Weakening of stop articulation in Japanese voiced plosives was analyzed using the phonetically annotated part of the Corpus of Spontaneous Japanese (CSJ). The weakening of /b/ and /d/ into [β] and [ð] was best described as a function of TACA (time allotted for consonant articulation), as was the case for the affricate-fricative variation of Japanese /z/. The location of the voiced plosive in a linguistic unit was of secondary importance as a factor in the variation, and it was the location in a higher-level unit such as the accentual phrase or utterance that played the crucial role. The weakening of /g/ into [ɣ] or [ŋ], on the other hand, differs in that it must be treated separately depending on whether the phoneme is immediately preceded by a moraic nasal /N/. When /g/ was preceded by /N/, the TACA-RSA (rate of stop articulation) relationship reached a plateau much earlier (at around 70%) than for /b/, /d/, and /g/ not preceded by /N/ (where RSA values reach the level of 90%). The curve of the TACA-RSA relationship also changed systematically, reflecting the complexity of phonological contrast at the point of articulation of the phoneme in question: the more complex the contrast, the earlier the curve reaches a plateau. Statistical modeling by logistic regression showed that the variation could be predicted with 68-76% accuracy (closed data) using the TACA information alone. The accuracy reached 72-81% when TACA and all other linguistic and extra-linguistic variables were used.
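As a rough illustration of the kind of model the abstract above mentions, the following minimal sketch fits a one-predictor logistic regression of stop realization on TACA and reports closed-data accuracy, that is, accuracy measured on the same tokens used for fitting. It is not the study's implementation. The TACA values, the labels, and the names `fit_logistic`, `predict_stop`, and `accuracy` are hypothetical and serve only to show the technique; the fitting method (batch gradient descent) is also an assumption.

```python
import math

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit P(stop | TACA) = sigmoid(w * z + b) by batch gradient descent,
    where z is the standardized TACA value."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    zs = [(x - mean) / sd for x in xs]
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for z, y in zip(zs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * z + b)))
            grad_w += (p - y) * z  # gradient of the log-loss w.r.t. w
            grad_b += (p - y)      # gradient of the log-loss w.r.t. b
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b, mean, sd

def predict_stop(taca, model):
    """Return True if the model predicts a stop realization at this TACA."""
    w, b, mean, sd = model
    p = 1.0 / (1.0 + math.exp(-(w * (taca - mean) / sd + b)))
    return p >= 0.5

def accuracy(xs, ys, model):
    """Proportion of tokens whose predicted realization matches the label."""
    hits = sum(int(predict_stop(x, model)) == y for x, y in zip(xs, ys))
    return hits / len(xs)

# Hypothetical toy data, NOT taken from the CSJ:
# TACA in milliseconds, and 1 = realized as a stop, 0 = weakened.
taca_ms = [10, 20, 30, 40, 50, 60, 70, 80]
is_stop = [0, 0, 0, 1, 0, 1, 1, 1]

model = fit_logistic(taca_ms, is_stop)
closed_accuracy = accuracy(taca_ms, is_stop, model)
print(f"slope on standardized TACA: {model[0]:.2f}")
print(f"closed-data accuracy: {closed_accuracy:.2f}")
```

A positive slope corresponds to a stop realization becoming more likely as TACA increases. Adding further predictors, as in the fuller model described in the abstract, would mean extending the single weight to one weight per variable.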
This paper discusses a cross-linguistically uncommon vowel attested in Southern Ryukyuan: a type of unrounded central high vowel with a certain laminal modification, which Uemura (2000) labels a "laminal vowel" on the basis of his subjective observation. Drawing on new palatographic and linguographic data from the Miyako-Tarama variety of Southern Ryukyuan, I examine Uemura's description and describe important phonetic details of the "laminal vowel" and other relevant sounds. To arrive at an adequate phonetic interpretation of the "laminal vowel", I use the phonetic classificatory distinction of [flat] vs. [grooved] proposed by Pike (1971).
An experiment on impression rating of paralinguistic information was conducted, using as speech stimuli categories of paralinguistic information expressed by teachers during teaching. The aim of the study was to determine whether listeners differ in their impression ratings of paralinguistic information conveyed in the spontaneous speech of teachers. Six categories of paralinguistic information were prepared as speech stimuli, and subjects' responses to 45 pairs of assessment words were recorded. The results showed that subjects varied in their impressions of the categories of paralinguistic information.
The glottal area function provides important information for clarifying the physical mechanisms of vocal fold vibration and for investigating voice quality quantitatively, but its estimation has been technically difficult. In this paper, the glottal area function was estimated in vivo by stereo-endoscopic measurement of the larynx combined with a high-speed digital imaging technique. In the present data from a female speaker, the high-speed camera captured images at a resolution of 768 (horizontal) × 352 (vertical) pixels at a frame rate of 3750 fps for a sample duration of 10.12 s. The glottal length, width, and area of the participant were estimated at three different fundamental frequencies (F0s).
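To give a sense of the temporal resolution implied by the recording parameters in the abstract above, the short sketch below works out the total number of frames and the number of frames available per glottal cycle. The frame rate and duration come from the abstract. The F0 value of 250 Hz is a hypothetical example, since the abstract does not state the three F0s used, and the function names are illustrative only.

```python
def total_frames(frame_rate_fps, duration_s):
    """Number of frames captured over the whole recording."""
    return round(frame_rate_fps * duration_s)

def frames_per_cycle(frame_rate_fps, f0_hz):
    """Average number of frames falling within one glottal cycle."""
    return frame_rate_fps / f0_hz

FRAME_RATE_FPS = 3750   # from the abstract
DURATION_S = 10.12      # from the abstract
EXAMPLE_F0_HZ = 250.0   # hypothetical F0, not reported in the abstract

n_frames = total_frames(FRAME_RATE_FPS, DURATION_S)
cycle_frames = frames_per_cycle(FRAME_RATE_FPS, EXAMPLE_F0_HZ)
print(n_frames)      # 37950
print(cycle_frames)  # 15.0
```

Under this assumed F0, each vibratory cycle would be sampled by about 15 images; a higher F0 would leave proportionally fewer frames per cycle.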
This study employs MRI motion imaging to investigate vowel-consonant coarticulation during utterances with /t/, /c/, /d/, /z/, /k/, and /g/ in the Japanese syllabary. Image analyses were conducted on the variation in horizontal and vertical tongue position. The results show the following. 1) Variation in tongue position and shape was larger for consonants than for vowels. 2) Articulatory positions for consonants varied depending on the concentration of consonants in specific regions. 3) The so-called velar consonants /k/ and /g/ in Japanese were realized as post-palatal consonants. 4) Tongue positions for vowels were higher after velar consonants and lower after alveolar consonants. 5) The five Japanese vowels fell into two groups, anterior-mid (/i/, /e/, /u/) and posterior-low (/a/, /o/), with /u/ as a non-back vowel and /a/ as a back vowel.
This paper describes a mechanical talking robot that resembles the human speech production apparatus, together with its control mechanism. The talking robot, Waseda Talker, has mechanical replicas of the vocal folds, tongue, jaw, and lips, and it is capable of producing vowel and consonant sounds in a human-mimetic manner. The source sounds are produced by airflow from the lungs to the glottis, and the resonance characteristics of the vocal tract are controlled by changing the geometries of the articulators. Each component organ in the talking robot has many degrees of freedom and requires a high-level control design to realize continuous speech. A new control method based on articulatory motion data obtained from electromagnetic articulography (EMA) is presented, together with experimental results from continuous speech synthesis using the most recent model of the robot.