As part of an effort to construct a large-scale database of spontaneous speech, the authors planned to assign segmental and prosodic labels to spontaneous Japanese speech. This paper reports the labeling method and its performance. First, the performance of automatic segmental labeling using Hidden Markov Models was verified. About four hours of sample speech was automatically labeled at the phoneme level and compared with the results of hand-labeling. The average difference in label boundaries relative to the hand-labeled data was 14.3 ms. Second, the performance of prosodic labeling with a newly proposed labeling scheme named X-JToBI (eXtended J_ToBI) was verified. Analysis of the labeled data showed that the newly added inventories appeared in the spontaneous speech data and that the rate of inter-labeler agreement increased for nearly all types of labels.
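The two evaluation measures mentioned in this abstract can be illustrated with a minimal sketch. It assumes that the automatic and hand labelings share the same label sequence, so boundaries pair one-to-one, and that the boundary difference is taken as an absolute value; the abstract specifies neither point. The function names and the sample values are hypothetical and are not taken from the paper.

```python
def mean_boundary_difference_ms(auto_boundaries, hand_boundaries):
    """Mean absolute difference (in ms) between paired label boundaries.

    Assumption: both lists hold boundary times in milliseconds for the
    same label sequence, so the i-th entries refer to the same boundary.
    """
    if len(auto_boundaries) != len(hand_boundaries):
        raise ValueError("boundary lists must have the same length")
    if not auto_boundaries:
        raise ValueError("boundary lists must not be empty")
    diffs = [abs(a - h) for a, h in zip(auto_boundaries, hand_boundaries)]
    return sum(diffs) / len(diffs)


def agreement_rate(labels_a, labels_b):
    """Fraction of positions where two labelers assigned the same label.

    Assumption: simple percent agreement over aligned label positions;
    the paper may use a different agreement measure.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    if not labels_a:
        raise ValueError("label lists must not be empty")
    matches = sum(1 for a, b in zip(labels_a, labels_b) if a == b)
    return matches / len(labels_a)


# Hypothetical boundary times (ms) and placeholder labels, for illustration only.
auto = [100.0, 250.0, 400.0]
hand = [110.0, 240.0, 420.0]
print(mean_boundary_difference_ms(auto, hand))
print(agreement_rate(["A", "B", "A", "B"], ["A", "B", "B", "B"]))
```

Percent agreement is the simplest choice here; a chance-corrected measure could be substituted without changing the interface.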
Medical ultrasonography (US) can be useful for analysing movements inside the mouth during speech. CT and MRI scans may provide more detailed information about the anatomical gestures, but the lightness, speed, portability, and safety of US offer advantages to the phonetician. This paper describes the techniques and data of US scanning as they apply to phonetics research, and describes the gestures associated with the phonemes of Japanese speech. In addition, we propose the use of image-processing techniques to derive a laryngeal elevation index (LEI) from the US data, providing a contour that can be used to reveal prosodic information.
The aim of this paper is to investigate the occurrence of vowel devoicing in both the Korean and the Japanese spoken by Korean learners of Japanese. Close vowels between voiceless consonants in Korean and Japanese were analyzed acoustically. A notable characteristic of Korean vowel devoicing is that the rate of devoicing differs considerably among individual subjects, irrespective of dialect. In the Japanese spoken by Korean learners, the rate of devoicing is highly correlated with the rate of devoicing in their Korean.
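The relationship described here, between each subject's devoicing rate in Korean and in Japanese, can be sketched as follows. This is a minimal illustration that assumes a Pearson correlation over per-subject rates; the abstract does not say which correlation measure was used. The function names and the sample rates are hypothetical and are not data from the study.

```python
import math


def devoicing_rate(devoiced_tokens, total_tokens):
    """Proportion of target vowels that were devoiced for one subject."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return devoiced_tokens / total_tokens


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-subject devoicing rates, for illustration only.
korean_rates = [0.10, 0.40, 0.70]
japanese_rates = [0.20, 0.50, 0.80]
print(pearson_r(korean_rates, japanese_rates))
```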
This paper proposes a method of comprehensively and structurally characterizing the segmental aspect of speech. After phoneme HMMs were built from speech samples, the distance between every pair of states across all phoneme HMMs was calculated to form a state-based distance matrix. Using Ward's method of hierarchical clustering, a tree diagram was generated from the distance matrix to visualize the overall structure embedded in the pronunciation. Two kinds of trees were drawn, one from American English (AE) speech samples and one from Japanese English (JE) speech samples. Comparison of the two trees clearly showed the well-known habits of Japanese speakers in speaking English. Using the distance matrix of JE, the compatibility between the phonetic structure of JE pronunciation and the lexical structure of the entire AE vocabulary was estimated based on the Cohort model of isolated word perception.
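The clustering step can be sketched in a few lines. The example below is a minimal illustration of Ward's method applied to a precomputed distance matrix, using the Lance-Williams update on squared distances. It assumes the state-to-state distances are already available and behave like Euclidean distances; the abstract does not state which distance measure between HMM states was used, and the function name `ward_linkage` and the toy matrix are hypothetical.

```python
def ward_linkage(dist):
    """Agglomerative clustering with Ward's criterion (Lance-Williams update).

    dist: symmetric matrix (list of lists) of pairwise distances.
    Returns a list of merges (cluster_a, cluster_b, height, new_size);
    the original items are clusters 0..n-1 and each merge creates
    cluster n, n+1, ... in order.
    """
    n = len(dist)
    # Ward's update operates on squared distances.
    d2 = {(i, j): dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)}
    size = {i: 1 for i in range(n)}
    merges = []
    next_id = n

    def pair(u, v):
        # Keys are stored with the smaller cluster id first.
        return (u, v) if u < v else (v, u)

    while len(size) > 1:
        # Merge the pair of active clusters with the smallest Ward distance.
        (a, b), best = min(d2.items(), key=lambda kv: kv[1])
        na, nb = size[a], size[b]
        new_d = {}
        for k, nk in size.items():
            if k in (a, b):
                continue
            dak = d2[pair(a, k)]
            dbk = d2[pair(b, k)]
            new_d[k] = ((na + nk) * dak + (nb + nk) * dbk - nk * best) / (na + nb + nk)
        # Drop every distance involving the two merged clusters.
        d2 = {p: v for p, v in d2.items() if a not in p and b not in p}
        del size[a], size[b]
        for k, v in new_d.items():
            d2[(k, next_id)] = v
        size[next_id] = na + nb
        merges.append((a, b, best ** 0.5, na + nb))
        next_id += 1
    return merges


# Toy distance matrix for four hypothetical HMM states: two tight pairs far apart.
positions = [0.0, 1.0, 10.0, 11.0]
toy = [[abs(p - q) for q in positions] for p in positions]
for merge in ward_linkage(toy):
    print(merge)
```

The merge list has the same information as a tree diagram: each entry names the two clusters joined and the height at which they join. For real data, a library routine such as SciPy's hierarchical clustering would normally be preferred over this hand-written loop.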
This paper is concerned with the tone of loanwords in the Kagoshima dialect. The Kagoshima dialect has a two-pattern tone system: Type-A is a falling tone, and Type-B is a rising tone. Most loanwords are pronounced with Type-A in Kagoshima, but among young people the number of Type-B loanwords has been increasing. This suggests that young people in Kagoshima are influenced by the Tokyo accent, which contains many level-tone loanwords.
Metalinguistic awareness of syllabic morae is the ability to recognize a syllabic mora (in this study, a nasal coda, a geminate stop consonant, or a long vowel) as one mora. It has been said to play a significant role in the acquisition of Japanese Kana letters, especially those for syllabic morae. However, the process of acquisition has yet to be made clear. The goal of this study is therefore to examine how such metalinguistic awareness of syllabic morae develops in young Japanese children, by comparing children living in Yokohama, where morae are counted, with children living in Fukaura, where they are not. Two small studies were conducted to examine whether the children use morae or syllables when segmenting words that contain special morae. The results show that the children living in Fukaura recognize syllabic morae less often than those in Yokohama, especially for long vowels and nonsense words. This suggests that input is significant for young children in developing their metalinguistic awareness of syllabic morae.
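The difference between mora-based and syllable-based segmentation can be made concrete with a small sketch. It works on kana strings and assumes that long vowels are written with the long-vowel mark (ー); long vowels written with an ordinary vowel kana are not detected. The function names and example words are hypothetical and are not the stimuli used in the study.

```python
# Small kana that combine with the preceding kana into a single mora.
SMALL_KANA = set("ゃゅょぁぃぅぇぉャュョァィゥェォ")
# Syllabic (special) morae: nasal coda, geminate marker, long-vowel mark.
SPECIAL_MORAE = set("んっンッー")


def split_morae(word):
    """Split a kana string into morae; special morae count as one mora each."""
    morae = []
    for ch in word:
        if ch in SMALL_KANA and morae:
            morae[-1] += ch
        else:
            morae.append(ch)
    return morae


def split_syllables(word):
    """Split a kana string into syllables by attaching each special mora
    to the mora that precedes it."""
    syllables = []
    for mora in split_morae(word):
        if mora in SPECIAL_MORAE and syllables:
            syllables[-1] += mora
        else:
            syllables.append(mora)
    return syllables


for word in ["パンダ", "きって", "コーヒー"]:
    print(word, split_morae(word), split_syllables(word))
```

A child who segments by mora would divide パンダ into three units, while a child who segments by syllable would divide it into two.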