The x-ray microbeam system is a unique instrument for measuring the movements of multiple flesh-points on the speech articulators, and the recently updated system at the University of Wisconsin provides the most reliable means among all available methods. This review summarizes the system, the author's experience with it, and recent studies by several researchers. A notable recent advance is the development of a speech production database of English and Japanese speakers, which elucidates articulatory variability, speaker characteristics, and the dynamic behavior of soft-tissue speech organs through comparisons across a large number of subjects.
Speech is produced by articulating speech organs such as the jaw, tongue, and lips. We have developed an articulatory-based speech synthesis model that converts a phoneme string into a continuous acoustic signal by mimicking the human speech production process. This paper describes a computational model of the speech production process that involves a motor process to generate articulatory movements from a motor task sequence and an articulatory-to-acoustic mapping to determine the vocal-tract acoustic characteristics. A method for recovering articulatory parameters from speech acoustics is also described within a framework of articulatory-based speech analysis and synthesis.
A very high-quality speech analysis-modification-synthesis method is introduced as a versatile tool for speech and auditory perception research. The proposed method is based on the simple and appealing concept of the channel VOCODER, which was proposed 60 years ago. Interference due to signal periodicity is carefully removed from the extracted spectral envelope using a pitch-adaptive spline smoothing method. This virtually perfect removal of periodicity interference enables flexible manipulation of speech parameters over a fairly wide range. Examples illustrate how the proposed method works and where it can be applied to promote perception research.
Studying infant speech perception is valuable for elucidating the process of language acquisition. Over the past decades, there have been many studies on the perception of segmental features related to phonemic contrasts, as well as on the perception of supra-segmental features related to the prosodic pattern of utterances. For these studies, several types of methods have been developed to measure infant speech perception abilities, and researchers choose an appropriate method depending on the purpose of the study. In this paper, the behavioral measures most frequently used in infant speech-perception studies are reviewed: the high-amplitude sucking procedure, the operant headturning procedure, and the headturn preference procedure.
Magnetoencephalography (MEG) measures the weak magnetic fields generated by neuronal activity in the human brain during various kinds of information processing. It enables researchers to estimate which part of the brain is active at which phase of information processing, with high temporal and spatial resolution and without any biohazard. Neuromagnetic approaches to phonetics are providing new insights into the neuronal mechanisms, including bottom-up and top-down processes, responsible for phonetic categorization and spoken language processing, on the basis of which some of the controversial issues in conventional phonetics may be resolved in the future.
The recent development of non-invasive functional brain imaging techniques, such as positron emission tomography (PET) and functional magnetic resonance imaging (fMRI), enables us to visualize brain activity (or cerebral blood flow: CBF) during linguistic and cognitive tasks. Functional brain imaging studies on the comprehension and generation of spoken and written words, including phonological processing, are reviewed. Despite the short histories of PET and fMRI, the number of studies in this field is increasing markedly. Collaboration with researchers in linguistics (including phonetics and phonology), cognitive psychology, and related fields will be needed for the further development of research on functional brain imaging.
In this paper, I show that pitch patterns influence native speakers of Korean in their perception of voiced and voiceless sounds in Japanese. Speakers of Korean, a language that lacks a contrast between voiced and voiceless sounds, rely on differences in pitch patterns rather than on voicing itself. More specifically: 1. It was confirmed that, roughly speaking, higher pitch occurs in association with voiceless sounds, while lower pitch occurs in association with voiced sounds. 2. It was confirmed that Korean speakers have little difficulty perceiving voiceless sounds when they are accompanied by a clearly high pitch, or voiced sounds when they are accompanied by a clearly low pitch. In other cases, the rate of perception errors was high. 3. In a follow-up experiment, the pitch pattern of voiced sounds was substituted with that of their voiceless counterparts, and that of voiceless sounds with that of their voiced counterparts. The results showed that Korean speakers are apt to perceive voiceless sounds as voiced when the accompanying pitch is low, and tend to perceive voiced sounds as voiceless when the accompanying pitch is high. Japanese speakers showed the same tendency, but with a lower rate of perception errors.
This book consists of two separate studies written by two authors under the titles "Universality and Individuality of Phonological Structures" and "Aspects of Phonological Processes and Prosodic Structure". The book takes up several phonological issues in Japanese and English and examines the differences and similarities between the two languages. It suggests that Japanese and English have much in common at an abstract level and that the same constraints are at work behind both. The book pursues the universality of languages and provides an excellent introduction to Optimality Theory.