The aim of this paper is to clarify the relationships among the phrase-final tones of questioning utterances, pragmatic factors, and linguistic factors. We extracted questioning utterances from 54 conversations by 25 speakers and classified them according to the degree of information request and the sentence-final particles. Tones were classified into five types based on their acoustic features. The main findings are as follows: the rising tone and the pragmatic factor stand in a coordinating relationship, whereas the rising tone and the linguistic factor stand in a complementary one.
Language is unique to humans, and no equivalent cognitive/communication system has been found in other animals; nevertheless, many similarities have been reported, such as "syntax" in birdsong or "semantics" in monkey calls. Phonology is considered a unique area of the human faculty of language, but recent studies of both monkeys and birds have revealed similarities and have further claimed "new" precursors for the faculty of language. Many linguistic components, including phonology, could have precursors, that is, pre-existing abilities in nonhuman animals, and the acquisition of these precursors could have acted as a pre-adaptation for the emergence of human language.
A set of hypotheses regarding the evolutionary emergence of human speech is proposed. Animal acoustic communication probably originated from noises accompanying respiratory gestures. These noises and orofacial movements were gradually ritualized, forming fixed action patterns for communication. Combinations of calls were used by young animals to induce parental behavior. This effect was then exploited by male animals to attract females. The extreme form of such vocalizations is song, used for mate attraction and territorial defense in many species. Songs are an honest signal of vigor. Songs and behavioral contexts were gradually associated through a mutual segmentation process, and proto-words emerged in ancestral hominids.
Budgerigars are small parrots, and vocal mimicry is one of the most interesting behaviors in this species. This article briefly reviews studies of vocal behavior and the central nervous vocal control system of budgerigars. Other interesting characteristics of this species are behavioral contagion or imitation, and rhythmic synchronization to metronomic stimuli. All of these behaviors involve sensory-motor coordination and/or the transformation of sensory inputs into motor outputs. The neural and psychological substrates of these behaviors may be linked to one another and could give us insight into the biological basis and evolution of vocal learning and communication.
Preverbal vocalizations in human infants change drastically during the first year. Previous studies have classified preverbal vocalizations into several stages based on their acoustic characteristics and the infant's age. Vocal learning is the acquisition of new vocal patterns based on social and environmental experiences, and some animals are reported to exhibit vocal learning. In this paper, we view language as sequential vocal patterns and discuss the developmental changes in preverbal vocalizations in human infants from an evolutionary point of view.
Mothers use certain segments selectively when talking to infants and young children. We examined why particular segments are favored over others by analyzing college-aged Japanese adults' ratings of how good a nonsense word sounds as an item of infant-directed vocabulary (IDV). Adults' ratings were highly consistent with mothers' actual use of segments in IDV, as well as with the predictions of Jakobson's principle of maximal contrast. These results suggest that Japanese adults possess an intuitive sense of what a good IDV item should sound like, one that is part of their underlying knowledge of Japanese phonology rather than something learned from children.
Hauser, Chomsky and Fitch (2002) claim that the presence/absence of recursion divides the organisation of the language faculty into two parts: FLB and FLN. The origins of properties in FLB are thought to be shared by non-human species, whereas FLN contains the species-specific operation Merge, which applies repeatedly to syntactic objects to generate recursively-structured expressions. However, this paper claims that Merge applies not only to (morpho)syntactic objects but also to phonological primitives that make up the phonological structure of morphemes. Accordingly, phonological categories are engaged not only in the externalization of internally-constructed expressions but also in internal computation.
We consider here several properties of phonological stress systems, including the midpoint pathology, an unattested pattern in which stress is confined to a word-medial syllable in short words but reverts to an edge-based window in longer words. Previous attempts have been made to rule out midpoint systems by eliminating the phonological constraints that yield them, or by alluding to difficulties in learning them. We suggest that a preference for representing word edges in memory and limits on subitization—evolutionarily older “fossil” abilities which are neither specific to humans nor to language—are sufficient to rule out the midpoint pathology. We take the same approach to motivate accentual window size and some left-right asymmetries observed in the typology of attested stress systems. This approach highlights the relevance of descent in accounting for human cognition, as well as the benefits that evolutionary thinking can bring to the study of language.
The metaphor of evolution is commonly invoked in phonological theory, particularly with regard to the notion of markedness. However, Blevins' Evolutionary Phonology rejects markedness, attributing change instead to divergent parses of perceptual ambiguity. The current paper seeks to enhance the power of the Evolutionary model by investigating an additional type of ambiguity based on the linearity problem in speech perception. It is shown how divergent parses of CV transitions, encoded as the Vocalic Onset node in the representations of the Onset Prominence framework, account for cross-linguistic differences in the evolution of seemingly unrelated phonetic properties.
The study of language origins and evolution was considered a pipe dream for more than a century. Even in the 1990s, external and internal barriers to academic research on this topic remained, causing serious clashes between linguists and biologists. However, the Merge-only hypothesis of human language evolution has opened a door to resolving these clashes and establishing scientific methods for modern evolutionary linguistics. This article introduces such a new scenario and offers empirical evidence for it from phonology. Specifically, I argue that the "Third Factor" played a key role in the shift from Merge and the SM interface in proto-language to human phonology. This view makes explicit why we can explore language origins, the hardest problem in science, through languages at hand, and helps us establish specific methods for exploring them in phonology. Empirical tests of this view involve the typology of palatal phonotactics in Japanese and English.
The level of phonetic realization of lexical accent (reduced level, which, in the case of words with an accent kernel, is exhibited by a low F0 value, or full level) is one of the major determinants of sentence intonation in Tokyo Japanese. Three experiments were conducted to provide evidence to support the author's previous claim that the reduction of accent level is triggered by semantically restrictive modification, and not by syntax. In Experiment 1, F0 values of the first two constituents of [[A[BC]]D] and [A[[BC]D]] were compared. The results indicated the irrelevance of overall syntactic branching structure for the accent level. In Experiments 2 and 3, peak F0 values of nouns with an accent kernel in semantically restrictive and non-restrictive conditions were investigated acoustically and perceptually. The results supported the prediction: nouns have lower F0 values when they are semantically restricted by the directly preceding modifier than when they are not.
The C/D model (Fujimura 1992, 2007) is an explicit framework that calculates the continuous physical gestures and other quantitative phonetic information of speech sounds from inputs that solely consist of qualitative phonological information. Fujimura proposes that the inputs to the C/D model are syllables, themselves consisting of sets of unary and underspecified phonological features, instead of a set of binary features used in many standard phonological theories. This paper slightly revises this original formulation by Fujimura and instead proposes that syllables can be further defined in terms of “mora sets”, in addition to unary features that define the qualitative characteristics of the Impulse Response Functions (IRFs). The argument is developed based on discussion of vowel devoicing in Japanese.