In automatic voice response systems that insert a large number of words into fixed sentences, such as voice-guided car navigation systems, one of the most important problems is adjusting the fundamental frequency (F0) contour of the inserted word to suit the F0 context of the fixed sentence. In Mandarin Chinese, the effects of tone and intonation on F0 contours must be represented separately. We proposed a scheme to solve this problem in terms of a word-level F0 range (WF0R) and a set of relative F0 change fields. The WF0R at any position in a sentence is a tone-independent general F0 range that represents the intonational effect, whereas an F0 change field (F0CF) is an F0 range that accounts for the result of both the tone combination of words and the intonation. The relative F0CF is defined with reference to the WF0R and represents the tonal effect on F0. In this paper, we statistically examine the invariance of the relative F0CF using speech data from several speakers. An analysis of four native speakers' utterances of 160 disyllabic words in the initial, middle and final parts of three carrier sentences, recorded on 2 or 3 days, shows that: (1) Chinese speakers read words in the same sentence position with stable relative F0CFs, even on different days; and (2) the relative F0CFs in the middle position of a sentence are generally the same as those in the initial position but slightly different from those in the final position.
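
As an illustration only, the idea of expressing an F0CF relative to a WF0R can be sketched as a range normalization. The abstract does not give the actual formula, so the log-frequency scale, the function names `relative_f0cf` and `absolute_f0cf`, and the (low, high) tuple representation in Hz are all assumptions made for this sketch rather than the authors' definition.

```python
import math


def relative_f0cf(f0cf_hz, wf0r_hz):
    """Map an F0 change field onto a scale where the WF0R spans 0..1.

    ASSUMPTION: both ranges are (low, high) pairs in Hz and the mapping
    is linear in log frequency. The abstract does not specify this.
    """
    w_lo, w_hi = wf0r_hz
    f_lo, f_hi = f0cf_hz
    if not 0 < w_lo < w_hi:
        raise ValueError("WF0R must satisfy 0 < low < high")
    if not 0 < f_lo <= f_hi:
        raise ValueError("F0CF must satisfy 0 < low <= high")
    span = math.log(w_hi) - math.log(w_lo)
    rel_lo = (math.log(f_lo) - math.log(w_lo)) / span
    rel_hi = (math.log(f_hi) - math.log(w_lo)) / span
    return rel_lo, rel_hi


def absolute_f0cf(rel_f0cf, wf0r_hz):
    """Inverse mapping: place a relative F0CF inside a given WF0R (Hz).

    ASSUMPTION: same log-linear convention as relative_f0cf.
    """
    w_lo, w_hi = wf0r_hz
    if not 0 < w_lo < w_hi:
        raise ValueError("WF0R must satisfy 0 < low < high")
    span = math.log(w_hi) - math.log(w_lo)
    r_lo, r_hi = rel_f0cf
    return (math.exp(math.log(w_lo) + r_lo * span),
            math.exp(math.log(w_lo) + r_hi * span))
```

Under these assumptions, a relative F0CF measured for a tone combination in one sentence position could be mapped back to Hz with `absolute_f0cf` using the WF0R of the position where the word is to be inserted, which is the kind of reuse that the invariance result would justify.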