If a robot can communicate with humans through flexible conversation, it becomes more human-like and its field of application expands considerably.
This paper describes a speech input/output system designed for this purpose, implemented as the conversational subsystem of WABOT-2, a keyboard-playing robot.
The system is composed of two parts: a conversational speech understanding part, which recognizes continuously uttered sentences appropriately according to the situation, and a speech synthesis part, which prompts the human for speech input in a natural manner and so keeps the conversation smooth and natural.
To keep the task flexible, a phoneme-recognition-based system is adopted in the understanding part and synthesis by rule using CV syllables is adopted in the synthesis part. With these choices, the main problem in the understanding part is coarticulation, that is, the change of sounds according to the surrounding phonemic context, and the main problem in the synthesis part is the deterioration of the quality of the synthesized speech. In this study, the effective use of articulatory features is proposed to compensate for coarticulation. For the synthesis part, the sensitivity of the k-parameter to the formant frequency is taken into account.
Another important problem in constructing such a system is how to take the topic of the conversation into account. In this system, a network model with a path-weighting facility and a weight-control scheme based on production rules are proposed for sentence recognition. These methods make it possible to concentrate on the subset of the network that represents the sentences acceptable in the current situation, and to reach a unique interpretation even when a sentence is ambiguous and has multiple meanings.
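The idea of a weighted sentence network controlled by production rules can be illustrated with a minimal sketch. It is not the implementation used in WABOT-2: the class names (Arc, Rule, SentenceNetwork), the tags, the example words, the weight values, and the choice of multiplying arc weights along a path are all assumptions made for illustration. Each rule tests the current situation and, when it fires, sets the weight of every arc carrying a given tag; the interpretation of a word sequence is the matching path with the largest product of weights.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Arc:
    src: str             # node the arc leaves
    word: str            # word consumed when the arc is followed
    dst: str             # node the arc enters
    tag: str             # hypothetical label used by rules to group arcs
    weight: float = 1.0  # path weight; 0 disables the arc


@dataclass
class Rule:
    condition: Callable[[Dict], bool]  # tested against the current situation
    tag: str                           # arcs affected when the rule fires
    weight: float                      # weight assigned to those arcs


class SentenceNetwork:
    def __init__(self, arcs: List[Arc], start: str = "S", final: str = "E"):
        self.arcs, self.start, self.final = arcs, start, final

    def apply_rules(self, rules: List[Rule], situation: Dict) -> None:
        """Reset all weights, then let every rule that fires set its arcs."""
        for arc in self.arcs:
            arc.weight = 1.0
        for rule in rules:
            if rule.condition(situation):
                for arc in self.arcs:
                    if arc.tag == rule.tag:
                        arc.weight = rule.weight

    def paths(self, words: List[str]) -> List[Tuple[float, Tuple[str, ...]]]:
        """Return (weight, tags) for every start-to-final path spelling words."""
        found = []

        def walk(node, i, weight, tags):
            if i == len(words):
                if node == self.final:
                    found.append((weight, tags))
                return
            for arc in self.arcs:
                if arc.src == node and arc.word == words[i] and arc.weight > 0:
                    walk(arc.dst, i + 1, weight * arc.weight, tags + (arc.tag,))

        walk(self.start, 0, 1.0, ())
        return found

    def interpret(self, words: List[str]) -> Optional[Tuple[str, ...]]:
        """Pick the tags of the highest-weight path, or None if none matches."""
        candidates = self.paths(words)
        if not candidates:
            return None
        return max(candidates, key=lambda c: c[0])[1]


# "stop" is ambiguous: stop the performance, or end the dialogue.
net = SentenceNetwork([
    Arc("S", "please", "A", "common"),
    Arc("A", "stop", "E", "performance"),
    Arc("A", "stop", "E", "dialogue"),
])

RULES = [
    Rule(lambda s: s["playing"], "performance", 1.0),
    Rule(lambda s: s["playing"], "dialogue", 0.1),
    Rule(lambda s: not s["playing"], "performance", 0.1),
    Rule(lambda s: not s["playing"], "dialogue", 1.0),
]

net.apply_rules(RULES, {"playing": True})
while_playing = net.interpret(["please", "stop"])

net.apply_rules(RULES, {"playing": False})
while_idle = net.interpret(["please", "stop"])
```

In this sketch the same word sequence resolves to a different path depending on the situation, which is the effect attributed above to the weight-control scheme. A rule could also set a weight of 0 to remove sentences from the acceptable subset entirely.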
These algorithms are implemented on hardware with 15 microprocessors and run in real time.
With these results, a highly flexible conversational system is realized.