Abstract
To build a system that enables interactive voice training between the user and the system, the system must output a rich-sounding singing voice as the teacher sound. Because the user hears his own bone-transmitted sound as well as his air-transmitted sound when he utters, it is ideal to mix the bone-transmitted and air-transmitted sounds of the teacher sound and output them at the same mixing ratio at which the user actually hears his own singing voice. Here we present the results of an investigation into which ratio of air-transmitted to bone-transmitted sound matches the natural listening situation during utterance.