Host: The Japanese Society for Artificial Intelligence
Name: The 37th Annual Conference of the Japanese Society for Artificial Intelligence
Number: 37
Location: [in Japanese]
Date: June 06, 2023 - June 09, 2023
In recent years, communication through avatars has become popular and is expected to find a range of applications. However, operating an avatar can be burdensome, as it requires not only speech but also simultaneous control of facial, head, and hand motions. To reduce the burden on the operator, we propose Speech2motion, a model that automatically generates CG avatar motion from speech. In this work, we focus on motions during conversation, and the Speech2motion model uses LSTM-based neural networks to predict head motion and facial animation. We recorded 70 minutes of motion data along with the speech of one speaker during conversation, and trained the Speech2motion model on the recorded data. Experimental evaluation shows that the proposed model achieves a mean opinion score (MOS) of 3.07 for the naturalness of the generated motions.