Transactions of the Japanese Society for Artificial Intelligence
Online ISSN : 1346-8030
Print ISSN : 1346-0714
ISSN-L : 1346-0714
Original Paper
Automatic Generation of Speech-Accompanying Gestures Using a Bi-Directional LSTM Network
金子 直史, 竹内 健太, 長谷川 大, 白川 真一, 佐久田 博司, 鷲見 和彦
Journal: free access

2019, Volume 34, Issue 6, pp. C-J41_1-12

Abstract

We present a novel framework for automatic speech-driven generation of natural gesture motion. The proposed method consists of two steps. First, based on a Bi-Directional LSTM Network, our deep network learns speech–gesture relationships with both forward and backward consistency over long time spans. At each time step, the network regresses a full 3D skeletal pose of a human from perceptual features extracted from the input audio. Second, we apply combined temporal filters to smooth out the generated pose sequences. We train our network on a speech–gesture dataset recorded with a headset and a marker-based motion capture system. We evaluate different acoustic features, network architectures, and temporal filters to validate the effectiveness of the proposed approach. We also conduct a subjective evaluation comparing our approach against real human gestures. The results show that our generated gestures are comparable in naturalness to “original” human gestures and are rated significantly more natural than “mismatched” human gestures taken from a different utterance.
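The two-step pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' code: the feature dimension, joint count, hidden size, and the simple moving-average filter (standing in for the paper's combined temporal filters) are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class SpeechToGesture(nn.Module):
    """Step 1: a Bi-Directional LSTM regresses a 3D skeletal pose from
    per-frame acoustic features (sizes are illustrative assumptions)."""
    def __init__(self, n_audio_feats=26, n_joints=20, hidden=256):
        super().__init__()
        # The bidirectional LSTM sees both past and future speech context,
        # giving the forward and backward consistency the abstract mentions.
        self.lstm = nn.LSTM(n_audio_feats, hidden,
                            batch_first=True, bidirectional=True)
        # Regress (x, y, z) for every joint at each time step.
        self.out = nn.Linear(2 * hidden, 3 * n_joints)

    def forward(self, audio_feats):       # (batch, time, n_audio_feats)
        h, _ = self.lstm(audio_feats)     # (batch, time, 2 * hidden)
        return self.out(h)                # (batch, time, 3 * n_joints)

def smooth(poses, window=5):
    """Step 2: a moving-average temporal filter over the time axis
    (a stand-in for the paper's combined filters)."""
    kernel = torch.ones(1, 1, window) / window
    b, t, d = poses.shape
    x = poses.permute(0, 2, 1).reshape(b * d, 1, t)
    x = nn.functional.conv1d(x, kernel, padding=window // 2)
    return x.reshape(b, d, -1)[:, :, :t].permute(0, 2, 1)

model = SpeechToGesture()
audio = torch.randn(1, 100, 26)           # 100 frames of acoustic features
raw = model(audio)                        # (1, 100, 60) raw pose sequence
smoothed = smooth(raw)                    # same shape, temporally smoothed
print(smoothed.shape)                     # torch.Size([1, 100, 60])
```

At inference time, only the acoustic features of new speech are needed; the smoothing pass trades a small amount of responsiveness for the jitter-free motion evaluated in the paper.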

© 2019 The Japanese Society for Artificial Intelligence