2024 Volume 26 Issue 4 Pages 431-442
Sign Language Generation (SLG) is an emerging field in Sign Language (SL) communication technologies. SLG typically employs a Sequence-to-Sequence (Seq2Seq) model with SL glosses serving as intermediaries. While Seq2Seq is an optimal way of generating SL, it overlooks the fineness and smoothness of the in-between transitions that connect SL glosses. In this work, we address fine motion prediction and smooth transition generation in order to integrate glosses seamlessly into natural SL expression in SLG, a task we term Sign Language Inbetweening (SLI). For SLI, we propose Body Parts Integration (BPI) for fine motion prediction and a Variational Autoencoder-Generative Adversarial Network (VAE-GAN) generator with a progressive embedding for smooth transition generation over variable transition lengths. The proposed BPI-VAE-GAN model is evaluated on a motion capture dataset of Japanese Sign Language performed by a native signer. The assessment uses Mean Angular Error (MAE) for fine motion prediction and mean jerk for smooth transition generation. Results indicate that the proposed architecture significantly reduces MAE, achieving a 46% reduction relative to the GAN and considerably outperforming linear interpolation. Additionally, jerk analysis shows that the system generates smoother motion sequences than the GAN.
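The abstract does not give the exact formulas used for the two metrics or for the linear interpolation baseline, so the following is only a minimal sketch under stated assumptions: poses are lists of joint angles in degrees, MAE is taken as the mean absolute wrapped angle difference, and jerk is approximated with a third-order finite difference. The names `linear_inbetween`, `mean_angular_error`, and `mean_jerk` are hypothetical and are not taken from the paper.

```python
from typing import List

Pose = List[float]  # assumption: one joint angle in degrees per degree of freedom


def linear_inbetween(start: Pose, end: Pose, n_frames: int) -> List[Pose]:
    """Baseline: n_frames poses linearly interpolated between two key poses."""
    frames = []
    for k in range(1, n_frames + 1):
        t = k / (n_frames + 1)  # blend weight strictly between 0 and 1
        frames.append([a + t * (b - a) for a, b in zip(start, end)])
    return frames


def wrap_degrees(d: float) -> float:
    """Wrap an angle difference into [-180, 180)."""
    return (d + 180.0) % 360.0 - 180.0


def mean_angular_error(pred: List[Pose], target: List[Pose]) -> float:
    """Mean absolute wrapped angle difference over all frames and joints."""
    diffs = [
        abs(wrap_degrees(p - q))
        for frame_p, frame_t in zip(pred, target)
        for p, q in zip(frame_p, frame_t)
    ]
    return sum(diffs) / len(diffs)


def mean_jerk(motion: List[Pose], fps: float) -> float:
    """Mean absolute third time derivative, via a forward finite difference."""
    if len(motion) < 4:
        raise ValueError("need at least 4 frames to estimate jerk")
    dt = 1.0 / fps
    values = []
    for t in range(len(motion) - 3):
        for j in range(len(motion[0])):
            # x[t+3] - 3*x[t+2] + 3*x[t+1] - x[t] approximates x''' * dt^3
            d3 = (motion[t + 3][j] - 3 * motion[t + 2][j]
                  + 3 * motion[t + 1][j] - motion[t][j])
            values.append(abs(d3) / dt ** 3)
    return sum(values) / len(values)
```

Under these definitions, a transition produced by `linear_inbetween` has zero jerk within the transition itself, so a jerk comparison is most informative when the key frames on both sides of the transition are included in the evaluated sequence.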