Host: The Japanese Society for Artificial Intelligence
Name: The 38th Annual Conference of the Japanese Society for Artificial Intelligence
Number: 38
Location: [in Japanese]
Date: May 28, 2024 - May 31, 2024
When generating multi-dimensional time series data such as human behavior, generative models are typically trained with a spatial loss function such as the L1 loss. When a diffusion probabilistic model is applied to time series generation, it produces data by iterative denoising, so meaningful subtle vibrations in the data may be removed as if they were noise. In this study, we propose a loss function for training the diffusion model that incorporates both spatial and frequency information. In the proposed loss function, the original and generated data are projected onto the frequency domain, and the coherence between their frequency components is calculated. We apply the proposed loss function to train a diffusion model that generates human motion during dyadic conversation. The results suggest that using the frequency property improves the Fréchet Inception Distance.
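
The idea of combining a spatial term with a frequency-domain coherence term can be illustrated with a minimal NumPy sketch. The abstract does not specify how the coherence is estimated or how the two terms are combined, so everything below is an assumption made for illustration: the function names (`segment`, `coherence`, `spatial_frequency_loss`), the Welch-style averaging over overlapping Hann-windowed segments, the `1 - coherence` penalty, and the `weight` and `nperseg` parameters are all hypothetical and are not the authors' formulation.

```python
import numpy as np


def segment(x, nperseg, step):
    """Split a (T, D) series into overlapping segments of shape (n_seg, nperseg, D)."""
    starts = range(0, x.shape[0] - nperseg + 1, step)
    return np.stack([x[s:s + nperseg] for s in starts])


def coherence(x, y, nperseg=32, eps=1e-8):
    """Magnitude-squared coherence per frequency bin and channel.

    Assumption: spectra are averaged over 50%-overlapping Hann-windowed
    segments (Welch-style). Without averaging over several segments the
    coherence would be 1 everywhere and carry no information.
    """
    step = nperseg // 2
    window = np.hanning(nperseg)[None, :, None]
    X = np.fft.rfft(segment(x, nperseg, step) * window, axis=1)
    Y = np.fft.rfft(segment(y, nperseg, step) * window, axis=1)
    pxy = (X * np.conj(Y)).mean(axis=0)   # cross-spectral density estimate
    pxx = (np.abs(X) ** 2).mean(axis=0)   # power spectral density of x
    pyy = (np.abs(Y) ** 2).mean(axis=0)   # power spectral density of y
    return np.abs(pxy) ** 2 / (pxx * pyy + eps)


def spatial_frequency_loss(target, generated, weight=1.0, nperseg=32):
    """L1 (spatial) term plus a penalty for low coherence (frequency term).

    Assumption: the two terms are combined as a weighted sum.
    """
    l1 = np.abs(target - generated).mean()
    freq = 1.0 - coherence(target, generated, nperseg).mean()
    return l1 + weight * freq


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.standard_normal((128, 3))  # 128 frames, 3 hypothetical joint channels
    # A moving average mimics a generator that smooths away small vibrations.
    kernel = np.ones(5) / 5
    smoothed = np.stack(
        [np.convolve(target[:, d], kernel, mode="same") for d in range(3)], axis=1
    )
    print("identical:", spatial_frequency_loss(target, target))
    print("smoothed: ", spatial_frequency_loss(target, smoothed))
```

This sketch only evaluates the loss on fixed arrays. Using it to train a diffusion model would require expressing the same operations in an automatic differentiation framework so that gradients can flow through the Fourier transform.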