Abstract
This paper summarizes the Virtual Data Generation for Complex Industrial Activity Recognition Challenge, which explored virtual data generation techniques for improving human activity recognition (HAR) in complex industrial environments. The challenge used the OpenPack dataset, a large-scale multimodal collection of sensor data captured during real-world packaging operations. Participants were tasked with generating synthetic accelerometer data to augment the training of a baseline HAR model. Four teams from different countries proposed diverse approaches, including interpolation, classical augmentations, variational autoencoders, and GAN-based methods. Submissions were evaluated using the micro F1 score across multiple random seeds to test robustness. The results show that, while deep generative models offer strong potential, simpler signal-based techniques also perform competitively when they are well aligned with the structure of the data. In addition, incorporating finer-grained action labels within each operation can guide more realistic virtual data generation and improve HAR model performance by better capturing intra-operation dynamics. Based on these findings, we discuss key insights and suggest future directions for designing robust, semantically consistent, and computationally efficient virtual data generation pipelines for industrial HAR applications.
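To make the evaluation protocol concrete, the following is a minimal sketch of how a micro F1 score could be computed per run and then summarized across random seeds. It is not the challenge's evaluation script: the function names `micro_f1` and `summarize_over_seeds`, the label values, and the per-seed scores are all hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP, and FN over all classes, then compute F1."""
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for c in labels:
        for t, p in zip(y_true, y_pred):
            if p == c and t == c:
                tp += 1
            elif p == c and t != c:
                fp += 1
            elif p != c and t == c:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def summarize_over_seeds(scores_by_seed):
    """Return (mean, sample standard deviation) of per-seed scores."""
    values = list(scores_by_seed.values())
    spread = stdev(values) if len(values) > 1 else 0.0
    return mean(values), spread


# Hypothetical example: one run's predictions, and made-up per-seed scores.
score = micro_f1([0, 1, 2, 2], [0, 2, 2, 2])
avg, spread = summarize_over_seeds({0: 0.5, 1: 0.7})
```

For single-label multiclass prediction, the micro F1 score computed over all classes equals plain accuracy, so reporting its mean and spread over seeds mainly indicates how sensitive a method is to training randomness.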