2025 Volume 91 Issue 3 Pages 425-430
This paper presents an empirical study of continual pre-training that uses both synthetic (artificially generated) images and real images, and proposes an approach to model parameter initialization for continual pre-training. For the synthetic pre-training dataset, we use VisualAtom-1k, a dataset from formula-driven supervised learning; for the real-image pre-training dataset, we use the publicly available ImageNet-1k. We employ the Vision Transformer as the recognition model. Compared with pre-training on a single dataset, the proposed setting yields significant performance improvements. Furthermore, applying conditional model parameter initialization during continual pre-training brings further performance gains.