Since the introduction of the Vision Transformer, many Transformer-based networks have been proposed. Nakashima et al. showed that ViT and gMLP can be pre-trained on FractalDB and achieve accuracy comparable to pre-training on ImageNet-1k. We hypothesize that other Transformer networks may also benefit from pre-training on FractalDB. If this hypothesis holds, improvements to datasets based on formula-driven supervised learning (FDSL), such as FractalDB, can be expected to improve the accuracy of both existing networks and those proposed in the future. In this paper, we therefore conduct exhaustive experiments on pre-training representative Transformer networks on FractalDB.