Organizer: The Japanese Society for Artificial Intelligence
Conference: The 2023 (37th) Annual Conference of the Japanese Society for Artificial Intelligence
Edition: 37
Venue: Kumamoto-jo Hall + online
Dates: 2023/06/06 - 2023/06/09
Perceiver is a deep learning model that can be applied to a variety of modalities: it can process various forms of input and output, such as images, speech, and natural language, with the same architecture. However, Perceiver is computationally more expensive than other models, which makes it difficult to train in environments with limited fast parallel computing resources. In this study, we aim to reduce the computational cost so that the model can be trained in a short time outside of large-scale computing systems. To this end, we first show that speed-up methods proposed for the Transformer are also effective for Perceiver. In particular, the gated attention unit proposed in FLASH reduces computational complexity without sacrificing accuracy, and the resulting accelerated model achieves accuracy comparable to that of the original model in a limited computing environment. In experiments on the ImageNet image recognition task, we demonstrate that the proposed method reduces training time compared with conventional methods without a significant loss of accuracy. The resulting model can process arbitrary forms of input and output quickly in a low-cost computing environment.
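For readers unfamiliar with the gated attention unit (GAU) from FLASH, it replaces the usual multi-head self-attention plus feed-forward pair with a single, cheaper block that combines a gated linear unit with single-head, squared-ReLU attention. The following is a minimal PyTorch sketch of such a layer; the module name, the hyperparameters (expansion_factor, query_key_dim), and the omission of relative position bias are illustrative assumptions, not the exact configuration used in this work.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedAttentionUnit(nn.Module):
    """Minimal gated attention unit (GAU) sketch, after FLASH (Hua et al., 2022)."""

    def __init__(self, dim, expansion_factor=2, query_key_dim=128):
        super().__init__()
        hidden = dim * expansion_factor
        self.to_uv = nn.Linear(dim, hidden * 2)    # gate U and value V
        self.to_z = nn.Linear(dim, query_key_dim)  # shared base for queries and keys
        # per-dimension scale/offset that turn Z into Q and K
        self.q_scale = nn.Parameter(torch.ones(query_key_dim))
        self.q_offset = nn.Parameter(torch.zeros(query_key_dim))
        self.k_scale = nn.Parameter(torch.ones(query_key_dim))
        self.k_offset = nn.Parameter(torch.zeros(query_key_dim))
        self.to_out = nn.Linear(hidden, dim)

    def forward(self, x):                          # x: (batch, seq_len, dim)
        n = x.shape[1]
        u, v = F.silu(self.to_uv(x)).chunk(2, dim=-1)
        z = F.silu(self.to_z(x))
        q = z * self.q_scale + self.q_offset
        k = z * self.k_scale + self.k_offset
        # single-head attention with squared-ReLU scores, normalized by length
        attn = F.relu(torch.einsum('bnd,bmd->bnm', q, k) / n) ** 2
        out = u * torch.einsum('bnm,bme->bne', attn, v)  # gate the attended values
        return self.to_out(out)
```

Because the block uses a single head and a small query/key dimension instead of full multi-head attention with a separate feed-forward network, it requires fewer parameters and less computation per layer, which is the property exploited here to speed up training.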