Article ID: 2025PAP0003
This paper proposes a Field-Programmable Gate Array (FPGA) accelerator for Vision Transformers (ViTs) that combines quantization with look-up-table (LUT)-based operations. First, two improved quantization methods are proposed that achieve comparable accuracy at lower bit-widths. Second, linear and nonlinear unit designs are proposed to support the diverse operations in ViT models. Finally, the LUT-based accelerator is implemented and evaluated. Experimental results on the ImageNet dataset demonstrate that the proposed quantization method achieves an accuracy of 80.74% at 2-bit width, outperforming state-of-the-art ViT quantization methods by 0.1% to 0.5%. The proposed FPGA accelerator also demonstrates improved energy efficiency, achieving a peak of 7.06 FPS/W and 246 GOPS/W.
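To illustrate the two ingredients the abstract names, low-bit quantization and LUT-based evaluation of nonlinear operations, the following is a minimal sketch. It assumes generic symmetric uniform quantization and a linearly indexed GELU table; the function names, table size, and clipping range are illustrative choices, not the paper's actual quantization methods or accelerator design.

```python
import numpy as np

def quantize_uniform(x, num_bits=2):
    """Symmetric uniform quantization to a signed num_bits integer grid.

    Returns the integer codes and the scale needed to dequantize.
    (Illustrative only; the paper's improved methods are more involved.)
    """
    qmax = 2 ** (num_bits - 1) - 1          # e.g. +1 for 2-bit signed
    max_abs = np.max(np.abs(x))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q.astype(np.int8), scale

def dequantize(q, scale):
    """Map integer codes back to real values."""
    return q.astype(np.float32) * scale

def build_gelu_lut(num_entries=256, x_min=-8.0, x_max=8.0):
    """Precompute a GELU table (tanh approximation) over [x_min, x_max].

    On an FPGA such a table would live in on-chip memory so the
    nonlinearity costs one lookup instead of transcendental arithmetic.
    """
    xs = np.linspace(x_min, x_max, num_entries)
    return 0.5 * xs * (1.0 + np.tanh(np.sqrt(2.0 / np.pi)
                                     * (xs + 0.044715 * xs ** 3)))

def gelu_lut(x, lut, x_min=-8.0, x_max=8.0):
    """Evaluate GELU by indexing the precomputed table."""
    idx = ((x - x_min) / (x_max - x_min) * (len(lut) - 1)).astype(int)
    return lut[np.clip(idx, 0, len(lut) - 1)]
```

With 2 bits each value collapses to one of four codes, which is why recovering 80.74% ImageNet accuracy at that width requires the improved quantization methods the paper proposes rather than this plain uniform scheme.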