Automated feature map padding and transfer circuit for CNN inference

Hongying Zhang; Ming Chen; Mao Ni; Lan Chen; Yiheng Zhang; Xiaoran Hao

doi:10.1587/elex.21.20240559

LETTER

Keywords: hardware acceleration, CNN, feature map, FPGA

2024 Volume 21 Issue 22 Pages 20240559

DOI https://doi.org/10.1587/elex.21.20240559

Abstract

This paper introduces a novel hardware acceleration circuit that addresses the storage address offset problem arising when feature maps are padded in Convolutional Neural Networks (CNNs). Traditional CPU-based padding and data transfer are computationally intensive and incur high latency and power consumption, especially on edge devices. Our solution automates feature map padding and integrates it with data transfer, which significantly reduces DRAM accesses and speeds up the transfer of feature maps between DRAM and on-chip SRAM. Tested on the ZCU102 development board with the convolutional layers of YOLOv4-tiny, the proposed circuit achieves a speedup of over 20 times compared to CPU-based methods and more than 4 times compared to a CPU with DMA.
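
To make the address offset concrete, the following is a minimal software sketch of zero-padding a feature map stored in row-major order. It is an illustration only, not the proposed circuit: the function names `padded_offset` and `pad_feature_map`, the flat row-major layout, and the symmetric padding width are all assumptions made for this example. It shows why every element lands at a different linear address after padding, which is the bookkeeping a CPU would otherwise perform element by element.

```python
def padded_offset(row, col, width, pad):
    """Row-major offset of element (row, col) once `pad` zeros surround the map.

    Assumes symmetric padding on all four sides (an assumption of this sketch).
    """
    padded_width = width + 2 * pad
    return (row + pad) * padded_width + (col + pad)


def pad_feature_map(src, height, width, pad):
    """Copy a flat row-major feature map into a zero-initialized padded buffer."""
    padded_height = height + 2 * pad
    padded_width = width + 2 * pad
    dst = [0] * (padded_height * padded_width)
    for row in range(height):
        for col in range(width):
            # Source offset is row * width + col; the destination offset shifts
            # by `pad` rows and `pad` columns and uses the wider row stride.
            dst[padded_offset(row, col, width, pad)] = src[row * width + col]
    return dst


# A 2x3 feature map padded by one element on every side becomes 4x5.
src = [1, 2, 3, 4, 5, 6]
padded = pad_feature_map(src, height=2, width=3, pad=1)
```

In this sketch the element at source offset 0 moves to offset 6 in the padded buffer, because the row stride grows from 3 to 5 and one full row of zeros precedes it.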
