Efficient Hardware Accelerator for Compressed Sparse Deep Neural Network

Hao XIAO; Kaikai ZHAO; Guangzhu LIU

doi:10.1587/transinf.2020EDL8153

Abstract

This work presents a DNN accelerator architecture specifically designed for efficient inference on compressed and sparse DNN models. Leveraging data sparsity, a runtime processing scheme is proposed that operates on the encoded weights and activations directly in the compressed domain, without decompression. Furthermore, a new data flow is proposed to facilitate the reuse of input activations across the fully-connected (FC) layers. The proposed design is implemented and verified on a Xilinx Virtex-7 FPGA. Experimental results show that, running AlexNet, it is 1.99× and 1.95× faster and 20.38× and 3.04× more energy efficient than CPU and mGPU platforms, respectively.
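As a rough software illustration of computing an FC layer in the compressed domain, the sketch below multiplies a sparse weight matrix by a sparse activation vector without expanding either one. The abstract does not state the encoding or the data flow in detail, so everything here is an assumption made for illustration: the compressed sparse column (CSC) weight layout, the index/value activation format, and the hypothetical helper names `to_csc` and `sparse_fc`. It is not the authors' hardware design.

```python
def to_csc(dense):
    """Encode a dense matrix (list of rows) in compressed sparse column form.

    Assumed encoding for illustration only; the paper's format may differ.
    """
    n_rows = len(dense)
    n_cols = len(dense[0]) if n_rows else 0
    values, row_idx, col_ptr = [], [], [0]
    for j in range(n_cols):
        for i in range(n_rows):
            if dense[i][j] != 0:
                values.append(dense[i][j])
                row_idx.append(i)
        # col_ptr[j + 1] marks where column j ends in `values`.
        col_ptr.append(len(values))
    return values, row_idx, col_ptr, n_rows


def sparse_fc(csc, act_values, act_index):
    """Compute y = W @ x touching only nonzero weights and nonzero activations."""
    values, row_idx, col_ptr, n_rows = csc
    y = [0.0] * n_rows
    for a, j in zip(act_values, act_index):
        # Each nonzero activation is read once and reused for every
        # nonzero weight stored in column j.
        for k in range(col_ptr[j], col_ptr[j + 1]):
            y[row_idx[k]] += values[k] * a
    return y


# Small example: 3 outputs, 4 inputs, mostly zero.
W = [[0, 2, 0, 0],
     [1, 0, 0, 3],
     [0, 0, 0, 4]]
x = [5, 0, 0, 2]

# Keep only the nonzero activations and their positions.
act_index = [j for j, v in enumerate(x) if v != 0]
act_values = [x[j] for j in act_index]

result = sparse_fc(to_csc(W), act_values, act_index)
```

In this sketch, zero activations skip whole columns and zero weights are never stored, so the work scales with the number of nonzero pairs rather than with the dense matrix size.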
