2025 Volume 18 Pages 19-27
This study proposes a novel one-write eight-read (1W8R) 20T multiport static random-access memory (SRAM) for codebook quantization in deep-learning processors. We fabricated the memory in a 40 nm process and achieved a read-access time of 2.75 ns and a power consumption of 2.7 pJ/byte. Furthermore, we estimated the performance of the embedded super-multiport SRAM in the pipeline of a deep-learning processor. We used NVDLA, NVIDIA's deep-learning accelerator, as the reference architecture and simulated it using power figures measured from the fabricated memory. We estimated the power consumption when feeding a 4,094 × 2,048 (4K) image into the target model, a U-Net semantic segmentation model. The resulting power and area reductions were 20.24% and 26.24%, respectively.
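To make the role of the multiport SRAM concrete, the following is a minimal, hypothetical sketch of codebook quantization: weights are stored as small indices into a shared codebook, and each inference-time access is a table lookup, which is the kind of concurrent read a 1W8R memory can serve for eight lanes at once. The codebook values and array shapes here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def quantize(weights, codebook):
    """Map each weight to the index of its nearest codebook entry."""
    # Pairwise distances between every weight and every codebook entry.
    dists = np.abs(weights[:, None] - codebook[None, :])
    return np.argmin(dists, axis=1).astype(np.uint8)

def dequantize(indices, codebook):
    """Reconstruct approximate weights by codebook lookup
    (one lookup per read port in a multiport SRAM)."""
    return codebook[indices]

# Hypothetical 5-entry codebook and a few weights to quantize.
codebook = np.array([-0.5, -0.1, 0.0, 0.1, 0.5], dtype=np.float32)
weights = np.array([0.48, -0.07, 0.02, -0.45], dtype=np.float32)

idx = quantize(weights, codebook)      # compact indices stored in memory
recon = dequantize(idx, codebook)      # values recovered at read time
```

Storing 8-bit (or narrower) indices instead of full-precision weights is what shrinks memory traffic; the multiport design then lets several pipeline lanes dequantize from the same codebook in a single cycle.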