IPSJ Transactions on System and LSI Design Methodology
Online ISSN : 1882-6687
ISSN-L : 1882-6687
 
A Case Study for Improving Performances of Deep-Learning Processor with MRAM
Ryotaro Ohara, Atsushi Fukunaga, Masakazu Taichi, Masaya Kabuto, Riku Hamabe, Masato Ikegawa, Shintaro Izumi, Hiroshi Kawaguchi
JOURNAL FREE ACCESS

2024 Volume 17 Pages 7-15

Abstract

We investigated how much the performance of a deep-learning inference processor improves when its cache memory is changed from SRAM to spin-orbit torque magnetoresistive random-access memory (SOT-MRAM). SOT-MRAM provides twice the capacity of SRAM in the same area. This larger capacity is expected to reduce main-memory transfers without increasing the chip area, thereby reducing energy consumption. As a case study, we simulated the improvement obtained by replacing SRAM with MRAM in a deep-learning processor. The NVIDIA deep-learning accelerator (NVDLA) was used as the reference processor, and SegNet and U-Net were used as the target networks for a segmentation task. The image size was set to 512 × 1024 pixels. We evaluated the NVDLA with a 512-KB on-chip buffer and cache sizes of 1, 2, 4, and 8 MB, replacing both memories with MRAM implementations. When both the buffer and the cache were replaced with SOT-MRAM, the energy consumption and latency were reduced by 18.6% and 17.9%, respectively. In addition, the performance per unit area improved by more than 36.4%. In contrast, replacing SRAM with spin-transfer torque MRAM (STT-MRAM) is not suitable for inference devices, because its slow write operation significantly worsens latency.
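
To make the reported percentages easier to interpret, the following minimal sketch converts them into ratios relative to the SRAM baseline. It reads the 17.9% figure as a reduction in execution time, which is an assumption about the abstract's wording, and the function names are illustrative only. It is not the authors' evaluation method and does not attempt to reproduce the 36.4% performance-per-area figure.

```python
def remaining_fraction(reduction_pct: float) -> float:
    """Fraction of the baseline value left after a percentage reduction."""
    return 1.0 - reduction_pct / 100.0


def speedup_from_time_reduction(reduction_pct: float) -> float:
    """Throughput gain implied by a reduction in execution time.

    Assumes a fixed workload, so speedup = baseline_time / new_time.
    """
    return 1.0 / remaining_fraction(reduction_pct)


# Figures quoted in the abstract for the SOT-MRAM buffer + cache configuration.
ENERGY_REDUCTION_PCT = 18.6
TIME_REDUCTION_PCT = 17.9  # assumption: "speed reduced" means execution time reduced

energy_ratio = remaining_fraction(ENERGY_REDUCTION_PCT)
time_ratio = remaining_fraction(TIME_REDUCTION_PCT)
speedup = speedup_from_time_reduction(TIME_REDUCTION_PCT)

print(f"energy relative to SRAM baseline: {energy_ratio:.3f}")
print(f"execution time relative to SRAM baseline: {time_ratio:.3f}")
print(f"implied speedup: {speedup:.3f}x")
```

Under these assumptions, an 18.6% energy reduction leaves about 81% of the baseline energy, and a 17.9% time reduction corresponds to a speedup of roughly 1.22 times.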

© 2024 by the Information Processing Society of Japan