IPSJ Transactions on System and LSI Design Methodology
Online ISSN : 1882-6687
ISSN-L : 1882-6687
Current issue
Displaying 1-5 of 5 articles from this issue
 
  • Tohru Ishihara
    Article type: Editorial
    Subject area: Editorial
    2024 Volume 17 Pages 1
    Published: 2024
    Released on J-STAGE: February 28, 2024
    JOURNAL FREE ACCESS
    Download PDF (32K)
  • Tadahiro Kuroda
    Article type: Invited Paper
    2024 Volume 17 Pages 2-6
    Published: 2024
    Released on J-STAGE: February 28, 2024
    JOURNAL FREE ACCESS

    The continuous growth of the semiconductor industry, driven by the use of AI to fuse physical and virtual space, requires drastic improvements in IC power efficiency, memory capacity, and memory bandwidth. This paper describes two solutions for slashing IC power: 3D integration and specialized chips. It also proposes a novel sliced-bread memory stacking scheme that enables a more than 10-fold increase in the number of memory chips per stack, and hence in capacity and memory bandwidth. Furthermore, it elaborates on an agile development platform that enables designing chips like writing software and prototyping chips in days. The platform's ease of use and its 10-fold reduction in development time and cost are expected to democratize access to specialized chips, accelerating innovation by enlarging the developer base and speeding society's transition to the digital age. Finally, the author discusses the need for the global IC industry to move away from its reliance on competition toward co-existence and co-evolution in order to sustain its growth.

    Download PDF (765K)
  • Ryotaro Ohara, Atsushi Fukunaga, Masakazu Taichi, Masaya Kabuto, Riku ...
    Article type: Regular Paper
    Subject area: System LSI Design Methodology
    2024 Volume 17 Pages 7-15
    Published: 2024
    Released on J-STAGE: February 28, 2024
    JOURNAL FREE ACCESS

    We investigated the performance improvement obtained in a deep-learning inference processor by changing its cache memory from SRAM to spin-orbit-torque magnetoresistive random-access memory (SOT-MRAM). Implementing the cache in SOT-MRAM doubles its capacity in the same area compared with SRAM and is therefore expected to reduce main-memory transfers, and hence energy, without changing the chip area. As a case study, we simulated how much the performance of a deep-learning processor improves when its SRAM is replaced with MRAM. The NVIDIA Deep Learning Accelerator (NVDLA) was used as the target processor, and SegNet and U-Net were used as the target networks for a segmentation task with an image size of 512 × 1024 pixels. We evaluated the NVDLA with a 512-KB buffer and cache sizes of 1, 2, 4, and 8 MB as its on-chip memory, replacing these two memories with MRAM implementations. When both the buffer and the cache were replaced with SOT-MRAM, energy consumption and execution time were reduced by 18.6% and 17.9%, respectively, and performance per unit area improved by more than 36.4%. In contrast, replacing SRAM with spin-transfer-torque MRAM (STT-MRAM) is not suitable for inference devices, because its slow write operation significantly worsens latency. (A first-order model of this capacity-versus-access-cost trade-off is sketched after this entry.)

    Download PDF (1157K)
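    The result above rests on a simple trade-off: SOT-MRAM costs more energy per access (especially for writes) than SRAM, but doubling the on-chip capacity in the same area cuts costly main-memory transfers. The sketch below is a minimal first-order model of that trade-off; every constant (per-access energies, miss-rate law, write fraction, capacities) is an illustrative assumption, not a figure from the paper.

    ```python
    # First-order energy model for replacing an SRAM cache with SOT-MRAM of
    # twice the capacity in the same area. All values below are assumptions.

    SRAM_CAP_MB      = 2.0    # assumed SRAM cache capacity
    MRAM_CAP_MB      = 4.0    # SOT-MRAM: roughly 2x the capacity, same area
    E_SRAM_ACCESS_PJ = 10.0   # assumed energy per SRAM access (pJ)
    E_MRAM_READ_PJ   = 12.0   # assumed SOT-MRAM read energy (pJ)
    E_MRAM_WRITE_PJ  = 30.0   # assumed SOT-MRAM write energy (pJ)
    E_DRAM_ACCESS_PJ = 640.0  # assumed energy per main-memory access (pJ)
    WRITE_FRACTION   = 0.3    # assumed fraction of cache accesses that write

    def miss_rate(capacity_mb, alpha=0.5):
        """Assumed power-law (sqrt-rule) miss-rate model: m ~ C**(-alpha)."""
        return min(1.0, 0.10 * capacity_mb ** (-alpha))

    def energy_per_access(capacity_mb, e_read, e_write):
        """Average energy per cache access, including DRAM traffic on misses."""
        on_chip = (1.0 - WRITE_FRACTION) * e_read + WRITE_FRACTION * e_write
        return on_chip + miss_rate(capacity_mb) * E_DRAM_ACCESS_PJ

    sram = energy_per_access(SRAM_CAP_MB, E_SRAM_ACCESS_PJ, E_SRAM_ACCESS_PJ)
    mram = energy_per_access(MRAM_CAP_MB, E_MRAM_READ_PJ, E_MRAM_WRITE_PJ)
    print(f"SRAM cache    : {sram:6.1f} pJ/access (miss rate {miss_rate(SRAM_CAP_MB):.3f})")
    print(f"SOT-MRAM cache: {mram:6.1f} pJ/access (miss rate {miss_rate(MRAM_CAP_MB):.3f})")
    print(f"modelled energy reduction: {100.0 * (1.0 - mram / sram):.1f}%")
    ```

    Under these assumed numbers the larger but costlier MRAM cache still wins overall because the avoided DRAM accesses dominate; the paper's 18.6% figure comes from detailed NVDLA simulation, not from a model of this kind.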
  • Shota Nakabeppu, Nobuyuki Yamasaki
    Article type: Regular Paper
    Subject area: Architecture Design Methodology
    2024 Volume 17 Pages 16-35
    Published: 2024
    Released on J-STAGE: February 28, 2024
    JOURNAL FREE ACCESS

    A magnetic tunnel junction (MTJ) based non-volatile flip-flop (NVFF) is attractive for non-volatile power gating, which reduces power consumption, and for non-volatile checkpointing, which improves fault tolerance. An MTJ-based NVFF performs a store operation, which writes the slave-latch value into the MTJs (the non-volatile devices), and a restore operation, which writes the MTJ value back into the slave latch. A store operation, however, is stochastic: its success rate depends on the store duration, the NVFF's characteristics, the supply voltage, and the temperature. The success rate varies statically because process variation gives each NVFF on an actual chip different characteristics, and it varies dynamically because voltage and temperature change with the operating environment. Our goal is to reduce the energy consumption of checkpoint creation while ensuring its success. We propose a learning-based hardware scheme that dynamically finds appropriate store parameters to achieve this goal. The scheme consists of a machine-learning unit and an exploration unit: the machine-learning unit learns and predicts the store success rate from the store duration, voltage, and temperature, and the exploration unit searches the trained model for appropriate parameters. The evaluation shows that the proposed scheme achieves this goal. (A simplified behavioural sketch of the idea follows this entry.)

    Download PDF (13030K)
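    The scheme above can be pictured with the small behavioural sketch below: a logistic model stands in for the machine-learning unit (learning the store success rate from pulse duration, voltage, and temperature using observed store trials), and a simple sweep stands in for the exploration unit (finding the shortest, i.e. lowest-energy, store pulse that still meets a target success rate). The ground-truth success model, all constants, and the target value are illustrative assumptions, not the paper's hardware or parameters.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def true_success(duration_ns, vdd, temp_c):
        """Assumed ground-truth NVFF store behaviour (unknown to the scheme)."""
        z = 4.0 * (duration_ns - 3.0) + 6.0 * (vdd - 1.0) + 0.02 * (temp_c - 25.0)
        return 1.0 / (1.0 + np.exp(-z))

    # --- machine-learning unit: fit a logistic model to observed store trials ---
    n = 2000
    X = np.column_stack([
        rng.uniform(1.0, 6.0, n),    # store pulse duration (ns)
        rng.uniform(0.9, 1.1, n),    # supply voltage (V)
        rng.uniform(0.0, 85.0, n),   # temperature (deg C)
    ])
    y = (rng.random(n) < true_success(X[:, 0], X[:, 1], X[:, 2])).astype(float)

    mu, sd = X.mean(axis=0), X.std(axis=0)      # standardize for stable training
    Xb = np.column_stack([np.ones(n), (X - mu) / sd])
    w = np.zeros(4)
    for _ in range(20000):                      # plain batch gradient descent
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= 0.1 * Xb.T @ (p - y) / n

    def predicted_success(duration_ns, vdd, temp_c):
        z = (np.array([duration_ns, vdd, temp_c]) - mu) / sd
        return 1.0 / (1.0 + np.exp(-(w[0] + z @ w[1:])))

    # --- exploration unit: shortest pulse meeting the target success rate at
    #     the current voltage and temperature (shorter pulse ~ less energy) ---
    target, vdd_now, temp_now = 0.999, 1.0, 60.0
    feasible = [d for d in np.arange(1.0, 6.01, 0.05)
                if predicted_success(d, vdd_now, temp_now) >= target]
    if feasible:
        print(f"shortest store pulse meeting {target} success rate: {feasible[0]:.2f} ns")
    else:
        print("no pulse duration in the explored range meets the target")
    ```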
  • Takehiro Kitamura, Takashi Hisakado, Osami Wada, Mahfuzul Islam
    Article type: Regular Paper
    Subject area: Low Power Design Methodology
    2024 Volume 17 Pages 36-43
    Published: 2024
    Released on J-STAGE: February 28, 2024
    JOURNAL FREE ACCESS

    Statistical element selection has been proposed to address the offset-voltage variation problem in flash ADCs, and a calibration method based on order statistics has been proposed that performs this selection without measuring the offset voltages. This paper presents a design methodology for a flash ADC with such calibration using multiple comparator groups. We validate the proposal with measurement results from test chips fabricated in a commercial 65-nm general-purpose process. The measurements confirm that rank-based comparator selection achieves a reference-free ADC. Compared with the baseline ADC, which uses only one group of comparators, the ADC with three groups significantly improves the linearity and widens the input range at the same power consumption. Because no reference voltages or DACs are required, the proposed design will help realize low-power ADCs in advanced process nodes. (A small Monte-Carlo sketch of rank-based selection follows this entry.)

    Download PDF (1749K)
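    The rank-based selection described above can be illustrated with the Monte-Carlo sketch below: comparator offsets are drawn from an assumed Gaussian, only their rank order (which hardware can establish from comparisons alone, without measuring absolute offsets) is used to pick which comparators serve as the code thresholds, and the selected comparators' own offsets then act as the references. Pool size, offset sigma, and resolution are assumptions for illustration, not the paper's design values.

    ```python
    import numpy as np
    from math import erf, sqrt

    rng = np.random.default_rng(1)

    N_BITS   = 4                   # assumed ADC resolution for the sketch
    N_THRESH = 2**N_BITS - 1       # 15 code thresholds for a 4-bit flash ADC
    SIGMA_MV = 30.0                # assumed comparator offset std. deviation (mV)
    TARGETS  = np.linspace(-1.5 * SIGMA_MV, 1.5 * SIGMA_MV, N_THRESH)  # ideal ladder

    def normal_cdf(x):
        return 0.5 * (1.0 + erf(x / sqrt(2.0)))

    def select_by_rank(n_groups, comps_per_group=64):
        """Pick one comparator per code threshold from the pool, purely by rank."""
        offsets = np.sort(rng.normal(0.0, SIGMA_MV, n_groups * comps_per_group))
        m = offsets.size
        # map each ideal threshold to the rank whose quantile position matches it
        ranks = [min(m - 1, int(m * normal_cdf(t / SIGMA_MV))) for t in TARGETS]
        return offsets[ranks]          # these offsets become the ADC thresholds

    def worst_dnl(thresholds):
        """Worst deviation of the code widths from the ideal LSB, in LSB."""
        lsb = (thresholds[-1] - thresholds[0]) / (len(thresholds) - 1)
        return float(np.max(np.abs(np.diff(thresholds) / lsb - 1.0)))

    # One comparator per threshold is active in both cases (same-power analogue);
    # a larger pool (three groups) lets rank selection land closer to the ideal
    # uniform ladder, improving linearity.
    for groups in (1, 3):
        dnl = np.mean([worst_dnl(select_by_rank(groups)) for _ in range(500)])
        print(f"{groups} comparator group(s): mean worst-case DNL ~ {dnl:.2f} LSB")
    ```

    In this toy model the three-group pool shows visibly lower DNL than the single group for the same number of active comparators, which is the qualitative effect the measurements above report; the paper's actual selection and calibration circuitry is of course more involved.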