IEICE Transactions on Electronics
Online ISSN : 1745-1353
Print ISSN : 0916-8524
Volume E105.C, Issue 6
Special Section on Low-Power and High-Speed Chips
  • Fumio ARAKAWA, Makoto IKEDA
    2022 Volume E105.C Issue 6 Pages 207-208
    Published: June 01, 2022
    Released on J-STAGE: June 01, 2022
    JOURNAL FREE ACCESS
    Download PDF (166K)
  • Stanislav SEDUKHIN, Yoichi TOMIOKA, Kohei YAMAMOTO
    Article type: PAPER
    2022 Volume E105.C Issue 6 Pages 209-221
    Published: June 01, 2022
    Released on J-STAGE: June 01, 2022
    Advance online publication: December 03, 2021
    JOURNAL FREE ACCESS

    In this paper, starting from the algorithm, a performance- and energy-efficient 3D structure, or shape, of the Tensor Processing Engine (TPE) for CNN acceleration is systematically searched for and evaluated. An optimal accelerator shape maximizes the number of concurrent MAC operations per clock cycle while minimizing the number of redundant operations. The proposed 3D vector-parallel TPE architecture with an optimal shape can accelerate CNNs efficiently. Because the architecture supports inter-block image data independency, multiple such TPEs can be used together for additional CNN acceleration. Moreover, it is shown that the proposed TPE can be used uniformly to accelerate different CNN models such as VGG, ResNet, YOLO, and SSD. We also demonstrate that our theoretical efficiency analysis matches the results of a real implementation for an SSD model to which a state-of-the-art channel pruning technique is applied.

    Download PDF (1265K)
  • Kentaro KAWAKAMI, Kouji KURIHARA, Masafumi YAMAZAKI, Takumi HONDA, Nao ...
    Article type: PAPER
    2022 Volume E105.C Issue 6 Pages 222-231
    Published: June 01, 2022
    Released on J-STAGE: June 01, 2022
    Advance online publication: December 03, 2021
    JOURNAL FREE ACCESS

    To accelerate deep learning (DL) processes on the supercomputer Fugaku, the authors have ported and optimized oneDNN for Fugaku's CPU, the Fujitsu A64FX. oneDNN is an open-source DL processing library developed by Intel for the x86_64 architecture, whereas the A64FX CPU is based on the Armv8-A architecture. oneDNN dynamically creates the execution code for its computation kernels, which are implemented at the granularity of x86_64 instructions using Xbyak, the Just-In-Time (JIT) assembler for the x86_64 architecture. To port oneDNN to A64FX, these kernels must be rewritten into Armv8-A instructions using Xbyak_aarch64, the JIT assembler for the Armv8-A architecture. This is challenging because the source code to be rewritten exceeds several tens of thousands of lines. This study presents Xbyak_translator_aarch64, a binary translator that converts, at runtime, dynamically produced executable code for the x86_64 architecture into executable code for the Armv8-A architecture. Xbyak_translator_aarch64 eliminates the need to rewrite the source code and allows us to port oneDNN to A64FX quickly.

    Download PDF (976K)
  • Shunsuke TSUKADA, Hikaru TAKAYASHIKI, Masayuki SATO, Kazuhiko KOMATSU, ...
    Article type: PAPER
    2022 Volume E105.C Issue 6 Pages 232-243
    Published: June 01, 2022
    Released on J-STAGE: June 01, 2022
    Advance online publication: December 03, 2021
    JOURNAL FREE ACCESS

    A hybrid memory architecture (HMA) that consists of several distinct memory devices is expected to achieve a good balance between high performance and large capacity. Unlike conventional memory architectures, the HMA needs metadata for data management because data are migrated between the memory devices during the execution of an application. The memory controller caches the metadata to avoid accessing the memory devices for each metadata reference. However, because the amount of metadata increases in proportion to the size of the HMA, the memory controller must handle a large amount of metadata. As a result, the memory controller cannot cache all the metadata, and the number of metadata references to the memory devices increases. This increases the access latency to reach the target data and degrades performance. To solve this problem, this paper proposes a metadata prefetching mechanism for HMAs. The proposed mechanism loads the metadata needed in the near future by prefetching. Moreover, to increase the effect of the metadata prefetching, the proposed mechanism predicts the metadata to be used in the near future based on the address difference, that is, the difference between two consecutive access addresses. The evaluation results show that the proposed metadata prefetching mechanism can improve the instructions per cycle by up to 44%, and by 9% on average.

    Download PDF (969K)
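
The address-difference prediction described in the metadata prefetching abstract above can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the mechanism evaluated in the paper: the page size, the LRU replacement policy, the cache capacity, and the names `MetadataCache` and `simulate` are all hypothetical and chosen for illustration.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumption: one metadata entry per 4 KiB page


class MetadataCache:
    """Toy LRU cache of per-page metadata entries (hypothetical model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # page number -> placeholder metadata
        self.hits = 0
        self.misses = 0

    def _insert(self, page):
        self.entries[page] = True
        self.entries.move_to_end(page)
        # Evict least recently used entries when over capacity.
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)

    def access(self, page):
        """Demand reference to the metadata of `page`."""
        if page in self.entries:
            self.hits += 1
            self.entries.move_to_end(page)
        else:
            self.misses += 1
            self._insert(page)

    def prefetch(self, page):
        """Load metadata ahead of use; not counted as a hit or a miss."""
        if page >= 0 and page not in self.entries:
            self._insert(page)


def simulate(addresses, capacity=4, use_prefetch=True):
    """Return (hits, misses) for a sequence of byte addresses."""
    cache = MetadataCache(capacity)
    prev = None
    for addr in addresses:
        cache.access(addr // PAGE_SIZE)
        if use_prefetch and prev is not None:
            delta = addr - prev  # difference between two consecutive accesses
            if delta != 0:
                # Predict that the next access repeats the same difference.
                cache.prefetch((addr + delta) // PAGE_SIZE)
        prev = addr
    return cache.hits, cache.misses
```

For a constant-stride access pattern such as `[i * 8192 for i in range(10)]`, every access after the first two finds its metadata already cached when prefetching is enabled, whereas without prefetching every access misses. Irregular patterns would benefit less, which is why a real design would need a more careful predictor than this single-difference rule.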
Special Section on Progress & Trend of Superconductor-based Computers
  • Satoshi KOHJIRO
    2022 Volume E105.C Issue 6 Pages 244
    Published: June 01, 2022
    Released on J-STAGE: June 01, 2022
    JOURNAL FREE ACCESS
    Download PDF (90K)
  • Takahiro KAWAGUCHI, Naofumi TAKAGI
    Article type: INVITED PAPER
    2022 Volume E105.C Issue 6 Pages 245-250
    Published: June 01, 2022
    Released on J-STAGE: June 01, 2022
    Advance online publication: December 03, 2021
    JOURNAL FREE ACCESS

    A 32-bit arithmetic logic unit (ALU) is designed for a rapid single flux quantum (RSFQ) bit-parallel processor. In the ALU, clocked gates are partially replaced by clockless gates. This reduces the number of D flip-flops (DFFs) required for path balancing. The number of clocked gates, including DFFs, is reduced by approximately 40%, and the size of the clock distribution network is reduced. The number of pipeline stages becomes modest. The layout design of the ALU and simulation results show the effectiveness of using clockless gates in wide datapath circuits.

    Download PDF (609K)
  • Naoki TAKEUCHI, Taiki YAMAE, Christopher L. AYALA, Hideo SUZUKI, Nobuy ...
    Article type: INVITED PAPER
    2022 Volume E105.C Issue 6 Pages 251-263
    Published: June 01, 2022
    Released on J-STAGE: June 01, 2022
    Advance online publication: January 19, 2022
    JOURNAL FREE ACCESS

    The adiabatic quantum-flux-parametron (AQFP) is an energy-efficient superconductor logic element based on the quantum flux parametron. AQFP circuits can operate with energy dissipation near the thermodynamic and quantum limits by maximizing the energy efficiency of adiabatic switching. We have established the design methodology for AQFP logic and developed various energy-efficient systems using AQFP logic, such as a low-power microprocessor, reversible computer, single-photon image sensor, and stochastic electronics. We have thus demonstrated the feasibility of the wide application of AQFP logic in future information and communications technology. In this paper, we present a tutorial review on AQFP logic to provide insights into AQFP circuit technology as an introduction to this research field. We describe the historical background, operating principle, design methodology, and recent progress of AQFP logic.

    Download PDF (6759K)
  • Fumihiro CHINA, Naoki TAKEUCHI, Hideo SUZUKI, Yuki YAMANASHI, Hirotaka ...
    Article type: PAPER
    2022 Volume E105.C Issue 6 Pages 264-269
    Published: June 01, 2022
    Released on J-STAGE: June 01, 2022
    Advance online publication: December 03, 2021
    JOURNAL RESTRICTED ACCESS

    The adiabatic quantum flux parametron (AQFP) is an energy-efficient, high-speed superconducting logic device. To observe the tiny output currents from the AQFP in experiments, high-speed voltage drivers are indispensable. In the present study, we develop a compact voltage driver for AQFP logic based on a Josephson latching driver (JLD), which has been used as a high-speed driver for rapid single-flux-quantum (RSFQ) logic. In the JLD-based voltage driver, the signal currents of AQFP gates are converted into gap-voltage-level signals via an AQFP/RSFQ interface and a four-junction logic gate. Furthermore, this voltage driver includes only 15 Josephson junctions, far fewer than the 60 junctions in the previously designed driver based on dc superconducting quantum interference devices. In measurement, we successfully operate the JLD-based voltage driver up to 4 GHz. We also evaluate the bit error rate (BER) of the driver and find that the BER is 7.92×10⁻¹⁰ and 2.67×10⁻³ at 1 GHz and 4 GHz, respectively.

    Download PDF (1327K)
  • Tomoyuki TANAKA, Christopher L. AYALA, Nobuyuki YOSHIKAWA
    Article type: PAPER
    2022 Volume E105.C Issue 6 Pages 270-276
    Published: June 01, 2022
    Released on J-STAGE: June 01, 2022
    Advance online publication: January 19, 2022
    JOURNAL RESTRICTED ACCESS

    Extremely energy-efficient logic devices are required for future low-power high-performance computing systems. Superconductor electronic technology has a number of energy-efficient logic families. Among them is the adiabatic quantum-flux-parametron (AQFP) logic family, which adiabatically switches the quantum-flux-parametron (QFP) circuit when it is excited by an AC power-clock. When compared to state-of-the-art CMOS technology, AQFP logic circuits have the advantage of relatively fast clock rates (5 GHz to 10 GHz) and a 5 to 6 orders of magnitude reduction in energy before cooling overhead. We have been developing extremely energy-efficient computing processor components using the AQFP. The adder is the most basic computational unit and is important in the development of a processor. In this work, we designed and measured a 16-bit parallel prefix carry look-ahead Kogge-Stone adder (KSA). We fabricated the circuit using the AIST 10 kA/cm² High-speed STandard Process (HSTP). Due to a malfunction in the measurement system, we were not able to confirm the complete operation of the circuit at the low frequency of 100 kHz in liquid He, but we confirmed that the outputs that we did observe are correct for two types of tests: (1) critical tests and (2) 110 random input tests in total. The operation margin of the circuit is wide, and we did not observe any calculation errors during measurement.

    Download PDF (2428K)
  • Taiki YAMAE, Naoki TAKEUCHI, Nobuyuki YOSHIKAWA
    Article type: PAPER
    2022 Volume E105.C Issue 6 Pages 277-282
    Published: June 01, 2022
    Released on J-STAGE: June 01, 2022
    Advance online publication: January 19, 2022
    JOURNAL RESTRICTED ACCESS

    The adiabatic quantum-flux-parametron (AQFP) is an energy-efficient superconductor logic device. In a previous study, we proposed a low-latency clocking scheme called delay-line clocking, and several low-latency AQFP logic gates have been demonstrated. In delay-line clocking, the latency between adjacent excitation phases is determined by the propagation delay of excitation currents, and thus the rise time of excitation currents should be sufficiently small; otherwise, an AQFP gate can switch before the previous gate is fully excited. This means that delay-line clocking needs high clock frequencies, because typical excitation currents are sinusoidal and their rise time depends on the frequency. However, AQFP circuits need to be tested experimentally over a wide frequency range. Hence, in the present study, we investigate AQFP circuits adopting delay-line clocking with square excitation currents in order to apply delay-line clocking in a low frequency range. Square excitation currents have a shorter rise time than sinusoidal excitation currents and thus enable low-frequency operation. We demonstrate an AQFP buffer chain with delay-line clocking using square excitation currents, in which the latency is approximately 20 ps per gate, and confirm that the operating margin for the buffer chain is kept sufficiently wide at clock frequencies below 1 GHz, whereas in the sinusoidal case the operating margin shrinks below 500 MHz. These results indicate that AQFP circuits adopting delay-line clocking can operate in a low frequency range by using square excitation currents.

    Download PDF (2003K)
  • Tomohiro YAMAJI, Masayuki SHIRANE, Tsuyoshi YAMAMOTO
    Article type: INVITED PAPER
    2022 Volume E105.C Issue 6 Pages 283-289
    Published: June 01, 2022
    Released on J-STAGE: June 01, 2022
    Advance online publication: December 03, 2021
    JOURNAL FREE ACCESS

    A Josephson parametric oscillator (JPO) is an interesting system from the viewpoint of quantum optics because it has two stable self-oscillating states and can deterministically generate quantum cat states. A theoretical proposal has been made to operate a network of multiple JPOs as a quantum annealer, which can adiabatically solve combinatorial optimization problems at high speed. Proof-of-concept experiments have been actively conducted for application to quantum computation. This article provides a review of the mechanism of JPOs and their application as a quantum annealer.

    Download PDF (791K)
  • Shuhei TAMATE, Yutaka TABUCHI, Yasunobu NAKAMURA
    Article type: INVITED PAPER
    2022 Volume E105.C Issue 6 Pages 290-295
    Published: June 01, 2022
    Released on J-STAGE: June 01, 2022
    Advance online publication: December 03, 2021
    JOURNAL FREE ACCESS

    In this paper, we review the basic components of superconducting quantum computers. We mainly focus on the packaging and wiring technologies required to realize large-scale superconducting quantum computers.

    Download PDF (667K)
  • Kenta SATO, Naonori SEGA, Yuta SOMEI, Hiroshi SHIMADA, Takeshi ONOMI, ...
    Article type: BRIEF PAPER
    2022 Volume E105.C Issue 6 Pages 296-299
    Published: June 01, 2022
    Released on J-STAGE: June 01, 2022
    Advance online publication: January 19, 2022
    JOURNAL FREE ACCESS

    We experimentally evaluated random number sequences generated by a superconducting hardware random number generator composed of a Josephson-junction oscillator, a rapid-single-flux-quantum (RSFQ) toggle flip-flop (TFF), and an RSFQ AND gate. Test circuits were fabricated using a 10 kA/cm² Nb/AlOx/Nb integration process. Measurements were conducted in a liquid helium bath. The random numbers were generated at a trigger frequency of 500 kHz while the Josephson junction oscillated at 29 GHz. Twenty-six random number sequences, each 20 kb long, were evaluated for bias voltages between 2.0 and 2.7 mV. The NIST FIPS PUB 140-2 tests were used for the evaluation. Pass rates of 100% were confirmed at bias voltages of 2.5 and 2.6 mV. We found that the Monobit test limited the pass rates. As numerical simulations suggested, a detailed evaluation of the probability of obtaining “1” demonstrated a monotonic dependence on the bias voltage.

    Download PDF (479K)
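
The Monobit test named in the last abstract above is simple enough to sketch. The following is a minimal illustration, assuming the acceptance bounds from the statistical random number generator tests in the original FIPS PUB 140-2 text (a 20,000-bit sample passes when its count of ones X satisfies 9,725 < X < 10,275). The function name `monobit_test` is hypothetical, and this is not the authors' evaluation code.

```python
def monobit_test(bits):
    """Return True if a 20,000-bit sample passes the Monobit test.

    `bits` is a sequence of 0/1 integers. The bounds are the assumed
    FIPS PUB 140-2 values: 9725 < (number of ones) < 10275.
    """
    if len(bits) != 20000:
        raise ValueError("the Monobit test is defined on 20,000-bit samples")
    if any(b not in (0, 1) for b in bits):
        raise ValueError("bits must contain only 0 and 1")
    ones = sum(bits)
    return 9725 < ones < 10275
```

Because the test only bounds the count of ones, a generator whose probability of producing "1" drifts away from 0.5 fails it first, which is consistent with the abstract's observation that the Monobit test limited the pass rates as the bias voltage changed.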