IEICE Transactions on Electronics

Special Section on Low-Power and High-Speed Chips

FOREWORD

Fumio ARAKAWA, Makoto IKEDA

2020 Volume E103.C Issue 3 Pages 66-67
Published: March 01, 2020
Released on J-STAGE: March 01, 2020

DOI: https://doi.org/10.1587/transele.2019LHF0001

JOURNAL FREE ACCESS

An Accuracy-Configurable Adder for Low-Power Applications

Tongxin YANG, Toshinori SATO, Tomoaki UKEZONO

Article type: PAPER
2020 Volume E103.C Issue 3 Pages 68-76
Published: March 01, 2020
Released on J-STAGE: March 01, 2020

DOI: https://doi.org/10.1587/transele.2019LHP0002

JOURNAL RESTRICTED ACCESS

Addition is a fundamental function for many error-tolerant applications. Approximate addition is considered to be an efficient technique for trading off energy against performance and accuracy. This paper proposes a carry-maskable adder whose accuracy can be configured at runtime. The proposed scheme can dynamically select the length of the carry propagation to satisfy the quality requirements flexibly. Compared with a conventional ripple carry adder and a conventional carry look-ahead adder, the proposed 16-bit adder reduces the power consumption by 54.1% and 57.5%, respectively, and the critical path delay by 72.5% and 54.2%, respectively. In addition, results from an image processing application indicate that the quality of processed images can be controlled by the proposed adder. Evaluation results for a 32-bit length demonstrate that the proposed adder scales well.
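
As a rough illustration of the idea summarized in this abstract, and not the authors' circuit, the sketch below gives a behavioral Python model of an adder whose accuracy is set at call time. It assumes, purely for illustration, that the low `masked_bits` positions are computed as a carry-free bitwise OR while the remaining bits are added exactly; the name `carry_masked_add` and its parameters are hypothetical.

```python
def carry_masked_add(a: int, b: int, width: int = 16, masked_bits: int = 0) -> int:
    """Behavioral model of an adder whose low-order carries can be masked.

    Assumption (not taken from the paper): masked positions output a | b,
    so no carry is generated or propagated below bit `masked_bits`.
    """
    if not 0 <= masked_bits <= width:
        raise ValueError("masked_bits must be between 0 and width")
    full = (1 << width) - 1
    a &= full
    b &= full
    low_mask = (1 << masked_bits) - 1
    # Approximate part: carry-free OR of the low-order bits.
    low = (a | b) & low_mask
    # Exact part: ordinary addition of the remaining high-order bits.
    high = ((a >> masked_bits) + (b >> masked_bits)) << masked_bits
    # Wrap to `width` bits, as a fixed-width adder without carry-out would.
    return (high | low) & full


if __name__ == "__main__":
    x, y = 0x00FF, 0x0001
    for k in (0, 4, 8):
        approx = carry_masked_add(x, y, masked_bits=k)
        print(f"masked_bits={k}: sum={approx:#06x}, error={(x + y) - approx}")
```

In this model, `masked_bits=0` reduces to exact fixed-width addition, and larger values trade low-order accuracy for a shorter carry chain.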

Low Delay 4K 120fps HEVC Decoder with Parallel Processing Architecture

Ken NAKAMURA, Daisuke KOBAYASHI, Yuya OMORI, Tatsuya OSAWA, Takayuki O ...

Article type: PAPER
2020 Volume E103.C Issue 3 Pages 77-84
Published: March 01, 2020
Released on J-STAGE: March 01, 2020

DOI: https://doi.org/10.1587/transele.2019LHP0005

JOURNAL RESTRICTED ACCESS

In this paper, we describe a novel low-delay 4K 120-fps real-time HEVC decoder with a parallel processing architecture that conforms to the HEVC Main 4:2:2 10 profile. It supports the hierarchical temporal scalable streams required for Ultra High Definition high-frame-rate broadcasting and also supports low-delay, high-bitrate decoding for video transmission applications. To achieve this, the decoding processes are parallelized and pipelined at the frame level, slice level, and coding tree unit row level. The proposed decoder was implemented on three FPGAs operated at 133 and 150 MHz, and it achieved 300-Mbps stream decoding and a 37-ms end-to-end delay with our concurrently developed 4K 120-fps encoder.

Compiler Software Coherent Control for Embedded High Performance Multicore

Boma A. ADHI, Tomoya KASHIMATA, Ken TAKAHASHI, Keiji KIMURA, Hironori ...

Article type: PAPER
2020 Volume E103.C Issue 3 Pages 85-97
Published: March 01, 2020
Released on J-STAGE: March 01, 2020

DOI: https://doi.org/10.1587/transele.2019LHP0008

JOURNAL RESTRICTED ACCESS

Advances in multicore technology have made processors with hundreds or even thousands of cores on a single chip possible. However, on larger-scale multicores, a hardware-based cache coherence mechanism becomes overwhelmingly complicated, hot, and expensive. Therefore, we propose a software coherence scheme managed by a parallelizing compiler for shared-memory multicore systems without a hardware cache coherence mechanism. Our proposed method is simple and efficient. It is built into the OSCAR automatic parallelizing compiler. The OSCAR compiler parallelizes coarse-grain tasks, analyzes stale data and line sharing in the program, and then solves those problems by simple program restructuring and data synchronization. Using our proposed method, we compiled 10 benchmark programs from SPEC2000, SPEC2006, NAS Parallel Benchmark (NPB), and MediaBench II. The compiled binaries were then run on the Renesas RP2, an 8-core SH-4A processor, and on a custom 8-core Altera Nios II system on an Altera Arria 10 FPGA. The cache coherence hardware on the RP2 processor is only available for up to 4 cores. The RP2's cache coherence hardware can also be turned off for non-coherent cache mode. The Nios II multicore system does not have any hardware cache coherence mechanism; therefore, running a parallel program is difficult without compiler support. The proposed method performed as well as or better than the hardware cache coherence scheme while still providing the same correct results as the hardware coherence mechanism. This method allows a massive array of shared-memory CPU cores in an HPC setting or a simple non-coherent multicore embedded CPU to be easily programmed. For example, on the RP2 processor, the proposed software-controlled non-coherent-cache (NCC) method gave a 2.6-times speedup over sequential execution for SPEC 2000 “equake” with 4 cores, whereas 4-core MESI hardware coherent control gave only a 2.5-times speedup. Also, the software coherence control gave a 4.4-times speedup for 8 cores, where no hardware coherence mechanism is available.

Local Memory Mapping of Multicore Processors on an Automatic Parallelizing Compiler

Yoshitake OKI, Yuto ABE, Kazuki YAMAMOTO, Kohei YAMAMOTO, Tomoya SHIRA ...

Article type: PAPER
2020 Volume E103.C Issue 3 Pages 98-109
Published: March 01, 2020
Released on J-STAGE: March 01, 2020

DOI: https://doi.org/10.1587/transele.2019LHP0010

JOURNAL RESTRICTED ACCESS

Utilizing local memory in multi-core processors, from real-time embedded systems to high-performance systems, has become an important factor in satisfying hard deadline constraints. However, challenges lie in efficiently managing the memory hierarchy, such as decomposing large data into small blocks that fit into local memory and transferring blocks for reuse and replacement. To address this issue, this paper presents a compiler optimization method that automatically manages the local memory of multi-core processors. The method selects and maps multi-dimensional data onto software-specified memory blocks called Adjustable Blocks. These blocks are hierarchically divisible, with varying sizes defined by the features of the input application. Moreover, the method introduces mapping structures called Template Arrays to maintain the indices of the decomposed multi-dimensional data. The proposed method is implemented in the OSCAR automatic parallelizing compiler, and evaluations were performed on the Renesas RP2 8-core processor. Experimental results from NAS Parallel Benchmark, SPEC benchmark, and multimedia applications show the effectiveness of the method, with a maximum speed-up of 20.44 times on 8 cores using local memory relative to single-core sequential versions that use off-chip memory.

Regular Section

Analysis of Antenna Performance Degradation due to Coupled Electromagnetic Interference from Nearby Circuits

Hosang LEE, Jawad YOUSAF, Kwangho KIM, Seongjin MUN, Chanseok HWANG, W ...

Article type: PAPER
Subject area: Electromagnetic Theory
2020 Volume E103.C Issue 3 Pages 110-118
Published: March 01, 2020
Released on J-STAGE: March 01, 2020
Advance online publication: August 27, 2019

DOI: https://doi.org/10.1587/transele.2019ECP5019

JOURNAL RESTRICTED ACCESS

This paper analyzes and compares two methods for estimating, at the circuit design stage, the electromagnetically coupled noise introduced to an antenna by nearby circuits. One method estimates the power spectrum and the other estimates the active S₁₁ parameter at the victim antenna; both use simulated standard S-parameters for the electromagnetic coupling in the circuit. They also need the assumed or measured excitation of the noise sources. To confirm the validity of the two methods, an evaluation board consisting of an antenna and noise sources was designed and fabricated, on which voltage-controlled oscillator (VCO) chips are placed as noise sources. The generated electromagnetic noise is transferred to the antenna via loop-shaped transmission lines, degrading the performance of the antenna. In this paper, detailed analysis procedures are described using the evaluation board, and it is shown that the two methods are equivalent to each other in terms of the induced voltages in the antenna. Finally, a procedure to estimate antenna performance degradation at the design stage is summarized.

Prediction of DC-AC Converter Efficiency Degradation due to Device Aging Using a Compact MOSFET-Aging Model

Kenshiro SATO, Dondee NAVARRO, Shinya SEKIZAKI, Yoshifumi ZOKA, Naoto ...

Article type: PAPER
Subject area: Semiconductor Materials and Devices
2020 Volume E103.C Issue 3 Pages 119-126
Published: March 01, 2020
Released on J-STAGE: March 01, 2020
Advance online publication: September 02, 2019

DOI: https://doi.org/10.1587/transele.2019ECP5010

JOURNAL RESTRICTED ACCESS

The degradation of the efficiency of a SiC-MOSFET-based DC-AC converter circuit due to aging of the electrically active devices is investigated. The newly developed compact aging model HiSIM_HSiC for high-voltage SiC-MOSFETs is used in the investigation. The model explicitly considers the carrier-trap-density increase in the solution of the Poisson equation. Measured converter characteristics during a 3-phase line-to-ground (3LG) fault are correctly reproduced by the model. It is verified that the MOSFETs experience additional stress due to the high biases occurring during the fault event, which translates to severe MOSFET aging. Simulation results predict a 0.5% reduction of converter efficiency due to a single 70-ms 3LG fault, which is equivalent to a year of operation under normal conditions, where no additional stress is applied. With the developed compact model, prediction of the efficiency degradation of the converter circuit under prolonged stress, for which measurements are difficult to obtain and typically not available, is also feasible.

Range Points Migration Based Spectroscopic Imaging Algorithm for Wide-Beam Terahertz Subsurface Sensor

Takamaru MATSUI, Shouhei KIDERA

Article type: BRIEF PAPER
Subject area: Electromagnetic Theory
2020 Volume E103.C Issue 3 Pages 127-130
Published: March 01, 2020
Released on J-STAGE: March 01, 2020
Advance online publication: September 25, 2019

DOI: https://doi.org/10.1587/transele.2019ECS6005

JOURNAL FREE ACCESS

Here, we present a novel spectroscopic imaging method based on the boundary-extraction scheme for wide-beam terahertz (THz) three-dimensional imaging. Optical-lens-focusing systems for THz subsurface imaging generally require the depth of the object from the surface to be input beforehand to achieve the desired azimuth resolution. This limitation can be alleviated by incorporating a wide-beam THz transmitter into the synthetic aperture to automatically change the focusing depth in the post-signal processing. The range point migration (RPM) method has been demonstrated to have significant advantages in terms of imaging accuracy over the synthetic-aperture method. Moreover, in the RPM scheme, spectroscopic information can be easily associated with each scattering center. Thus, we propose an RPM-based terahertz spectroscopic imaging method. The finite-difference time-domain-based numerical analysis shows that the proposed algorithm provides accurate target boundary imaging associated with each frequency-dependent characteristic.
