IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Online ISSN : 1745-1337
Print ISSN : 0916-8508
Volume E99.A , Issue 7
Special Section on Design Methodologies for System on a Chip
• Masahiro FUKUI
2016 Volume E99.A Issue 7 Pages 1277
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
• Koki IGAWA, Masao YANAGISAWA, Nozomu TOGAWA
Type: PAPER
2016 Volume E99.A Issue 7 Pages 1278-1293
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
In order to tackle a process-variation problem, we can define several scenarios, each of which corresponds to a particular LSI behavior, such as a typical-case scenario and a worst-case scenario. By designing a single LSI chip which realizes multiple scenarios simultaneously, we can have a process-variation-tolerant LSI chip. In this paper, we propose a multi-scenario high-level synthesis algorithm for variation-tolerant floorplan-driven design targeting new distributed-register architectures, called HDR architectures. We assume two scenarios, a typical-case scenario and a worst-case scenario, and realize them onto a single chip. We first schedule/bind each of the scenarios independently. After that, we commonize the scheduling/binding results for the typical-case and worst-case scenarios and thus generate a commonized area-minimized floorplan result. At that time, we can explicitly take into account interconnection delays by using distributed-register architectures. Experimental results show that our algorithm reduces the latency of the typical-case scenario by up to 50% without increasing the latency of the worst-case scenario, compared with several existing methods.
• Koichi FUJIWARA, Kazushi KAWAMURA, Masao YANAGISAWA, Nozomu TOGAWA
Type: PAPER
2016 Volume E99.A Issue 7 Pages 1294-1310
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
Recently, high-level synthesis techniques for FPGA designs (FPGA-HLS techniques) have been in strong demand in various applications. Both interconnection delays and clock skews have a large impact on the performance of circuits implemented on FPGAs, which indicates the need for floorplan-driven FPGA-HLS algorithms that consider them. To estimate interconnection delays and clock skews appropriately at the HLS phase, a reasonable estimation model is essential. In this paper, we conduct several experiments to characterize interconnection delays and clock skews in FPGAs and propose novel estimation models called “IDEF” and “CSEF”. To evaluate our models, we integrate them into a conventional floorplan-driven FPGA-HLS algorithm. Experimental results demonstrate that our algorithm can realize FPGA designs which reduce the latency by up to 22% compared with conventional approaches.
• Junghoon OH, Mineo KANEKO
Type: PAPER
2016 Volume E99.A Issue 7 Pages 1311-1322
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
As semiconductor technologies have advanced, the reliability problem caused by soft-errors is becoming one of the serious issues in LSIs. Moreover, multiple component errors due to single soft-errors also have become a serious problem. In this paper, we propose a method to synthesize multiple component soft-error tolerant application-specific datapaths via high-level synthesis. The novel feature of our method is speculative resource sharing between the retry parts and the secondary parts for time overhead mitigation. A scheduling algorithm using a special priority function to maximize speculative resource sharing is also an important feature of this study. Our approach can reduce the latency (schedule length) in many applications without deterioration of reliability and chip area compared with conventional datapaths without speculative resource sharing. We also found that our method is more effective when a computation algorithm possesses higher parallelism and a smaller number of resources is available.
• Takeshi SUGAWARA, Daisuke SUZUKI, Minoru SAEKI
Type: PAPER
2016 Volume E99.A Issue 7 Pages 1323-1333
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
The single-shot collision attack on RSA proposed by Hanley et al. is studied, focusing on the difference between the two operands of the multiplier. It is shown how leakage from an integer multiplier and a long-integer multiplication algorithm can be asymmetric between the two operands. The asymmetric leakage is verified with experiments on FPGA and micro-controller platforms. Moreover, we show an experimental result in which the success or failure of the attack is determined by the order of operands. Therefore, designing the operand order can be a cost-effective countermeasure. Meanwhile, we also show a case in which a particular countermeasure becomes ineffective when the asymmetric leakage is considered. In addition to the above main contribution, an extension of the attack by Hanley et al. using the signal-processing technique of the Big Mac Attack is presented.
• Lian ZENG, Tieyuan PAN, Xin JIANG, Takahiro WATANABE
Type: PAPER
2016 Volume E99.A Issue 7 Pages 1334-1344
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
As semiconductor technology continues to develop, hundreds of cores will be deployed on a single die in future chip-multiprocessor (CMP) designs. Three-dimensional networks-on-chip (3D NoCs) have become an attractive solution which can provide high performance. An efficient and deadlock-free routing algorithm is critical to achieving high network-on-chip performance. Traditional methods based on deterministic routing and the turn model are deadlock-free, but they are unable to distribute the traffic loads over the network. In this paper, we propose an efficient, adaptive and deadlock-free algorithm (EAR) based on a novel routing selection strategy in 3D NoCs, which can distribute the traffic loads not only within layers but also between layers according to congestion information and path diversity. Simulation results show that the proposed method achieves a significant performance improvement compared with other methods.
• Tieyuan PAN, Li ZHU, Lian ZENG, Takahiro WATANABE, Yasuhiro TAKASHIMA
Type: PAPER
2016 Volume E99.A Issue 7 Pages 1345-1354
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
Recently, due to the development of design and manufacturing technologies for VLSI systems, embedded systems have become more and more complex. Consequently, not only chip performance but also the flexibility and dynamic adaptation of the implemented systems are required. To meet these requirements, partially reconfigurable devices are promising. In this paper, we propose an efficient data structure to manage the reconfigurable units. Then, on the assumption that each task utilizes rectangular resources, a very simple MER enumeration algorithm based on this data structure is proposed. By utilizing the result of MER enumeration, the free space on the reconfigurable device can be used effectively. We analyze the complexity of the proposed algorithm and confirm its efficiency by experiments.
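The abstract does not describe the proposed data structure, so the following is only a brute-force reference sketch, assuming that MER stands for maximal empty rectangle and that the device is modeled as a grid of reconfigurable units. The function name and the grid representation are hypothetical and are not taken from the paper.

```python
def maximal_empty_rectangles(grid):
    """Brute-force enumeration of maximal empty rectangles.

    grid[r][c] is 1 for an occupied unit and 0 for a free one.
    Returns a list of (top, left, bottom, right) tuples with inclusive bounds.
    This is a reference baseline, not the paper's algorithm.
    """
    rows, cols = len(grid), len(grid[0])
    # 2D prefix sums of occupied units, so emptiness checks take O(1).
    pre = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            pre[r + 1][c + 1] = (grid[r][c] + pre[r][c + 1]
                                 + pre[r + 1][c] - pre[r][c])

    def empty(t, l, b, r):
        # Rectangles that leave the device are never considered empty.
        if t < 0 or l < 0 or b >= rows or r >= cols:
            return False
        return pre[b + 1][r + 1] - pre[t][r + 1] - pre[b + 1][l] + pre[t][l] == 0

    result = []
    for t in range(rows):
        for b in range(t, rows):
            for l in range(cols):
                for r in range(l, cols):
                    if not empty(t, l, b, r):
                        continue
                    # Maximal means it cannot grow by one unit in any direction.
                    can_grow = (empty(t - 1, l, b, r) or empty(t, l - 1, b, r)
                                or empty(t, l, b + 1, r) or empty(t, l, b, r + 1))
                    if not can_grow:
                        result.append((t, l, b, r))
    return result


# A 2x3 device with one occupied unit in the top-right corner.
device = [[0, 0, 1],
          [0, 0, 0]]
mers = maximal_empty_rectangles(device)
```

A task with a rectangular footprint fits on the device exactly when it fits inside at least one of the returned rectangles, which is why an up-to-date MER list is useful for placement.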
• Amir Masoud GHAREHBAGHI, Masahiro FUJITA
Type: PAPER
2016 Volume E99.A Issue 7 Pages 1355-1365
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
This paper presents a new approach for circuit matching using signatures. We define a signature based on the topology of the fanin cones of the circuit elements. Given two circuits, we first find all the circuit elements whose signature is unique between the two input circuits. After that, we try to expand the matching area as much as possible using our expansion rules. We iteratively find the unique matches and expand the matching area until no further matching is possible. Our experiments on the IWLS2005 benchmark suite show that our method is able to find the perfect matching between two 160,000-gate IPs in 5 minutes. In addition, our method is more than one order of magnitude faster than our previous signature-based matching method, while the size of the matched area is comparable or larger.
• Junki KAWAGUCHI, Hayato MASHIKO, Yukihide KOHIRA
Type: PAPER
2016 Volume E99.A Issue 7 Pages 1366-1373
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
In the general-synchronous framework, in which the clock is distributed periodically to each register but not necessarily simultaneously, circuit performance is expected to be improved compared to the complete-synchronous framework, in which the clock is distributed periodically and simultaneously to each register. To improve circuit performance further, logic synthesis for the general-synchronous framework is required. In this paper, under the assumption that any clock schedule is realized by an ideal clock distribution circuit, we propose a technology mapping method for the case where two or more cell libraries are available; it assigns a cell to each gate in the given logic circuit by using integer linear programming. Experiments show the effectiveness of the proposed technology mapping method.
• Yusuke MATSUNAGA
Type: PAPER
2016 Volume E99.A Issue 7 Pages 1374-1380
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
This paper describes two speed-up techniques for Boolean matching of LUT-based circuits. One is a one-hot encoding technique for the variables representing input assignments. Though it requires more variables than the existing binary encoding technique, almost all clauses added by one-hot encoding are binary clauses, which are suitable for efficient Boolean constraint propagation. The other is a CEGAR (counterexample-guided abstraction refinement) technique, which reduces the CPU time significantly. With both techniques, we can solve Boolean matching problems for 9-input functions in 20 milliseconds on average, which is more than one order of magnitude faster than existing algorithms.
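The remark about binary clauses can be illustrated with the standard pairwise "exactly one" encoding. This is a minimal sketch under assumptions: the abstract does not give the paper's exact clause set, and the variable numbering and the `one_hot_assignment` helper are hypothetical. Literals follow the DIMACS convention (positive integer for a variable, negative for its negation).

```python
from itertools import combinations


def exactly_one(variables):
    """CNF clauses forcing exactly one of the given variables to be true."""
    clauses = [list(variables)]              # at least one is true
    for a, b in combinations(variables, 2):
        clauses.append([-a, -b])             # no two are true (binary clause)
    return clauses


def one_hot_assignment(num_pins, num_sources):
    """Hypothetical encoding: var(p, s) is true when pin p is driven by source s."""
    def var(p, s):
        return p * num_sources + s + 1

    clauses = []
    for p in range(num_pins):
        clauses.extend(exactly_one([var(p, s) for s in range(num_sources)]))
    return clauses


clauses = one_hot_assignment(num_pins=2, num_sources=3)
binary = sum(1 for c in clauses if len(c) == 2)
print(len(clauses), binary)
```

In this small instance 6 of the 8 clauses are binary. A binary clause becomes a unit clause as soon as one of its literals is falsified, which is what makes such clauses cheap for Boolean constraint propagation.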
• Takuya HIRATA, Ryuta NISHINO, Shigetoshi NAKATAKE, Masaya SHIMOYAMA, M ...
Type: PAPER
2016 Volume E99.A Issue 7 Pages 1381-1389
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
This paper addresses layout-dependent manufacturability of analog integrated circuits. We focus on the relative variability of an input op-amp pair used in an instrumentation amplifier (in-amp). We propose a subblock-level matching layout style in which subblocks of the op-amp pair are superimposed, aiming to suppress the layout-dependent relative variability. We fabricate chips according to five superposed layout styles and evaluate the relative variability in terms of the DC offset, thereby identifying the most effective layout style. In addition, we provide a manufacturability simulation methodology that evaluates the in-amp considering the relative variability of the op-amp pair, based on the measurement results. Comparing the simulation results with the performance of the fabricated in-amps, we conclude that our methodology can evaluate the layout dependency of manufacturability by simulation.
• Hiromitsu AWANO, Masayuki HIROMOTO, Takashi SATO
Type: PAPER
2016 Volume E99.A Issue 7 Pages 1390-1399
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
An efficient Monte Carlo (MC) method for the calculation of failure probability degradation of an SRAM cell due to negative bias temperature instability (NBTI) is proposed. In the proposed method, a particle filter is utilized to incrementally track temporal performance changes in an SRAM cell. The number of simulations required to obtain a stable particle distribution is greatly reduced by reusing the final distribution of the particles in the last time step as the initial distribution. Combined with a binary classifier that quickly judges whether an MC sample causes a malfunction of the cell, the method significantly reduces the total number of simulations needed to capture the temporal change of failure probability. The proposed method achieves a 13.4× speed-up over the state-of-the-art method.
• Song BIAN, Michihiro SHINTANI, Masayuki HIROMOTO, Takashi SATO
Type: PAPER
2016 Volume E99.A Issue 7 Pages 1400-1409
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
As technology further scales semiconductor devices, aging-induced device degradation has become one of the major threats to device reliability. Hence, taking aging-induced degradation into account during the design phase can greatly improve the reliability of the manufactured devices. However, accurately estimating the aging effect for extremely large circuits, like processors, is time-consuming. In this research, we focus on negative bias temperature instability (NBTI) as the aging-induced degradation mechanism, and propose a fast and efficient way of estimating NBTI-induced delay degradation by utilizing static timing analysis (STA) and a simulation-based lookup table (LUT). We model each gate type at different degradation levels, load capacitances and input slews. Using these gate-delay models, path delays of arbitrary circuits can be efficiently estimated. With a typical five-stage pipelined processor as the design target, by comparing the delay calculated from the LUT with the reference delay calculated by a commercial circuit simulator, we achieve a 4114-times speedup within 5.6% delay error.
• Koki ITO, Kazushi KAWAMURA, Yutaka TAMIYA, Masao YANAGISAWA, Nozomu TO ...
Type: LETTER
2016 Volume E99.A Issue 7 Pages 1410-1414
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
An (M,N)-field-data extractor reads out any consecutive N bytes from an M-byte register by connecting its input/output using a multiplexer (MUX) network. It is used in packet analysis and/or stream data processing for video/audio data. In this letter, we propose an efficient MUX network for an (M,N)-field-data extractor. By bi-partitioning a simple MUX network into an upper one and a lower one, we can theoretically reduce the number of required MUXs without increasing the MUX network depth. Experimental results show that we can reduce the gate count by up to 92% compared to a naive approach.
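As a rough illustration of what such an extractor computes, here is a minimal functional sketch. The MUX count is for one assumed naive network (one independent selector per output byte, built as a tree of byte-wide 2-to-1 MUXs); the letter's actual baseline and its bi-partitioned network are not specified in the abstract, and both function names are hypothetical.

```python
def extract_field(register: bytes, offset: int, n: int) -> bytes:
    """Functional model: return n consecutive bytes starting at offset."""
    if not 0 <= offset <= len(register) - n:
        raise ValueError("offset out of range")
    return register[offset:offset + n]


def naive_mux2_count(m: int, n: int) -> int:
    """Byte-wide 2-to-1 MUX count for the assumed naive network.

    Each of the n output bytes selects among the m - n + 1 register bytes it
    can ever see, and a k-to-1 selector needs k - 1 two-input MUXs.
    """
    candidates = m - n + 1
    return n * (candidates - 1)


reg = bytes(range(8))            # M = 8
field = extract_field(reg, 2, 4)  # N = 4, starting at byte 2
cost = naive_mux2_count(8, 4)
```

The count grows with the product of N and M - N, which is the kind of cost that sharing MUX stages across output bytes aims to cut.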
• Jen-Chieh LIU, Pei-Ying LEE
Type: LETTER
2016 Volume E99.A Issue 7 Pages 1415-1416
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
A pulse generator (PG) with 62ps timing resolution is presented. The PG adopts a multi-phase ring oscillator and a pulse combiner circuit (PCC) to achieve a low timing error. The PCC can define an arbitrary waveform via 16 phase outputs. The PCC adopts a coarse-tuning stage (CTS) and a fine-tuning stage (FTS) to define the operational frequency range and the timing resolution, respectively. Hence, the PCC uses an edge combiner (EC) to combine the period window of the CTS. The latency of the PG is only 3 cycle times. The operational frequency range of the PG is from 15MHz to 245MHz. The timing resolution and average accuracy of the PG are 62.5ps and ±0.5 LSB, respectively. The RMS jitter and peak-to-peak jitter of the PG are 6.55ps and 66.67ps, respectively, at 245MHz.
• Yining XU, Yang LIU, Junya KAIDA, Ittetsu TANIGUCHI, Hiroyuki TOMIYAMA
Type: LETTER
2016 Volume E99.A Issue 7 Pages 1417-1419
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
This paper proposes a static application mapping technique, based on integer linear programming, for non-hierarchical manycore embedded systems. Unlike previous work which was designed for hierarchical manycore SoCs, this work allows more flexible application mapping to achieve higher performance. The experimental results show the effectiveness of this work.
Regular Section
• Teerapong ORACHON, Taichi YOSHIDA, Somchart CHOKCHAITAM, Masahiro IWAH ...
Type: PAPER
Subject area: Digital Signal Processing
2016 Volume E99.A Issue 7 Pages 1420-1429
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
The lifting wavelet transform (WT) has been widely applied to image coding. Recently, the total number of lifting steps has been minimized by introducing a non-separable 2D structure so that the delay from input to output can be reduced in parallel processing. However, the minimum lifting WT has the problem that the upper bound of its rate-distortion curve is lower than that of the standard lifting WT. This is due to the rounding noise generated inside the transform in its integer implementation. This paper reduces the rounding noise by introducing channel scaling. The channel scaling is designed so that the dynamic range of signal values is fully utilized at each channel inside the transform. As a result, the signal-to-noise ratio is increased and therefore the upper bound of the minimum lifting WT in lossy coding is improved.
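To show where rounding enters an integer lifting implementation, here is a minimal sketch of one level of the standard reversible 1D 5/3 lifting transform (the one used for lossless coding in JPEG 2000). It is not the non-separable 2D minimum lifting structure or the channel scaling discussed in the abstract; the function names and the even-length restriction are simplifying assumptions.

```python
def forward_53(x):
    """One level of integer 5/3 lifting on an even-length list of ints."""
    n = len(x)
    assert n >= 2 and n % 2 == 0
    even, odd = x[0::2], x[1::2]
    half = n // 2
    # Predict step: the right shift is a floor division, i.e. a rounding.
    # The right edge is mirrored (the sample past the end repeats the last even one).
    d = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1)
         for i in range(half)]
    # Update step: a second rounding. The left edge is mirrored (d[-1] = d[0]).
    s = [even[i] + ((d[max(i - 1, 0)] + d[i] + 2) >> 2)
         for i in range(half)]
    return s, d


def inverse_53(s, d):
    """Undo forward_53 exactly by repeating the same rounded terms in reverse."""
    half = len(s)
    even = [s[i] - ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(half)]
    odd = [d[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1)
           for i in range(half)]
    x = [0] * (2 * half)
    x[0::2], x[1::2] = even, odd
    return x


signal = [12, 15, 9, 7, 30, 31, 2, 0]
low, high = forward_53(signal)
restored = inverse_53(low, high)
```

Each lifting step rounds once. The transform stays exactly invertible, but the rounded values differ from the real-valued transform, and that difference acts as noise once the coefficients are quantized for lossy coding.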
• Jing WANG, Qiang LI, Li DING, Hirofumi SHINOHARA, Yasuaki INOUE
Type: PAPER
Subject area: VLSI Design Technology and CAD
2016 Volume E99.A Issue 7 Pages 1430-1437
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
A CMOS bandgap reference circuit without resistors, which can successfully operate under a 1V supply voltage, is proposed. The improvement is realized by a voltage divider technique and a new current source. The most attractive merit is that the proposed circuit breaks the bottleneck of low-supply-voltage design caused by the constant bandgap voltage value (1.25V). Moreover, the temperature coefficient of the reference voltage Vref is improved by compensating for the temperature dependence caused by the current source. Simulation results using a standard 0.18 µm CMOS process show that Vref is around 0.5 V with a minimum supply voltage of 0.85 V. Meanwhile, the temperature coefficient of the output voltage is only 3.5ppm/°C from 0 °C to 70 °C.
• Yuan CAO, Yonglin CAO, Jian GAO
Type: PAPER
Subject area: Cryptography and Information Security
2016 Volume E99.A Issue 7 Pages 1438-1445
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
Let $\mathbb{F}_{q}$ be a finite field of cardinality $q$, $R=\mathbb{F}_{q}[u]/\langle u^{4}\rangle=\mathbb{F}_{q}+u\mathbb{F}_{q}+u^{2}\mathbb{F}_{q}+u^{3}\mathbb{F}_{q}$ ($u^{4}=0$), which is a finite chain ring, and $n$ be a positive integer satisfying $\gcd(q,n)=1$. For any $\delta,\alpha\in \mathbb{F}_{q}^{\times}$, an explicit representation for all distinct $(\delta+\alpha u^{2})$-constacyclic codes over $R$ of length $n$ is given, and the dual code for each of these codes is determined. For the case of $q=2^{m}$ and $\delta=1$, all self-dual $(1+\alpha u^{2})$-constacyclic codes over $R$ of odd length $n$ are provided.
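For readers unfamiliar with the term, the standard definition of a constacyclic code can be stated as follows. This is general background, not the paper's result; the symbol $\lambda$ is introduced here for illustration and stands for the unit $\delta+\alpha u^{2}$ of $R$ (a unit because $\delta\neq 0$).

```latex
% A linear code C \subseteq R^n is \lambda-constacyclic when it is closed
% under the \lambda-twisted cyclic shift:
\[
  (c_0, c_1, \dots, c_{n-1}) \in C
  \;\Longrightarrow\;
  (\lambda c_{n-1}, c_0, \dots, c_{n-2}) \in C .
\]
% Identifying a codeword with the polynomial c_0 + c_1 x + \dots + c_{n-1} x^{n-1},
% the \lambda-constacyclic codes of length n over R are exactly the ideals of
\[
  R[x] / \langle x^{n} - \lambda \rangle ,
  \qquad \lambda = \delta + \alpha u^{2} .
\]
```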
• Lin QI, Masaaki KATAYAMA
Type: PAPER
Subject area: Communication Theory and Signals
2016 Volume E99.A Issue 7 Pages 1446-1454
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
This paper presents a performance evaluation of an improved multiband impulse radio ultra-wideband (MIR UWB) system based on sub-band selection. In the improved scheme, a data mapping algorithm is introduced to a conventional MIR UWB system, and only a subset of the sub-bands is selected to transmit information data, which can improve the flexibility of sub-band/spectrum allocation, avoid interference and provide a variety of data rates. Given diagrams of a transmitter and receiver, the exact bit error rate (BER) of the improved system is derived. A comparison of system performance between the improved MIR UWB system and the conventional MIR UWB system is presented for different channels. Simulation results show that the improved system can achieve the same data rate as, and better BER performance than, the conventional MIR UWB system under additive white Gaussian noise (AWGN), multipath fading and interference coexistence channels. In addition, different data transmission rates and BER performances can be easily achieved by an appropriate choice of system parameters.
• Thanh V. PHAM, Anh T. PHAM
Type: PAPER
Subject area: Communication Theory and Signals
2016 Volume E99.A Issue 7 Pages 1455-1464
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
This paper proposes and theoretically analyzes the performance of amplify-and-forward (AF) relaying free-space optical (FSO) systems using avalanche photodiode (APD) over atmospheric turbulence channels. APD is used at each relay node and at the destination for optical signal conversion and amplification. Both serial and parallel relaying configurations are considered and the subcarrier binary phase-shift keying (SC-BPSK) signaling is employed. Closed-form expressions for the outage probability and the bit-error rate (BER) of the proposed system are analytically derived, taking into account the accumulating amplification noise as well as the receiver noise at the relay nodes and at the destination. Monte-Carlo simulations are used to validate the theoretical analysis, and an excellent agreement between the analytical and simulation results is confirmed.
• Yuji KAMIYA, Toru NAGURA, Shigeki KAWAI, Tsuneo NAKATA
Type: PAPER
Subject area: Intelligent Transport System
2016 Volume E99.A Issue 7 Pages 1465-1472
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
In this paper, we propose an infrastructure-free precise positioning system that utilizes the variation of received radio broadcast signal strength with vehicle travel as fingerprints of road segments. The use of broadcast waves is considered advantageous in deployment cost and in sample density, which affects measurement reliability, compared to communication media such as the 802.11p-based V2X radio used in our previous paper. We also present preliminary experimental results that indicate the potential for positioning at 20cm accuracy by using reception information of two FM radio channels broadcast from a station about 20km away from the test track.
• Xiao Lei YUAN, Lu GAN, Hong Shu LIAO
Type: LETTER
Subject area: Digital Signal Processing
2016 Volume E99.A Issue 7 Pages 1473-1477
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
In this letter, a novel robust adaptive beamforming algorithm is proposed to improve the robustness against steering vector random errors (SVREs). It eliminates the signal of interest (SOI) component from the sample covariance matrix (SCM), based on interference-plus-noise covariance matrix (IPNCM) reconstruction over annulus uncertainty sets. Firstly, several annulus uncertainty sets are used to constrain the steering vectors (SVs) of both the interferences and the SOI. Additionally, the IPNCM is reconstructed according to its definition by estimating each interference SV over its own annulus uncertainty set via the subspace projection algorithm. Meanwhile, the SOI SV is estimated as the prime eigenvector of the SOI covariance matrix term calculated over its own annulus uncertainty set. Finally, a novel robust beamformer is formulated based on the new IPNCM and the SOI SV. Simulation results show that it outperforms other existing reconstruction-based beamformers when SVREs exist, especially in low input signal-to-noise ratio (SNR) cases.
• Seokhyun SON, Myoungjin KIM, Hyoseop SHIN
Type: LETTER
Subject area: Systems and Control
2016 Volume E99.A Issue 7 Pages 1478-1480
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
In this letter, an underground facility management system for effective underground facility management is proposed. The system uses a wired and wireless duplex communication method to enable seamless communication and rapid responses to any failures encountered. The architecture and components of the underground facility management system supporting heterogeneous duplex communication are proposed, and the relevant work flow is presented.
• Xiaojing DU, Shuguo LI
Type: LETTER
Subject area: Cryptography and Information Security
2016 Volume E99.A Issue 7 Pages 1481-1487
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
SM3 is a hash function standard defined by China. Unlike SHA-1 and SHA-2, it is hard to increase the throughput of SM3 because it has a more complicated compression function than other hash algorithms. In this paper, we propose a 4-round-in-1 structure to reduce the number of rounds, and a logic simplification that moves 3 adders and 3 XOR gates from the critical path to the non-critical path. Based on SMIC 65nm CMOS technology, the throughput of SM3 reaches 6.54Gbps, which is higher than that of the reported designs.
• Le DONG, Tianli WANG, Jiao DU, Shanqi PANG
Type: LETTER
Subject area: Cryptography and Information Security
2016 Volume E99.A Issue 7 Pages 1488-1493
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
We present a rebound attack on the 4-branch type-2 generalized Feistel structure with an SPS round function, which is called the type-2 GFN-SPS in this paper. Applying a non-full-active-match technique, we construct a 6-round known-key truncated differential distinguisher, from which a near-collision attack on compression functions that embed this structure in the MMO or MP modes can be derived. Extending the 6-round attack, we build a 7-round truncated differential path to get a known-key differential distinguisher with seven rounds. The results give some evidence that this structure is not stronger than the type-2 GFN with an SP round function and not weaker than that with an SPSP round function against the rebound attack.
• Younghwan JUNG, Daehee KIM, Sunshin AN
Type: LETTER
Subject area: Communication Theory and Signals
2016 Volume E99.A Issue 7 Pages 1494-1498
Published: July 01, 2016
Released: July 01, 2016
JOURNALS RESTRICTED ACCESS
In this paper, we analyze two representative tree-based RFID anti-collision algorithms: the Query Tree protocol and the Binary Search algorithm. Based on the advantages and disadvantages of the two algorithms, we propose and evaluate two optimized anti-collision algorithms: the Optimized Binary Search, which performs better than the Query Tree Protocol with the same tag-side overhead, and the Optimized Binary Search with Multiple Collision Bits Resolution algorithm, which performs the best with an acceptable increase in tag-side processing overhead.
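For context, the baseline Query Tree protocol can be sketched as a short simulation. This is a minimal model assuming an ideal channel and unique, equal-length tag IDs given as bit strings; it does not cover the optimized algorithms proposed in the letter, and the function name is hypothetical.

```python
from collections import deque


def query_tree(tags):
    """Simulate the Query Tree protocol.

    tags: unique, equal-length bit strings.
    Returns (IDs in the order they are identified, number of reader queries).
    """
    pending = deque([""])        # prefixes the reader still has to query
    identified = []
    queries = 0
    while pending:
        prefix = pending.popleft()
        queries += 1
        # Every tag whose ID starts with the queried prefix responds.
        responders = [t for t in tags if t.startswith(prefix)]
        if len(responders) == 1:
            identified.append(responders[0])   # readable response
        elif len(responders) > 1:
            # Collision: split the prefix by one more bit.
            pending.append(prefix + "0")
            pending.append(prefix + "1")
        # No responder: idle slot, nothing to do.
    return identified, queries


ids, num_queries = query_tree(["000", "011", "101", "110"])
```

Tags only compare the queried prefix with their own ID and keep no state between queries, which is why the tag-side overhead of this protocol is low. The cost is the extra queries spent on collisions and idle slots.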