IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Online ISSN : 1745-1337
Print ISSN : 0916-8508
Volume E91.A , Issue 12
Special Section on VLSI Design and CAD Algorithms
• Nagisa ISHIURA
2008 Volume E91.A Issue 12 Pages 3413-3414
Published: December 01, 2008
Released: December 25, 2008
JOURNALS RESTRICTED ACCESS
• Insoo KIM, Jincheol YOO, JongSoo KIM, Kyusun CHOI
Type: PAPER
Subject area: Physical Level Design
2008 Volume E91.A Issue 12 Pages 3415-3422
Published: December 01, 2008
Released: December 25, 2008
The Threshold Inverter Quantization (TIQ) technique has been gaining importance in high-speed flash A/D converters because of its fast data conversion. It eliminates the resistor ladders used for reference voltage generation, which consume substantial power. The key to TIQ comparator design is to generate 2^n - 1 differently sized TIQ comparators for an n-bit A/D converter. This paper presents a highly efficient TIQ comparator design methodology based on an analytical model as well as an experimental model based on SPICE simulation. Any set of TIQ comparators can be found efficiently using the proposed method. A 6-bit TIQ A/D converter has been designed in a 0.18µm standard CMOS technology using the proposed method and compared with previously measured results in order to verify the methodology.
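As a rough illustration of the comparator count mentioned in the abstract, the sketch below models a generic flash converter: each of the 2^n - 1 comparators fires when the input exceeds its threshold, and counting the ones in the resulting thermometer code gives the output code. The evenly spaced thresholds, the 0 to 1 V range, and the names `comparator_count` and `quantize` are assumptions made for illustration; they are not taken from the paper and say nothing about how TIQ comparators are actually sized.

```python
from typing import Sequence


def comparator_count(n_bits: int) -> int:
    """Number of comparators in an n-bit flash A/D converter: 2^n - 1."""
    return (1 << n_bits) - 1


def quantize(v_in: float, thresholds: Sequence[float]) -> int:
    """Return the output code: how many comparator thresholds v_in exceeds.

    Each comparator outputs 1 when v_in is above its threshold, which
    gives a thermometer code; counting the ones converts it to binary.
    """
    thermometer = [1 if v_in > t else 0 for t in thresholds]
    return sum(thermometer)


# Hypothetical, evenly spaced thresholds for a 3-bit converter on a 0..1 V range.
N_BITS = 3
THRESHOLDS = [(k + 1) / (1 << N_BITS) for k in range(comparator_count(N_BITS))]
```

For the 6-bit converter in the abstract, `comparator_count(6)` gives 63 comparators.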
• Yoshiyuki KAWAKAMI, Makoto TERAO, Masahiro FUKUI, Shuji TSUKIYAMA
Type: PAPER
Subject area: Physical Level Design
2008 Volume E91.A Issue 12 Pages 3423-3430
Published: December 01, 2008
Released: December 25, 2008
With the advent of the deep submicron age, circuit performance is strongly affected by process variations, and the influence of the power-supply voltage on circuit delay keeps increasing as CMOS feature sizes shrink. Power grid optimization that considers the timing error risk caused by these variations and by IR drop has become very important for stable, high-speed operation of systems-on-chip. Many power grid optimization algorithms have been proposed, and most of them use IR drop as their objective function. However, IR drop is an indirect metric, and we suspect that it is a vague metric for the real goal of LSI design. In this paper, first, we propose an approach which uses the “timing error risk caused by IR drop” as a direct objective function. Second, the critical path map is introduced to express the existence of critical paths distributed over the entire chip. The timing error risk is decreased by using the critical path map and the new objective function. Experimental results show the effectiveness of the approach.
• Yun YANG, Shinji KIMURA
Type: PAPER
Subject area: Physical Level Design
2008 Volume E91.A Issue 12 Pages 3431-3442
Published: December 01, 2008
Released: December 25, 2008
This paper proposes an efficient design algorithm for power/ground (P/G) network synthesis that takes into account dynamic signals, which are mainly caused by L di/dt noise and C dv/dt decoupling capacitance (DECAP) current in the distribution network. To deal directly with the nonlinear global optimization under synthesis constraints, a genetic algorithm (GA) is introduced. The proposed GA-based synthesis method avoids the linear transformation loss and the constraint complexity of current SLP, SQP, ICG, and random-walk methods. In the proposed Hybrid Grid Synthesis algorithm, the dynamic signal is simulated in the gene disturbance process, and the Trapezoidal Modified Euler (TME) method is introduced to realize a precise dynamic time-step process. We also use a hybrid-SLP method to reduce the GA execution time and increase the network synthesis efficiency. Experimental results on a given power distribution network show reductions in layout area and execution time compared with current P/G network synthesis methods.
• Xiaoyi WANG, Jin SHI, Yici CAI, Xianlong HONG
Type: PAPER
Subject area: Physical Level Design
2008 Volume E91.A Issue 12 Pages 3443-3450
Published: December 01, 2008
Released: December 25, 2008
Considering power supply integrity at an early design stage is a growing trend for improving design quality. Specifically, the floorplanning process is modified to improve the power supply as well. In the modified floorplanning process, both the floorplan and the power/ground (P/G) network are adjusted to search for an optimal floorplan as well as the most robust power supply. In this paper, we propose a novel algorithm to carry out this modified floorplanning. A new analytical method is proposed to estimate the voltage drop while the floorplan is constantly varying. This fast analytical voltage drop estimation method is plugged into the modified floorplanner to speed up the whole floorplanning process. Compared with previous methods, our algorithm searches for the optimal floorplan with consideration of power supply integrity more efficiently and therefore leads to better results. Furthermore, this paper also proposes a novel heuristic method to optimize the topology of the P/G network, which can construct a more robust power supply system. Experimental results show that the method can speed up the IR-drop-aware floorplanning process by about 10 times and reduce the routing area of the P/G network while maintaining floorplan quality and power supply integrity.
• Makoto SUGIHARA, Yusuke MATSUNAGA, Kazuaki MURAKAMI
Type: PAPER
Subject area: Physical Level Design
2008 Volume E91.A Issue 12 Pages 3451-3460
Published: December 01, 2008
Released: December 25, 2008
Character projection (CP) lithography is utilized for maskless lithography and is a promising candidate for future photomask manufacturing because it can project ICs much faster than point beam projection or variable-shaped beam (VSB) projection. In this paper, we first present a projection mask set development methodology for multi-column-cell (MCC) systems, in which column-cells can project patterns in parallel with the CP and VSB lithographies. Next, we present an INLP (integer nonlinear programming) model as well as an ILP (integer linear programming) model for optimizing a CP mask set of an MCC projection system so that projection time is reduced. The experimental results show that our optimization achieves 33.4% less projection time in the best case than a naive CP mask development approach. The results also indicate that our CP mask set optimization method virtually increases the cell pattern objects on CP masks and decreases VSB projection, so that it achieves higher projection throughput than simply parallelizing two column-cells with conventional CP masks.
• Toshiki KANAMOTO, Yasuhiro OGASAHARA, Keiko NATSUME, Kenji YAMAGUCHI, ...
Type: LETTER
Subject area: Device and Circuit Modeling and Analysis
2008 Volume E91.A Issue 12 Pages 3461-3464
Published: December 01, 2008
Released: December 25, 2008
This paper studies the impact of the well edge proximity effect (WPE) on circuit delay, based on model parameters extracted from test structures in an industrial 65nm wafer process. Experimental results show that the well edge proximity effect increases delay by up to 10% in the 65nm technology, and that the increase depends on interconnect length. Furthermore, due to the asymmetric increase in pMOS and nMOS threshold voltages, delay may decrease in spite of the threshold voltage increase. From these results, we conclude that considering WPE is indispensable to cell characterization in the 65nm technology.
• Yi WANG, Xuan ZENG, Jun TAO, Hengliang ZHU, Wei CAI
Type: PAPER
Subject area: Device and Circuit Modeling and Analysis
2008 Volume E91.A Issue 12 Pages 3465-3473
Published: December 01, 2008
Released: December 25, 2008
In this paper, we propose an Adaptive Stochastic Collocation Method for block-based Statistical Static Timing Analysis (SSTA). A novel adaptive method is proposed to perform SSTA with delays of gates and interconnects modeled by quadratic polynomials based on Homogeneous Chaos expansion. In order to approximate the key atomic operator MAX in the full random space during timing analysis, the proposed method adaptively chooses the optimal algorithm from a set of stochastic collocation methods by considering different input conditions. Compared with the existing stochastic collocation methods, including the one using the dimension reduction technique and the one using the Sparse Grid technique, the proposed method improves accuracy by 10x while using the same order of computation time. The proposed algorithm also shows great improvement in accuracy compared with a moment matching method. Compared with 10,000 Monte Carlo simulations on the ISCAS85 benchmark circuits, the proposed method shows less than 1% error in the mean and variance and a speedup of nearly 100x.
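To make the Monte Carlo reference point concrete, the following sketch estimates the mean and variance of the MAX of two arrival times by plain sampling. It is not the collocation method of the paper: the two independent Gaussian arrival times, the function name `monte_carlo_max`, and the fixed seed are illustrative assumptions, whereas the abstract describes delays modeled by quadratic polynomials.

```python
import math
import random


def monte_carlo_max(mu_a, sigma_a, mu_b, sigma_b, samples=10_000, seed=0):
    """Estimate the mean and variance of max(A, B) by sampling.

    A and B are assumed to be independent Gaussian arrival times.
    """
    rng = random.Random(seed)
    draws = [
        max(rng.gauss(mu_a, sigma_a), rng.gauss(mu_b, sigma_b))
        for _ in range(samples)
    ]
    mean = sum(draws) / samples
    variance = sum((d - mean) ** 2 for d in draws) / (samples - 1)
    return mean, variance


# Two standard normal arrival times: the exact mean of the maximum is
# 1/sqrt(pi) and the exact variance is 1 - 1/pi.
mean, variance = monte_carlo_max(0.0, 1.0, 0.0, 1.0)
```

Sampling of this kind is simple but needs many evaluations per MAX operation, which is the cost that analytical or collocation-based approximations aim to avoid.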
• Masanori HASHIMOTO, Jangsombatsiri SIRIPORN, Akira TSUCHIYA, Haikun ZH ...
Type: PAPER
Subject area: Device and Circuit Modeling and Analysis
2008 Volume E91.A Issue 12 Pages 3474-3480
Published: December 01, 2008
Released: December 25, 2008
This paper proposes a closed-form eye-diagram model for on-chip distortionless transmission lines with intentionally inserted shunt conductance. We derive expressions for the eye opening in both voltage and time by assuming a piecewise linear waveform model. The model is experimentally verified with various lengths, shunt conductances and resistive terminations. We also apply the proposed model to design space exploration, and demonstrate that it helps estimate the optimal shunt conductance and resistive termination for the required signaling length and throughput.
• Shinya ABE, Masanori HASHIMOTO, Takao ONOYE
Type: PAPER
Subject area: Device and Circuit Modeling and Analysis
2008 Volume E91.A Issue 12 Pages 3481-3487
Published: December 01, 2008
Released: December 25, 2008
Influence of manufacturing variability on circuit performance has been increasing because of finer manufacturing process and lowered supply voltage. In this paper, we focus on mesh-style clock distribution which is believed to be effective for reducing clock skew, and we evaluate clock skew considering manufacturing and design variabilities. Considering MOS transistor variation — random and spatially-correlated variation — and non-uniform flip-flop (FF) placement, we demonstrate that spatially-correlated variation and severe non-uniform FF distribution can be major sources of clock skew. We also examine the dependency of clock skew on design parameters, and reveal that finer clock mesh does not necessarily reduce clock skew.
• Tae Il BAE, Jin Wook KIM, Young Hwan KIM
Type: PAPER
Subject area: Device and Circuit Modeling and Analysis
2008 Volume E91.A Issue 12 Pages 3488-3496
Published: December 01, 2008
Released: December 25, 2008
As the semiconductor feature size decreases, the crosstalk due to the capacitive coupling of interconnects influences signal propagation delay more seriously. Moreover, the increase of the operating frequency further emphasizes the necessity of more accurate timing analysis. In this paper, we propose new gate models to calculate gate output waveforms under crosstalk effects, which can be used for gate-level delay estimation. We classify the operation modes of metal-oxide-semiconductor (MOS) devices of a gate into 3 regions, and then develop simple linear models for each region. In addition, we present a non-iterative gate modeling method that is more efficient than previous iterative methods. In the experiments, the proposed method exhibits a maximum error of 10.70% and an average error of 2.63% when it computes the 50% delays of two or three complementary MOS (CMOS) inverters driving parallel wires. In comparison, the existing method has a maximum error of 25.94% and an average error of 3.62% under these conditions.
• Hyoun Soo PARK, Wook KIM, Dai Joon HYUN, Young Hwan KIM
Type: PAPER
Subject area: Device and Circuit Modeling and Analysis
2008 Volume E91.A Issue 12 Pages 3497-3505
Published: December 01, 2008
Released: December 25, 2008
Block-based SSTA effectively analyzes the timing variation of a chip caused by process variations. However, block-based SSTA cannot identify critical nodes, that is, nodes that highly influence the timing yield of a chip and serve as effective guidance for timing yield optimization. In this paper, we propose a new timing criticality to identify those nodes, referred to as the timing yield criticality (TYC). The proposed TYC is defined as the change in the timing yield induced by a change in the mean arrival time at a node. For efficiency, we estimate the TYC through linear approximation instead of propagating the changed arrival time at a node to its fanouts. In experiments using the ISCAS 85 benchmark circuits, the proposed method estimated TYCs at 9.8% of the runtime of the exact computation. The proposed method identified the node that has the greatest effect on the timing yield in all benchmark circuits except C6288, while existing methods did not identify it for any circuit. In addition, the proposed method identified 98.4% of the critical nodes in the top 1% in terms of effect on the timing yield, while existing methods identified only about 10%.
• Yoshinobu HIGAMI, Kewal K. SALUJA, Hiroshi TAKAHASHI, Shin-ya KOBAYASH ...
Type: PAPER
Subject area: Logic Synthesis, Test and Verification
2008 Volume E91.A Issue 12 Pages 3506-3513
Published: December 01, 2008
Released: December 25, 2008
Physical defects that are not covered by the stuck-at fault or bridging fault models are increasing in LSI circuits designed and manufactured in modern Deep Sub-Micron (DSM) technologies. Therefore, it is necessary to target non-stuck-at and non-bridging faults. The stuck-open fault is one such fault model that captures transistor-level defects. This paper presents two methods for maximizing stuck-open fault coverage using stuck-at test vectors. We assume that a test set to detect stuck-at faults is given, and we consider two formulations for maximizing stuck-open coverage using the given test set. The first problem is to form a test sequence by using each test vector multiple times, if needed, as long as the stuck-open coverage is increased. In this case the target is to make the resultant test sequence as short as possible under the constraint that the maximum stuck-open coverage is achieved using the given test set. The second problem is to form a test sequence by using each test vector exactly once. Thus in this case the length of the test sequence stays equal to the number of given test vectors. In both formulations the stuck-at fault coverage does not change. The effectiveness of the proposed methods is established by experimental results for benchmark circuits.
• Youhua SHI, Nozomu TOGAWA, Masao YANAGISAWA, Tatsuo OHTSUKI
Type: PAPER
Subject area: Logic Synthesis, Test and Verification
2008 Volume E91.A Issue 12 Pages 3514-3523
Published: December 01, 2008
Released: December 25, 2008
This paper presents a unified test compression technique for scan stimulus and unknown masking data, with seamless integration of test generation, test compression and masking of all unknown responses, for high-quality manufacturing test cost reduction. Unlike prior test compression methods, the proposed approach considers the unknown responses during the test pattern generation procedure, and then selectively encodes the less specified bits (either 1s or 0s) in each scan slice for compression while at the same time masking the unknown responses before sending them to the response compactor. The proposed test scheme can dramatically reduce test data volume as well as the number of required test channels by using only c tester channels to drive N internal scan chains, where c = ⌈log2 N⌉ + 2. In addition, because all the unknown responses can be exactly masked before entering the response compactor, test loss due to unknown responses is eliminated. Experimental results on both benchmark circuits and larger designs indicate the effectiveness of the proposed technique.
• Keiichi SUEMITSU, Toshiaki ITO, Toshiki KANAMOTO, Masayuki TERAI, Sato ...
Type: PAPER
Subject area: Logic Synthesis, Test and Verification
2008 Volume E91.A Issue 12 Pages 3524-3530
Published: December 01, 2008
Released: December 25, 2008
This paper proposes a new parallel method of producing the adjacent net pair list from LSI layouts, which runs on workstations connected by a network. The pair list contains pairs of adjacent nets and the probability of a bridging fault between them, and is used in fault diagnosis of LSIs. The proposed method partitions each mask layer of the LSI layout into regions, produces a pair list for each region in parallel, and merges them into the entire pair list. It yields accurate results because it considers faults between two wires that lie in different adjacent regions. The experimental results show that the proposed method greatly reduces the processing time, from more than 60 hours to 3 hours for 42M-gate LSIs.
• Lei CHEN, Takashi HORIYAMA, Yuichi NAKAMURA, Shinji KIMURA
Type: PAPER
Subject area: Logic Synthesis, Test and Verification
2008 Volume E91.A Issue 12 Pages 3531-3538
Published: December 01, 2008
Released: December 25, 2008
Leakage power consumption of logic elements has become a serious problem, especially in sub-100-nanometer processes. In this paper, a novel power gating approach using the controlling value of logic elements is proposed. In the proposed method, sleep signals of the power-gated blocks are extracted completely from the original circuits without any extra logic element. A basic algorithm and a probability-based heuristic algorithm have been developed to implement the basic idea. The steady maximum delay constraint has also been introduced to handle the delay issues. Experiments on the ISCAS'85 benchmarks show that on average 15-36% of logic elements can be power gated at a time for random input patterns, and 3-31% of elements can be stopped under the steady maximum delay constraints. We also show a power optimization method for AND/OR tree circuits, in which more than 80% of gates can be power-gated.
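The notion of a controlling value that the abstract relies on can be illustrated with a small sketch: when one input of a gate carries its controlling value, the output is fixed no matter what the other inputs are, so the logic feeding those other inputs does not affect the result at that moment. The table and helper names below are illustrative assumptions and do not reproduce the paper's algorithms for extracting sleep signals.

```python
# Controlling input value for each basic gate type.
CONTROLLING_VALUE = {"AND": 0, "NAND": 0, "OR": 1, "NOR": 1}
INVERTING = {"NAND", "NOR"}


def is_output_forced(gate_type, known_inputs):
    """True if some known input carries the gate's controlling value."""
    cv = CONTROLLING_VALUE[gate_type]
    return any(value == cv for value in known_inputs)


def forced_output(gate_type):
    """Output value of the gate when a controlling value is present."""
    cv = CONTROLLING_VALUE[gate_type]
    return 1 - cv if gate_type in INVERTING else cv


# An AND gate with one input at 0 outputs 0 regardless of its other inputs.
and_forced = is_output_forced("AND", [0])
and_value = forced_output("AND")
```

In this toy view, the signal that carries the controlling value plays the role of a sleep condition for the logic driving the gate's remaining inputs.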
• Masato INAGI, Yasuhiro TAKASHIMA, Yuichi NAKAMURA, Atsushi TAKAHASHI
Type: PAPER
Subject area: Logic Synthesis, Test and Verification
2008 Volume E91.A Issue 12 Pages 3539-3547
Published: December 01, 2008
Released: December 25, 2008
In multi-FPGA prototyping systems for circuit verification, a serialized time-multiplexed I/O technique is used because of the limited number of I/O pins of an FPGA. The verification time depends on the selection of inter-FPGA signals to be time-multiplexed. In this paper, we propose a method that minimizes the verification time of multi-FPGA systems by finding an optimal selection of inter-FPGA signals to be time-multiplexed. Experiments show that the estimated verification time is improved by 38.2% on average compared with conventional methods.
• Alexander JESSER, Stefan LAEMMERMANN, Alexander PACHOLIK, Roland WEISS ...
Type: PAPER
Subject area: Logic Synthesis, Test and Verification
2008 Volume E91.A Issue 12 Pages 3548-3555
Published: December 01, 2008
Released: December 25, 2008
Functional and formal verification are important methodologies for complex mixed-signal design validation. However, the industry still verifies such systems by pure simulation. This process lacks error localization and formal verification methods, which is the existing verification gap between the analog and digital blocks within a mixed-signal system. Our approach improves the verification process by creating temporal properties, named mixed-signal assertions, which are described by a combination of digital assertions and analog properties. The proposed method is a new assertion-based verification flow for designing mixed-signal circuits. The effectiveness of the approach is demonstrated on a Σ/Δ-converter.
• Masanari NISHIMURA, Nagisa ISHIURA, Yoshiyuki ISHIMORI, Hiroyuki KANBA ...
Type: LETTER
Subject area: High-Level Synthesis and System-Level Design
2008 Volume E91.A Issue 12 Pages 3556-3558
Published: December 01, 2008
Released: December 25, 2008
This letter presents a novel framework for high-level synthesis in which hardware modules synthesized from functions in a given ANSI-C program can call the other software functions in the program. This enables high-level synthesis from C programs that contain calls to hard-to-synthesize functions, such as dynamic memory management, I/O requests, or very large and complex functions. A single-thread implementation scheme is shown, whose correctness has been verified through register transfer level simulation.
• Hongwei ZHU, Ilie I. LUICAN, Florin BALASA, Dhiraj K. PRADHAN
Type: PAPER
Subject area: High-Level Synthesis and System-Level Design
2008 Volume E91.A Issue 12 Pages 3559-3567
Published: December 01, 2008
Released: December 25, 2008
In real-time data-dominated communication and multimedia processing applications, a multi-layer memory hierarchy is typically used to enhance the system performance and also to reduce the energy consumption. Savings of dynamic energy can be obtained by accessing frequently used data from smaller on-chip memories rather than from large background memories. This paper focuses on the reduction of the dynamic energy consumption in the memory subsystem of multidimensional signal processing systems, starting from the high-level algorithmic specification of the application. The paper presents a formal model which identifies those parts of arrays more intensely accessed, taking also into account the relative lifetimes of the signals. Tested on a two-layer memory hierarchy, this model led to savings of dynamic energy from 40% to over 70% relative to the energy used in the case of flat memory designs.
• Yuen-Hong Alvin HO, Chi-Un LEI, Hing-Kit KWAN, Ngai WONG
Type: PAPER
Subject area: High-Level Synthesis and System-Level Design
2008 Volume E91.A Issue 12 Pages 3568-3575
Published: December 01, 2008
Released: December 25, 2008
In the context of multiple constant multiplication (MCM) design, we propose a novel common sub-expression elimination (CSE) algorithm that models the optimal synthesis of coefficients as a 0-1 mixed-integer linear programming (MILP) problem with a user-defined generic logic depth constraint. We also propose an efficient solution space, which combines all minimal signed digit (MSD) representations and the shifted sum (difference) of coefficients. In the examples we demonstrate, the combination of the proposed algorithm and solution space gives a better solution than existing algorithms.
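For readers unfamiliar with signed digit representations, the sketch below computes the canonical signed digit (CSD) form of a constant, which is one of its minimal signed digit representations. It only illustrates why signed digits reduce the number of shift-and-add terms for a single constant; it is not the CSE or MILP formulation of the paper, and the function names are hypothetical.

```python
def csd_digits(value: int) -> list:
    """Canonical signed digit form of a non-negative integer.

    Returns digits in {-1, 0, 1}, least significant first, with no two
    adjacent nonzero digits.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    digits = []
    while value != 0:
        if value & 1:
            # +1 when value % 4 == 1, -1 when value % 4 == 3.
            digit = 2 - (value & 3)
            value -= digit
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits


def nonzero_terms(digits) -> int:
    """Number of shifted terms that must be added or subtracted."""
    return sum(1 for d in digits if d != 0)


# 7 = 8 - 1: two signed terms instead of the three ones in binary 111.
SEVEN = csd_digits(7)
```

Multiplying by 7 can therefore be done as `(x << 3) - x`, one subtraction, rather than two additions of shifted copies.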
• Maziar GOUDARZI, Tadayuki MATSUMURA, Tohru ISHIHARA
Type: PAPER
Subject area: High-Level Synthesis and System-Level Design
2008 Volume E91.A Issue 12 Pages 3576-3584
Published: December 01, 2008
Released: December 25, 2008
The share of leakage in cache power consumption increases with technology scaling. Choosing a higher threshold voltage (Vth) and/or gate-oxide thickness (Tox) for cache transistors reduces leakage, but impacts cell delay. We show that, due to uncorrelated random within-die delay variation, only some (not all) of the cells actually violate the cache delay after the above change. We propose to add a spare cache way to replace delay-violating cache lines separately in each cache set. By SPICE and gate-level simulations in a commercial 90nm process, we show that choosing higher Vth and Tox and adding one spare way to a 4-way 16KB cache reduces leakage power by 42%, which, depending on the share of leakage in total cache power, gives up to 22.59% and 41.37% reduction of total energy in the L1 instruction cache and the L2 unified cache, respectively, with a negligible delay penalty and without sacrificing cache capacity or timing yield.
• Takayuki OBATA, Mineo KANEKO
Type: PAPER
Subject area: High-Level Synthesis and System-Level Design
2008 Volume E91.A Issue 12 Pages 3585-3595
Published: December 01, 2008
Released: December 25, 2008
Just as the schedule affects system performance, the control skew, i.e., the arrival time difference of control signals between registers, can be utilized for improving system performance, enhancing robustness against delay variations, etc. The simultaneous optimization of the control step assignment and the control skew assignment is a more powerful technique for improving performance. In this paper, we first prove that, even if the execution sequence of operations assigned to the same resource is fixed, the simultaneous optimization problem under a fixed clock period is NP-hard. Second, we propose a heuristic algorithm for the simultaneous control step and skew optimization under a given clock period, and we show how much the simultaneous optimization improves system performance. This paper is the first to use intentional skew to shorten control steps under a specified clock period. The proposed algorithm has the potential to play a central role in various scenarios of skew-aware high-level synthesis.
• Hasitha Muthumala WAIDYASOORIYA, Masanori HARIYAMA, Michitaka KAMEYAMA
Type: PAPER
Subject area: High-Level Synthesis and System-Level Design
2008 Volume E91.A Issue 12 Pages 3596-3606
Published: December 01, 2008
Released: December 25, 2008
This paper presents a high-level synthesis approach to minimize the total power consumption in behavioral synthesis under time and area constraints. The proposed method has two stages, functional unit (FU) energy optimization and interconnect energy optimization. In the first stage, active and inactive energies of the FUs are optimized using a multiple supply and threshold voltage scheme. Genetic algorithm (GA) based simultaneous assignment of supply and threshold voltages and module selection is proposed. The proposed GA based searching method can be used in large size problems to find a near-optimal solution in a reasonable time. In the second stage, interconnects are simplified by increasing their sharing. This is done by exploiting similar data transfer patterns among FUs. The proposed method is evaluated for several benchmarks under 90nm CMOS technology. The experimental results show that more than 40% of energy savings can be achieved by our proposed method.
• Jeong Ki KIM, Hyunseuk YOO, Moon Ho LEE
Type: LETTER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2008 Volume E91.A Issue 12 Pages 3607-3611
Published: December 01, 2008
Released: December 25, 2008
A weakness of LDPC encoder implementations is that a conventional binary matrix-vector multiplier takes many clock cycles, which limits throughput. In this letter, we target IEEE 802.16e LDPC encoders in order to construct an efficient architecture. For the standard H matrices built from circulant permutation matrices, we propose a semi-parallel architecture that uses cyclic right shift registers and exclusive-OR in place of complex matrix-vector multipliers. The proposed encoder for IEEE 802.16e LDPC achieves compact size and high throughput.
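The reason a shift register can stand in for a matrix-vector multiplier is that multiplying a vector by a circulant permutation matrix over GF(2) only reorders its bits cyclically. The sketch below checks this on a small example. The indexing convention (entry (i, j) is 1 when i = (j + shift) mod size, which yields a right shift), the block size of 5, and all function names are assumptions for illustration; they are not taken from the letter or from the IEEE 802.16e specification.

```python
def circulant_permutation(size, shift):
    """Permutation matrix with entry (i, j) = 1 iff i == (j + shift) % size."""
    return [
        [1 if i == (j + shift) % size else 0 for j in range(size)]
        for i in range(size)
    ]


def gf2_matvec(matrix, vector):
    """Matrix-vector product over GF(2): AND for products, XOR for sums."""
    return [sum(m & v for m, v in zip(row, vector)) % 2 for row in matrix]


def cyclic_right_shift(vector, shift):
    """Rotate the vector right by `shift` positions."""
    size = len(vector)
    return [vector[(i - shift) % size] for i in range(size)]


# The full multiplication and the plain rotation give the same bits.
BITS = [1, 0, 1, 1, 0]
BY_MATRIX = gf2_matvec(circulant_permutation(len(BITS), 2), BITS)
BY_SHIFT = cyclic_right_shift(BITS, 2)
```

When a block row contains several such matrices, their shifted outputs are combined bitwise with exclusive-OR, which matches the two components named in the abstract.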
• Kentaro NAKAHARA, Shin'ichi KOUYAMA, Tomonori IZUMI, Hiroyuki OCHI, Yu ...
Type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2008 Volume E91.A Issue 12 Pages 3612-3621
Published: December 01, 2008
Released: December 25, 2008
Reconfigurable devices have recently come into wide use for small-volume and trial production. They are also expected to be utilized in mission-critical fields such as space development, because system updates and pseudo-repair can be achieved remotely by reconfiguration. However, in conventional reconfigurable devices, configuration memory upsets caused by radiation and alpha particles reconfigure the device unpredictably, resulting in fatal system failures. Therefore, a reconfigurable device with high fault tolerance against configuration upsets is required. In this paper, we propose an architecture for a fault-tolerant reconfigurable device that autonomously repairs configuration upsets without interrupting system operation. The device consists of a 2D array of “Autonomous-Repair Cells”, each of which repairs its upsets autonomously. The architecture is scalable in fault tolerance; a finer-grained Autonomous-Repair Cell provides higher fault tolerance. To determine the architecture, we experimentally analyze four autonomous repair techniques for the cell. Then, two autonomous repair techniques, simple multiplexing (S.M.) and memory multiplexing (M.M.), are applied; the former to programmable logic and the latter to cell-to-cell routing resources. Through evaluation, we show that the proposed device achieves an average lifetime of more than 10 years against configuration upsets even in a severe environment such as a satellite orbit.
• Wen JI, Yuta ABE, Takeshi IKENAGA, Satoshi GOTO
Type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2008 Volume E91.A Issue 12 Pages 3622-3629
Published: December 01, 2008
Released: December 25, 2008
In this paper, we propose a partially-parallel irregular LDPC decoder based on the IEEE 802.11n standard, targeting high-throughput and small-area applications. The design is based on a novel sum-delta message passing algorithm characterized as follows: (i) Decoding throughput is greatly improved by utilizing the difference between the updated and the original value to remove redundant computations. (ii) Registers and memory are optimized to store only the frequently used messages to decrease the hardware cost. (iii) Techniques such as binary sorting, parallel column operation, and high-performance pipelining are used to further speed up the message passing procedure. The synthesis result in TSMC 0.18µm CMOS technology demonstrates that for the (648, 324) irregular LDPC code, our decoder achieves a 7.5X improvement in throughput, reaching 402Mbps at a frequency of 200MHz, with 11% area reduction. The synthesis result also demonstrates that the decoder is competitive with fully-parallel regular LDPC decoders in terms of the tradeoff between throughput, area and power.
• Tianruo ZHANG, Guifen TIAN, Takeshi IKENAGA, Satoshi GOTO
Type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2008 Volume E91.A Issue 12 Pages 3630-3637
Published: December 01, 2008
Released: December 25, 2008
Intra coding in H.264/AVC has significantly enhanced video compression efficiency. However, the rate-distortion (RD) based mode decision increases computational complexity. This paper proposes a novel fast mode decision algorithm for H.264/AVC intra prediction and its VLSI architecture. A novel edge-detection pattern is proposed, and the edge-detection technique and a spatial mode prediction technique are combined to reduce the number of intra 4×4 candidate modes from 9 to an average of 2.50. The VLSI architecture of the intra mode decision module is designed in TSMC 0.18µm CMOS technology. A maximum frequency of 285MHz is achieved and 13.1k NAND gates are required. High frequency, efficient processing cycle reduction and small area make this design an excellent accelerator for an HDTV 1080p@30fps real-time encoder.
• Lan-Rong DUNG, Meng-Chun LIN
Type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2008 Volume E91.A Issue 12 Pages 3638-3650
Published: December 01, 2008
Released: December 25, 2008
JOURNALS RESTRICTED ACCESS
This paper presents a memory-efficient motion estimation (ME) technique for high-resolution video compression. The main objective is to reduce external memory access, especially when local memory resources are limited. The reduction of memory access in turn reduces the notoriously high power consumption. The key to reducing memory accesses is a center-biased algorithm, which performs the motion vector (MV) search with the minimum search data. To account for data reusability, the proposed dual-search-windowing (DSW) approaches use the secondary window as an option, only when the search requires it. Doing so lightens the loading of search windows and hence reduces the required external memory bandwidth. The proposed techniques can save up to 81% of external memory bandwidth and require only 135 MBytes/sec, while the quality degradation is less than 0.2dB for 720p HDTV clips coded at 8Mbits/sec.
• Yukio MITSUYAMA, Kazuma TAKAHASHI, Rintaro IMAI, Masanori HASHIMOTO, T ...
Type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2008 Volume E91.A Issue 12 Pages 3651-3662
Published: December 01, 2008
Released: December 25, 2008
JOURNALS RESTRICTED ACCESS
An area-efficient dynamically reconfigurable architecture dedicated to media processing is proposed. To implement a compact but high-performance device that can be used in consumer applications, the reconfigurable architecture distinctively performs the 8-bit operations required for media processing, whereas fine-grained operations are executed with the cooperation of a host processor. A heterogeneous reconfigurable array is composed of four types of cells, for which the configuration data size is reduced by focusing the application domain on media processing. Implementation results show that multi-standard video decoding can be achieved by the proposed reconfigurable architecture with 1.1×1.4mm² in a 90nm CMOS technology.
Special Section on Signal Design and its Application in Communications
• Pingzhi FAN, Naoki SUEHIRO
2008 Volume E91.A Issue 12 Pages 3663-3664
Published: December 01, 2008
Released: December 25, 2008
JOURNALS RESTRICTED ACCESS
• Claude CARLET
Type: INVITED PAPER
2008 Volume E91.A Issue 12 Pages 3665-3678
Published: December 01, 2008
Released: December 25, 2008
JOURNALS RESTRICTED ACCESS
A function $F:\mathbb{F}_2^n\rightarrow \mathbb{F}_2^n$ is almost perfect nonlinear (APN) if, for every $a \neq 0$, $b$ in $\mathbb{F}_2^n$, the equation $F(x)+F(x+a)=b$ has at most two solutions in $\mathbb{F}_2^n$. When used as an S-box in a block cipher, it contributes optimally to the resistance to differential cryptanalysis. The function $F$ is almost bent (AB) if the minimum Hamming distance between all its component functions $v\cdot F$, $v\in \mathbb{F}_2^n \setminus \{0\}$ (where “·” denotes any inner product in $\mathbb{F}_2^n$) and all affine Boolean functions on $\mathbb{F}_2^n$ takes the maximal value $2^{n-1}-2^{\frac{n-1}{2}}$. AB functions exist for $n$ odd only and contribute optimally to the resistance to linear cryptanalysis. Every AB function is APN, and in the $n$ odd case, any quadratic APN function is AB. The APN and AB properties are preserved by affine equivalence: $F\sim F'$ if $F'=A_1\circ F\circ A_2$, where $A_1$, $A_2$ are affine permutations. More generally, they are preserved by CCZ-equivalence, that is, affine equivalence of the graphs of $F$: $\{(x,F(x)) \ | \ x\in \mathbb{F}_{2^n}\}$ and of $F'$. Until recently, the only known constructions of APN and AB functions were CCZ-equivalent to power functions $F(x)=x^d$ over finite fields ($\mathbb{F}_{2^n}$ being identified with $\mathbb{F}_2^n$ and an inner product being $x\cdot y = tr(xy)$ where $tr$ is the trace function). Several recent infinite classes of APN functions have been proved CCZ-inequivalent to power functions. In this paper, we describe the state of the art in the domain and we also present original results. We indicate what are the most important open problems and make some new observations about them. Many results presented are from joint works with Lilya Budaghyan, Gregor Leander and Alexander Pott.
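The APN condition stated in this abstract can be checked by brute force for small $n$. The following is a minimal sketch, not taken from the paper: it assumes $n = 3$, the field polynomial $x^3 + x + 1$, and the power function $F(x) = x^3$ as the test case; the helper names `gf_mul`, `gf_pow` and `is_apn` are hypothetical.

```python
def gf_mul(a, b, n, poly):
    """Multiply a and b in GF(2^n); `poly` is the field polynomial as a bit mask."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> n:          # degree reached n: reduce modulo the field polynomial
            a ^= poly
    return result


def gf_pow(x, d, n, poly):
    """Compute x^d in GF(2^n) by repeated multiplication."""
    result = 1
    for _ in range(d):
        result = gf_mul(result, x, n, poly)
    return result


def is_apn(table):
    """True if F(x) + F(x + a) = b has at most two solutions for every a != 0 and b.

    `table[x]` holds F(x); field addition is bitwise XOR.
    """
    size = len(table)
    for a in range(1, size):
        counts = [0] * size
        for x in range(size):
            counts[table[x] ^ table[x ^ a]] += 1
        if max(counts) > 2:
            return False
    return True


N = 3                      # assumed field size GF(2^3) for the demo
POLY = 0b1011              # x^3 + x + 1, irreducible over GF(2)
cube = [gf_pow(x, 3, N, POLY) for x in range(1 << N)]   # power function x^3
identity = list(range(1 << N))                          # linear map, not APN

print("x^3 is APN:", is_apn(cube))
print("identity is APN:", is_apn(identity))
```

The exhaustive check costs on the order of $2^{2n}$ table lookups, so it is only practical for small $n$.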
• Young-Joon KIM, Yun-Pyo HONG, Hong-Yeop SONG
Type: PAPER
Subject area: Nonlinear Problems
2008 Volume E91.A Issue 12 Pages 3679-3684
Published: December 01, 2008
Released: December 25, 2008
JOURNALS RESTRICTED ACCESS
We define new quaternary cyclotomic sequences of length 2p, where p is an odd prime, and compute their autocorrelation. In terms of magnitude, the autocorrelation of these sequences takes at most 4 values.
• Yun Kyoung HAN, Kyeongcheol YANG
Type: PAPER
Subject area: Sequence
2008 Volume E91.A Issue 12 Pages 3685-3690
Published: December 01, 2008
Released: December 25, 2008
JOURNALS RESTRICTED ACCESS
In this paper we introduce new M-ary sequences of length pq, called generalized M-ary related-prime sequences, where p and q are distinct odd primes, and M is a common divisor of p - 1 and q - 1. We show that their out-of-phase autocorrelation values are upper bounded by the maximum between q - p + 1 and 5. We also construct a family of generalized M-ary related-prime sequences and show that the maximum correlation of the proposed sequence family is upper bounded by p + q - 1.
• Zhengchun ZHOU, Zhen PAN, Xiaohu TANG
Type: PAPER
Subject area: Sequence
2008 Volume E91.A Issue 12 Pages 3691-3697
Published: December 01, 2008
Released: December 25, 2008
JOURNALS RESTRICTED ACCESS
In this paper, based on the interleaved technique, we present a new method of constructing zero correlation zone (ZCZ) sequence sets. For any perfect sequence of length m(2k + 1) with m > 2, k ≥ 0 and an arbitrary Hadamard matrix of order T > 2, the proposed construction can generate new optimal ZCZ sequence sets in which all the sequences are cyclically distinct.
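The two notions this abstract builds on can be made concrete with a short sketch. It assumes the usual definitions: a perfect sequence has zero periodic autocorrelation at every nonzero shift, and a set has a zero correlation zone of width Z when all autocorrelations vanish for 0 < |shift| ≤ Z and all cross-correlations vanish for |shift| ≤ Z. The function names and the example sequences are illustrative and do not reproduce the paper's construction.

```python
def periodic_correlation(a, b, shift):
    """Periodic correlation of equal-length sequences a and b at the given shift."""
    n = len(a)
    return sum(a[i] * b[(i + shift) % n].conjugate() for i in range(n))


def zcz_width(seqs, tol=1e-9):
    """Largest Z for which the set has a zero correlation zone of width Z.

    Returns 0 if the sequences are not even orthogonal at zero shift.
    """
    n = len(seqs[0])
    for i, a in enumerate(seqs):
        for j, b in enumerate(seqs):
            if i != j and abs(periodic_correlation(a, b, 0)) > tol:
                return 0
    width = 0
    for z in range(1, n):
        for a in seqs:
            for b in seqs:
                # check both the shift +z and the shift -z (taken modulo n)
                if (abs(periodic_correlation(a, b, z)) > tol
                        or abs(periodic_correlation(a, b, n - z)) > tol):
                    return width
        width = z
    return width


PERFECT = [1, 1, 1, -1]    # binary perfect sequence of length 4
SQUARE = [1, 1, -1, -1]    # autocorrelation vanishes only at shifts +1 and -1

print([periodic_correlation(PERFECT, PERFECT, t) for t in range(4)])
print(zcz_width([PERFECT]), zcz_width([SQUARE]))
```

A single perfect sequence trivially has the widest possible zone; the difficulty a construction has to address is keeping a wide zone as more sequences are added to the set.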
• Chenggao HAN, Takeshi HASHIMOTO, Naoki SUEHIRO
Type: PAPER
Subject area: Sequence
2008 Volume E91.A Issue 12 Pages 3698-3702
Published: December 01, 2008
Released: December 25, 2008
JOURNALS RESTRICTED ACCESS
In approximately synchronous CDMA (AS-CDMA) systems, zero correlation zone (ZCZ) sequences are known to eliminate co-channel and multi-path interference. Therefore, numerous constructions of ZCZ sequences have been introduced, based for example on perfect sequences and complete complementary codes. However, the previous construction method based on complete complementary codes falls short in merit figure when none of the sequence elements is zero. In this paper, a new construction method of ZCZ sequences based on complete complementary codes is proposed. With the proposed method, ZCZ sequences with no zero elements and a merit figure greater than 1/2 can be constructed.
• Chao ZHANG, Xiaoming TAO, Jianhua LU
Type: PAPER
Subject area: Sequence
2008 Volume E91.A Issue 12 Pages 3703-3711
Published: December 01, 2008
Released: December 25, 2008
JOURNALS RESTRICTED ACCESS
Zero Correlation Zone (ZCZ) sequences have been shown to mitigate interference in multipath fading channels. Orthogonal Variable Spreading Factor (OVSF) codes, on the other hand, have been successfully applied in WCDMA to separate channels with different transmission capacities. In this paper, novel OVSF-ZCZ sequences derived from LS and GO sequences are proposed for CDMA systems with different service requirements. The construction method is discussed and the performance of the system is evaluated.
• Xiaoming TAO, Chao ZHANG, Jianhua LU, Naoki SUEHIRO
Type: PAPER
2008 Volume E91.A Issue 12 Pages 3712-3722
Published: December 01, 2008
Released: December 25, 2008
JOURNALS RESTRICTED ACCESS
Orthogonal Signal Division Multiplexing (OSDM), also known as SD-OFDM, has been proposed for information transmission with high spectrum efficiency. In this paper, a new signal construction method named Adaptive Carrier Interferometry OSDM (ACI-OSDM) is proposed for time-frequency selective fading channels. In particular, the Adaptive CI codes derived from CI-OFDM are employed in the frequency domain of the OSDM signal. Compared with traditional OFDM, ACI-OSDM considerably improves broadband transmission performance, i.e., spectrum efficiency, Peak-to-Average Power Ratio (PAPR) mitigation and interference cancellation, in high-speed mobile environments with multipath emission, e.g. a super express train traveling at more than 250km/h.
• Kazuki CHIBA, Masanori HAMAMURA
Type: PAPER
2008 Volume E91.A Issue 12 Pages 3723-3730
Published: December 01, 2008
Released: December 25, 2008
JOURNALS RESTRICTED ACCESS
We propose multitone-hopping code-division multiple access (MH-CDMA) using a feedback-controlled hopping pattern (FCHP) (FCHP/MH-CDMA). In the FCHP/MH-CDMA, part of the filter coefficients of an adaptive finite-duration impulse response (FIR) filter receiver are fed back to a transmitter, in which they are used as an updated hopping pattern. Each chip of the updated hopping pattern consists of plural tones. As a result, it is shown that the FCHP/MH-CDMA provides us with an excellent asynchronous, decentralized multiple-access performance over time-invariant multipath channels.
• Peisheng WANG, Yuan LUO, A.J. Han VINCK
Type: PAPER
Subject area: Coding Theory
2008 Volume E91.A Issue 12 Pages 3731-3737
Published: December 01, 2008
Released: December 25, 2008
JOURNALS RESTRICTED ACCESS
The generalized Hamming weight has played an important role in coding theory. In the study of the wiretap channel of type II, the generalized Hamming weight was extended to a two-code format. Two equivalent concepts of the generalized Hamming weight hierarchy and its two-code format are the inverse dimension/length profile (IDLP) and the inverse relative dimension/length profile (IRDLP), respectively. In this paper, the Singleton upper bound on the IRDLP is improved by using a quotient subcode set and a subset with respect to a generator matrix, respectively. If these new upper bounds on the IRDLP are achieved, in the corresponding coordinated two-party wiretap channel of type II, the adversary cannot learn more from the illegitimate party.
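For readers unfamiliar with the underlying concept, the r-th generalized Hamming weight of a linear code is the smallest support size of any r-dimensional subcode. The sketch below computes the full weight hierarchy by exhaustive search for a small binary code; it illustrates only this basic definition, not the IDLP/IRDLP bounds of the paper. The [7,4] Hamming code generator and all function names are assumptions chosen for the example.

```python
from itertools import combinations, product


def codewords(generator):
    """All codewords spanned by binary generator rows given as bit masks."""
    words = set()
    for coeffs in product([0, 1], repeat=len(generator)):
        w = 0
        for c, row in zip(coeffs, generator):
            if c:
                w ^= row
        words.add(w)
    return words


def rank_gf2(vectors):
    """Rank over GF(2) of vectors given as integers."""
    basis = []  # reduced vectors with pairwise distinct leading bits
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)   # clears b's leading bit from v if it is set
        if v:
            basis.append(v)
    return len(basis)


def weight_hierarchy(generator):
    """Generalized Hamming weights d_1..d_k by brute force (small codes only)."""
    k = len(generator)
    nonzero = sorted(codewords(generator) - {0})
    hierarchy = []
    for r in range(1, k + 1):
        best = None
        for subset in combinations(nonzero, r):
            if rank_gf2(subset) < r:
                continue        # not a basis of an r-dimensional subcode
            support = 0
            for w in subset:
                support |= w    # support of a subcode = union of basis supports
            size = bin(support).count("1")
            if best is None or size < best:
                best = size
        hierarchy.append(best)
    return hierarchy


# Systematic generator matrix of the binary [7,4] Hamming code, one row per mask.
HAMMING_7_4 = [0b1000110, 0b0100101, 0b0010011, 0b0001111]

print(weight_hierarchy(HAMMING_7_4))
```

The first entry of the hierarchy is the ordinary minimum distance, and the last is the number of coordinates on which the code is not identically zero.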
• Ali ÖZEN, Ismail KAYA, Birol SOYSAL
Type: PAPER
Subject area: Channel Equalization
2008 Volume E91.A Issue 12 Pages 3738-3744
Published: December 01, 2008
Released: December 25, 2008
JOURNALS RESTRICTED ACCESS
Because the mobile communication channel changes over time, adaptive channel equalizers are needed to combat the distorting effects of the channel. The Least Mean Squares (LMS) algorithm is one of the most popular channel equalization algorithms and is preferred over other algorithms such as Recursive Least Squares (RLS) and Maximum Likelihood Sequence Estimation (MLSE) when simplicity is the dominant decision factor. However, the LMS algorithm suffers from poor performance and slow convergence within the training period specified by most standards. The aim of this study is to improve the convergence speed and performance of the LMS algorithm by adjusting the step size using fuzzy logic. The proposed method is compared with the Channel Matched Filter-Decision Feedback Equalizer (CMF-DFE) [1], which provides multipath propagation diversity by collecting the energy in the channel; the Minimum Mean Square Error-Decision Feedback Equalizer (MMSE-DFE) [2], which is one of the most successful equalizers for data packet transmission; normalized LMS-DFE (N-LMS-DFE) [3]; variable step size (VSS) LMS-DFE [4]; fuzzy LMS-DFE [5], [6]; and RLS-DFE [7]. Simulation results obtained using the HIPERLAN/1 standard demonstrate that the proposed fuzzy-logic-based LMS-DFE algorithm performs considerably better than the others.
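The role of the step size in LMS can be illustrated with a minimal sketch of the standard update w ← w + μ·e·u, where the step size μ is supplied by a callback so that it can vary from sample to sample. This is not the paper's fuzzy controller or its decision feedback equalizer: the three-tap channel, the noise-free identification setup, the two-level step rule and all names are assumptions made for the example.

```python
import random


def lms_identify(x, d, num_taps, step_size):
    """Run LMS on input x and desired signal d; return (weights, error history).

    `step_size(error, k)` returns the step size mu used at sample k.
    """
    w = [0.0] * num_taps
    errors = []
    for k in range(len(x)):
        # regressor: the most recent num_taps inputs, zero-padded at the start
        u = [x[k - i] if k - i >= 0 else 0.0 for i in range(num_taps)]
        y = sum(wi * ui for wi, ui in zip(w, u))
        e = d[k] - y
        mu = step_size(e, k)
        w = [wi + mu * e * ui for wi, ui in zip(w, u)]
        errors.append(e)
    return w, errors


CHANNEL = [0.8, -0.4, 0.2]            # hypothetical FIR channel to identify
rng = random.Random(0)                # fixed seed for a repeatable demo
x = [rng.choice([-1.0, 1.0]) for _ in range(1000)]
d = [sum(h * x[k - i] for i, h in enumerate(CHANNEL) if k - i >= 0)
     for k in range(len(x))]

# Fixed step size versus a crude two-level rule: large steps while the error is
# large, small steps afterwards. The rule is a placeholder, not fuzzy logic.
w_fixed, err_fixed = lms_identify(x, d, 3, lambda e, k: 0.1)
w_var, err_var = lms_identify(x, d, 3, lambda e, k: 0.3 if abs(e) > 0.1 else 0.05)

print([round(w, 4) for w in w_fixed])
print([round(w, 4) for w in w_var])
```

Any rule that maps the current error to a step size, including a fuzzy inference system, could be passed in place of the placeholder lambda.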
• Satoshi UEHARA, Shuichi JONO, Yasuyuki NOGAMI
Type: LETTER
Subject area: Sequence
2008 Volume E91.A Issue 12 Pages 3745-3748
Published: December 01, 2008
Released: December 25, 2008
JOURNALS RESTRICTED ACCESS
A class of zero-correlation zone (ZCZ) sequences constructed by a recursive procedure from a perfect sequence and a unitary matrix was proposed by Torii, Nakamura, and Suehiro [1]. In reference [1], three parameters, namely the sequence length, the family size and the length of the ZCZ, were evaluated as a general estimate of the performance of the ZCZ sequences. In this letter, we give a more detailed distribution of the zero correlation values on their ZCZ sequence sets.
Regular Section
• Myun Joong HWANG, Doo Yong LEE, Seong Youb CHUNG
Type: PAPER
Subject area: Systems and Control
2008 Volume E91.A Issue 12 Pages 3749-3756
Published: December 01, 2008
Released: December 25, 2008
JOURNALS RESTRICTED ACCESS
This paper presents a motion planning method for a bimanual robot executing assembly tasks. The method employs adaptive modeling that can automatically generate an assembly model and modify the model during actual assembly. Bimanual robotic assembly is modeled at the task level using contact states of workpieces and their transitions. The lower-level velocity commands of the workpieces are automatically derived by solving an optimization problem formulated with assembly constraints, the positions of the workpieces, and the kinematics of the manipulators. Motion requirements of the workpieces are transformed into motion commands of the bimanual robot. The proposed approach is evaluated with experiments on peg-in-hole assembly with an L-shaped peg.
• Yuichi TANJI, Hideki ASAI, Masayoshi ODA, Yoshifumi NISHIO, Akio USHID ...
Type: PAPER
Subject area: Nonlinear Problems
2008 Volume E91.A Issue 12 Pages 3757-3762
Published: December 01, 2008
Released: December 25, 2008
JOURNALS RESTRICTED ACCESS
A fast time-domain simulation technique for plane circuits via two-layer Cellular Neural Network (CNN)-based modeling, which is necessary for power/signal integrity evaluation in VLSIs, printed circuit boards, and packages, is presented. Using the new notation expressed by the two-layer CNN, 1,553 times faster simulation is achieved compared with Berkeley SPICE (ngspice). In the CNN community, CNNs are generally simulated by explicit numerical integration such as the forward Euler and Runge-Kutta methods. However, since the two-layer CNN is a stiff circuit, we cannot analyze it by using an explicit numerical integration method. Hence, to analyze the two-layer CNN and reduce the computational cost, the leapfrog method is introduced. This procedure would open up an application of CNNs to the electronic design automation area.
• Metin SENGÜL, Siddik B. YARMAN
Type: PAPER
Subject area: Circuit Theory
2008 Volume E91.A Issue 12 Pages 3763-3771
Published: December 01, 2008
Released: December 25, 2008
JOURNALS RESTRICTED ACCESS
In this paper, an alternative approach is presented for designing equalizers (or matching networks) with commensurate (or equal-length) transmission lines. The new method automatically yields the matching network topology with the characteristic impedances of the commensurate lines. In implementing the new technique, the driving point impedance data of the matching network is first generated by tracing a pre-selected transducer power gain shape, without optimization. Then, it is modelled as a realizable bounded-real input reflection coefficient in the Richards domain, which in turn yields the desired equalizer topology with line characteristic impedances. This process results in an excellent initial design for commercially available computer aided design (CAD) packages to generate the final circuit layout for fabrication. An example is given to illustrate the use of the new method. It is expected that the proposed design technique will be employed as a front-end to commercially available CAD packages, which generate the actual equalizer circuit layout with physical dimensions for mass production.
• Koji OBATA, Kazuyoshi TAKAGI, Naofumi TAKAGI
Type: PAPER
Subject area: VLSI Design Technology and CAD
2008 Volume E91.A Issue 12 Pages 3772-3782
Published: December 01, 2008
Released: December 25, 2008
JOURNALS RESTRICTED ACCESS
An algorithm for clock scheduling of concurrent-flow clocking rapid single-flux-quantum (RSFQ) digital circuits is proposed. RSFQ circuit technology is an emerging digital circuit technology. In concurrent-flow clocking RSFQ digital circuits, all logic gates are driven by clock pulses. Appropriate clock scheduling raises the clock frequency of the circuits. Given a clock period, the proposed algorithm determines the arrival time of clock pulses and the delay that should be inserted. Experimental results show that, on average, the proposed algorithm inserts 59.0% fewer delay elements and produces clock trees 40.4% shorter in height than a straightforward algorithm. The proposed algorithm can also be used to minimize the clock period, thus obtaining 19.0% shorter clock periods on average.
• Yanming JIA, Yici CAI, Xianlong HONG
Type: PAPER
Subject area: VLSI Design Technology and CAD
2008 Volume E91.A Issue 12 Pages 3783-3792
Published: December 01, 2008
Released: December 25, 2008
JOURNALS RESTRICTED ACCESS
This paper studies the impact of dummy fill for chemical mechanical polishing (CMP)-induced capacitance variation on buffer insertion, based on a virtual CMP fill estimation model. Compared with existing methods, our algorithm is more feasible because it performs buffer insertion not as a post-process but during early physical design. Our contributions are threefold. First, we introduce an improved fast dummy fill amount estimation algorithm based on [4], and use some speedup techniques (tile merging, fill factor and amount assigning) for early estimation. Second, based on some reasonable assumptions, we present an optimum virtual dummy fill method to estimate dummy position and the effect on the interconnect capacitance. The dummy fill estimation model is then verified by our experiments. Third, we use this model in early buffer insertion after layer assignment, considering the effects of dummy fill. Experimental results verify the necessity of early dummy fill estimation and the validity of our algorithm. Buffer insertion considering dummy fill during early physical design is necessary and our algorithm is promising.
• Shigeru YAMASHITA, Shin-ichi MINATO, D. Michael MILLER
Type: PAPER
Subject area: VLSI Design Technology and CAD
2008 Volume E91.A Issue 12 Pages 3793-3802
Published: December 01, 2008
Released: December 25, 2008
JOURNALS RESTRICTED ACCESS
Recently much attention has been paid to quantum circuit design to prepare for the future “quantum computation era.” As in conventional logic synthesis, it is important to verify and analyze the functionalities of generated quantum circuits. For that purpose, we propose an efficient verification method for quantum circuits under a practical restriction. Thanks to the restriction, we can introduce an efficient verification scheme based on decision diagrams called Decision Diagrams for Matrix Functions (DDMFs). Then, we show analytically the advantages of our approach based on DDMFs over the previous verification techniques. In order to introduce DDMFs, we also introduce new concepts, quantum functions and matrix functions, which may also be interesting and useful on their own for designing quantum circuits.
• Yosuke TAKAHASHI, Yukihide KOHIRA, Atsushi TAKAHASHI
Type: PAPER
Subject area: VLSI Design Technology and CAD
2008 Volume E91.A Issue 12 Pages 3803-3811
Published: December 01, 2008
Released: December 25, 2008
JOURNALS RESTRICTED ACCESS
Reducing the peak power consumption of an LSI is required to reduce the instability of gate operation, the delay increase, the noise, etc. It is possible to reduce the peak power consumption by clock scheduling because it controls the switching timings of registers and combinational logic elements. In this paper, we propose a fast peak power wave estimation method for clock scheduling and fast clock scheduling methods for peak power reduction. Experiments show that the peak power wave estimated by the proposed method in a few seconds is highly correlated with the peak power wave obtained by HSPICE simulation in several days. Using the proposed peak power wave estimation method, the proposed clock scheduling methods find, in a few minutes, clock schedules that greatly reduce the peak power consumption.