IPSJ Transactions on System and LSI Design Methodology

Welcome to TSLDM — A New Open-Access Online Journal from IPSJ

Hidetoshi Onodera

Article type: Editorial
Subject area: Editorial
2008 Volume 1 Pages 1
Published: 2008
Released on J-STAGE: August 27, 2008

DOI: https://doi.org/10.2197/ipsjtsldm.1.1

JOURNAL FREE ACCESS

Trends in Emerging On-Chip Interconnect Technologies

Sudeep Pasricha, Nikil Dutt

Article type: Invited Papers
Subject area: Invited Paper
2008 Volume 1 Pages 2-17
Published: 2008
Released on J-STAGE: August 27, 2008

DOI: https://doi.org/10.2197/ipsjtsldm.1.2

JOURNAL FREE ACCESS

In deep submicron (DSM) VLSI technologies, it is becoming increasingly hard for a copper-based electrical interconnect fabric to satisfy the multiple design requirements of delay, power, bandwidth, and delay uncertainty. This is because electrical interconnects are becoming increasingly susceptible to parasitic resistance and capacitance with shrinking process technology and rising clock frequencies, which poses serious challenges for interconnect delay, power dissipation and reliability. On-chip communication architectures such as buses and networks-on-chip (NoC) that are used to enable inter-component communication in multi-processor systems-on-chip (MPSoC) designs rely on these electrical interconnects at the physical level, and are consequently faced with the entire gamut of challenges and drawbacks that plague copper-based electrical interconnects. To overcome the limitations of traditional copper-based electrical interconnects, several research efforts have begun looking at novel interconnect alternatives, such as on-chip optical interconnects, wireless interconnects and carbon nanotube-based interconnects. This paper presents an overview and the current state of research for these three promising interconnect technologies. We also discuss the existing challenges for each of these technologies that remain to be resolved before they can be adopted as replacements for copper-based electrical interconnects in the future.

Variability and Statistical Design

Sachin S. Sapatnekar

Article type: Invited Papers
Subject area: Invited Paper
2008 Volume 1 Pages 18-32
Published: 2008
Released on J-STAGE: August 27, 2008

DOI: https://doi.org/10.2197/ipsjtsldm.1.18

JOURNAL FREE ACCESS

With each technology generation, the effects of on-chip variations are seen to more profoundly affect digital circuit behavior. These variations may arise from fluctuations attributed to the manufacturing process (e.g., drifts in channel length, oxide thickness, threshold voltage, or doping concentration), which affect the circuit yield, as well as variations in the environmental operating conditions (e.g., supply voltage or temperature) after the circuit is manufactured, which impact the performance of the design. These effects can cause unacceptable alterations in circuit performance parameters such as timing and power, and variation-tolerant design is imperative for next-generation designs. This paper overviews research in this area, describing methods for the analysis and optimization of statistical effects.

A Simulation-Based Analysis for Worst Case Delay of Single and Multiple Interruptions

Hiroshi Nakashima, Masahiro Konishi, Takashi Nakada

Article type: System-Level Performance Analysis
Subject area: Regular Paper
2008 Volume 1 Pages 33-47
Published: 2008
Released on J-STAGE: August 27, 2008

DOI: https://doi.org/10.2197/ipsjtsldm.1.33

JOURNAL FREE ACCESS

This paper proposes an efficient method to analyze the worst case interruption delay (WCID) of a workload running on modern microprocessors using a cycle accurate simulator (CAS). Our method is highly accurate because it simulates all possible cases, inserting an interruption just before the retirement of every instruction executed in a workload. It is also (reasonably) efficient because it takes O(N log N) time for a workload with N executed instructions, instead of the O(N²) of a straightforward iterative simulation of interrupted executions. The key idea behind this efficiency is that a pair of executions with different interruption points has a set of durations in which they behave in exactly the same way, and thus one of the simulations for those durations may be omitted. We implemented this method by modifying the SimpleScalar tool set to demonstrate that it finds the WCID of workloads with five million executed instructions in a reasonable time, less than 30 minutes, where the straightforward method would take 200-300 days. Furthermore, our CAS-based analyzer may have a post process to calculate the WCID for F multiple interrupts with O(FN√N log N) time complexity.

Dynamic Power Management for Embedded System Idle State in the Presence of Periodic Interrupt Services

Gang Zeng, Hiroyuki Tomiyama, Hiroaki Takada

Article type: System-Level Low-Power Design
Subject area: Regular Paper
2008 Volume 1 Pages 48-57
Published: 2008
Released on J-STAGE: August 27, 2008

DOI: https://doi.org/10.2197/ipsjtsldm.1.48

JOURNAL FREE ACCESS

Generally, real-time embedded systems have periodic interrupt services, such as periodic clock tick interrupts, even when the system is in the idle state. To minimize the power consumption of the idle state, power management should therefore consider the effect of periodic interrupt services. In this paper, we deal with this issue in two different cases. In case the periodic interrupt cannot be disabled, we formulate the power consumption of the idle state, and propose static and dynamic approaches for the optimal frequency selection to save idle power. On the other hand, in case the periodic interrupt can be disabled, we propose the configurable clock tick to disable the interrupt service until the next task is released so that the processor can stay in the low power mode for a longer time. The proposed approaches are implemented in a real-time OS, and their efficiency has been validated by theoretical calculations and actual measurements on an embedded processor.

A Study of Multi-core Processor Design with Asynchronous Interconnect Using Synchronous Design Tools

Katsunori Tanaka, Yuichi Nakamura, Atsushi Atarashi

Article type: System-Level Asynchronous Design
Subject area: Regular Paper
2008 Volume 1 Pages 58-66
Published: 2008
Released on J-STAGE: August 27, 2008

DOI: https://doi.org/10.2197/ipsjtsldm.1.58

JOURNAL FREE ACCESS

This paper presents a study of GALS (Globally-Asynchronous Locally-Synchronous) architecture multi-core processor design with asynchronous interconnects. While GALS is expected to reduce power dissipation further, it has not yet become mainstream in LSI design, since there have been no mature design tools for asynchronous circuit design. For GALS design, we constructed a design flow based on general synchronous design tools, by specification of design constraints and configurations. Applying the design flow to an experimental multi-core processor GALS design including an asynchronous interconnect based on the QDI (Quasi Delay Insensitive) model, we successfully obtained a netlist and layout, and proved that the flow works correctly by netlist simulation with delay information back-annotated from the layout. Experimental results show the area, power and throughput of the asynchronous interconnect, indicating the impact of introducing the GALS architecture instead of a globally synchronous design.

A Synthesis Method of General Floating-Point Arithmetic Units by Aligned Partition

Liangwei Ge, Song Chen, Yuichi Nakamura, Takeshi Yoshimura

Article type: Arithmetic Synthesis
Subject area: Regular Paper
2008 Volume 1 Pages 67-77
Published: 2008
Released on J-STAGE: August 27, 2008

DOI: https://doi.org/10.2197/ipsjtsldm.1.67

JOURNAL FREE ACCESS

Since many embedded applications involve intensive mathematical operations, floating-point arithmetic units (FPUs) are of paramount importance in embedded systems. However, previous implementations of FPUs either require much manual work or only support special functions (e.g. reciprocal, square root, logarithm, etc.). In this paper, we present an automatic method to synthesize general FPUs by aligned partition. Based on the novel partition algorithm and the concept of grouping floating-point numbers into zones, our method supports general functions of wide, irreducible domain. The synthesized FPU achieves smaller area, higher frequency, and greater accuracy. Experimental results show that our method achieves 1) on average 90% smaller and 50% faster indexer than the conventional automatic method; 2) on the hyperbolic functions, 20k times smaller error rate and 50% use of LUTs and flip-flops than the conventional manual design.

Floorplan-Driven High-Level Synthesis for Distributed/Shared-Register Architectures

Akira Ohchi, Shunitsu Kohara, Nozomu Togawa, Masao Yanagisawa, Tatsuo ...

Article type: Behavioral Synthesis
Subject area: Regular Paper
2008 Volume 1 Pages 78-90
Published: 2008
Released on J-STAGE: August 27, 2008

DOI: https://doi.org/10.2197/ipsjtsldm.1.78

JOURNAL FREE ACCESS

In this paper, we propose a high-level synthesis method targeting distributed/shared-register architectures. Our method repeats (1) scheduling/FU binding, (2) register allocation, (3) register binding, and (4) module placement. By feeding back floorplan information from (4) to (1), our method obtains a distributed/shared-register architecture in which scheduling/binding and floorplanning are simultaneously optimized. Experimental results show that the area is decreased by 13.2% while maintaining circuit performance equal to that obtained using distributed-register architectures.

Two-Stage Stuck-at Fault Test Data Compression Using Scan Flip-Flops with Delay Fault Testability

Kentaroh Katoh, Kazuteru Namba, Hideo Ito

Article type: Delay Testing
Subject area: Regular Paper
2008 Volume 1 Pages 91-103
Published: 2008
Released on J-STAGE: August 27, 2008

DOI: https://doi.org/10.2197/ipsjtsldm.1.91

JOURNAL FREE ACCESS

This paper presents a stuck-at fault test data compression method using scan flip-flops with delay fault testability, namely the Chiba scan flip-flops. The feature of the proposed method is two-stage test data compression. First, test data is compressed utilizing the structure of the Chiba scan flip-flops (the first stage compression). Second, the compressed test data is further compressed by conventional test data compression utilizing X bits (the second stage compression). Evaluation shows that when Huffman test data compression is used in the second stage compression, the volume of test data for the proposed test data compression in ATE is reduced 35.8% in maximum, 25.7% on average of the one of the test data compressed by the conventional method. The difference in area overhead between the proposed method and the conventional method is 9.5 percentage points.

Estimation of Delay Test Quality and Its Application to Test Generation

Seiji Kajihara, Shohei Morishima, Masahiro Yamamoto, Xiaoqing Wen, Mas ...

Article type: Delay Testing
Subject area: Regular Paper
2008 Volume 1 Pages 104-115
Published: 2008
Released on J-STAGE: August 27, 2008

DOI: https://doi.org/10.2197/ipsjtsldm.1.104

JOURNAL FREE ACCESS

As a method to evaluate delay test quality of test patterns, SDQM (Statistical Delay Quality Model) has been proposed for transition faults. In order to derive better test quality by SDQM, the following two things are important: for each transition fault, (1) to find out the accurate length of the longest sensitizable paths along which the fault is activated and propagated, and (2) to generate a test pattern that detects the fault through as long paths as possible. In this paper, we propose a method to calculate the length of the potentially sensitizable longest path for detection of a transition fault. In addition, we develop a procedure to extract path information that helps high quality transition ATPG. Experimental results show that the proposed method improves SDQL (Statistical Delay Quality Level) by not only accurate calculation of the longest sensitizable paths but also detection of faults through longer paths.

Accurate Estimation of the Worst-case Delay in Statistical Static Timing Analysis

Haruhiko Terada, Takayuki Fukuoka, Akira Tsuchiya, Hidetoshi Onodera

Article type: Statistical Static Timing Analysis
Subject area: Regular Paper
2008 Volume 1 Pages 116-125
Published: 2008
Released on J-STAGE: August 27, 2008

DOI: https://doi.org/10.2197/ipsjtsldm.1.116

JOURNAL FREE ACCESS

In this paper, we propose an approximation method for the statistical MAX operation such that it results in a normal distribution suited to worst-case delay analysis. The important operations in SSTA are the SUM and MAX of distributions. In general, the delay variation is modeled as a normal distribution. The result of the SUM operation on two normal distributions is also a normal distribution. On the other hand, the result of the MAX operation is not a normal distribution. Thus approximation to a normal distribution is commonly used. We also explain that the proposed MAX operation at each gate contributes to accurate estimation in the worst-case delay analysis of the whole circuit. Experimental results show that the proposed method leads to a good approximation for a normal distribution resulting from the MAX operation of normal distributions with and without correlation, and that the approximation improves the accuracy of the worst-case delay analysis. In a circuit example, the errors of the worst-case delay computed by the previous method are about 20%, and the errors computed by the proposed method are under 5%.
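
As a rough illustration of the SUM and MAX operations mentioned in this abstract, the sketch below uses the classical moment-matching approximation usually attributed to Clark, which is a commonly used baseline and not the method proposed in the paper. The function names `sum_normal` and `max_normal_clark` and the example numbers are assumptions made for illustration only.

```python
import math


def phi(x):
    """Standard normal probability density function."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def sum_normal(mu1, s1, mu2, s2, rho=0.0):
    """SUM of two jointly normal delays: the result is again normal."""
    var = s1 * s1 + s2 * s2 + 2.0 * rho * s1 * s2
    return mu1 + mu2, math.sqrt(max(var, 0.0))


def max_normal_clark(mu1, s1, mu2, s2, rho=0.0):
    """MAX of two jointly normal delays, approximated by a normal
    distribution whose mean and variance match those of the true MAX
    (classical moment matching; an assumed baseline, not the paper's method).
    """
    a2 = s1 * s1 + s2 * s2 - 2.0 * rho * s1 * s2
    if a2 <= 1e-18:
        # The difference of the two delays is constant, so the MAX is
        # simply the delay with the larger mean.
        return (mu1, s1) if mu1 >= mu2 else (mu2, s2)
    a = math.sqrt(a2)
    alpha = (mu1 - mu2) / a
    # First and second moments of max(X, Y).
    m1 = mu1 * Phi(alpha) + mu2 * Phi(-alpha) + a * phi(alpha)
    m2 = ((mu1 * mu1 + s1 * s1) * Phi(alpha)
          + (mu2 * mu2 + s2 * s2) * Phi(-alpha)
          + (mu1 + mu2) * a * phi(alpha))
    return m1, math.sqrt(max(m2 - m1 * m1, 0.0))


if __name__ == "__main__":
    # Two hypothetical path delays (mean, sigma) with correlation 0.5.
    print(sum_normal(100.0, 5.0, 20.0, 2.0, rho=0.5))
    print(max_normal_clark(100.0, 5.0, 98.0, 8.0, rho=0.5))
```

The matched normal reproduces the mean and variance of the true MAX but not its skew, which is the kind of error in the distribution tail that matters for worst-case delay estimates.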

Embedded System Covalidation with RTOS Model and FPGA

Seiya Shibata, Shinya Honda, Yuko Hara, Hiroyuki Tomiyama, Hiroaki Tak ...

Article type: System-Level Verification
Subject area: Short Paper
2008 Volume 1 Pages 126-130
Published: 2008
Released on J-STAGE: August 27, 2008

DOI: https://doi.org/10.2197/ipsjtsldm.1.126

JOURNAL FREE ACCESS

This paper presents a software/hardware covalidation environment for embedded systems. Our covalidation environment consists of a simulation model of an RTOS which fully supports the services of ITRON, multiple hardware simulators, an FPGA and a covalidation backplane. All of the simulators are executed concurrently with communication. The RTOS model can be executed natively on the host computer; therefore, the software can be simulated much faster than on an instruction set simulator. The FPGA can execute the hardware much faster than HDL simulators. With the RTOS model and FPGA, both application software and hardware can be validated in a short time. In the experiment, using our covalidation environment, we perform covalidation of an MPEG4 decoder system and show the effectiveness of the covalidation environment.
