IEICE Transactions on Electronics
Online ISSN : 1745-1353
Print ISSN : 0916-8524
Volume E95.C, Issue 4
Special Section on Solid-State Circuit Design - Architecture, Circuit, Device and Design Methodology
  • Masahiko YOSHIMOTO
    2012 Volume E95.C Issue 4 Pages 413
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
  • Kiyoshi TAKEUCHI
    Article type: INVITED PAPER
    2012 Volume E95.C Issue 4 Pages 414-420
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    As MOS transistors are scaled down, the impact of randomly placed discrete charges (impurity atoms, traps and surface states) on device characteristics rapidly increases. Significant variability caused by random dopant fluctuation (RDF) is a direct result of this and urges the adoption of new device architectures (ultra-thin body SOI FETs and FinFETs) that do not use impurities for body doping. Variability caused by traps and surface states, such as random telegraph noise (RTN), though less significant than RDF today, will soon be a major problem. The increased complexity of such residual-charge-induced variability, due to its non-Gaussian and time-dependent behavior, will necessitate new approaches for variation-aware design.
  • Shiro DOSHO
    Article type: INVITED PAPER
    2012 Volume E95.C Issue 4 Pages 421-431
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    Along with the miniaturization of CMOS LSIs, control methods for LSIs have been extensively developed. The predominant method is to digitize observed values as early as possible and to use digital control. Thus, many types of analog-to-digital converters (ADCs) have been developed, such as temperature, time, delay, and frequency converters. ADCs are the easiest circuits into which digital correction methods can be introduced because their outputs are digital. Various calibration methods have been developed, which have markedly improved figures of merit by relaxing the margins required for device variations. These calibration and correction methods not only overcome a circuit's weak points but also give us the chance to develop entirely new circuit topologies and systems. In this paper, several digital calibration and correction methods are described for major analog-to-digital converters, such as pipelined ADCs, delta-sigma ADCs, and successive approximation ADCs.
  • Koyo NITTA, Hiroe IWASAKI, Takayuki ONISHI, Takashi SANO, Atsushi SAGA ...
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 432-440
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    An H.264/AVC encoder LSI (named “SARA”) that supports the High422 profile, as well as the 422 profile of MPEG-2, has been developed for HDTV broadcasting infrastructures. It contains three motion estimation and compensation (ME/MC) engines with wide search ranges of -217.75 to +199.75 horizontally and -109.75 to +145.75 vertically, which can utilize almost all H.264/AVC ME/MC coding tools, such as multiple reference frames, variable block size, quarter-pel prediction, macroblock adaptive field/frame prediction (MBAFF), spatial/temporal direct mode, and weighted prediction. Our evaluations show that it encodes fast-moving scenes 1.2 dB to 1.7 dB better than the JM reference software. It was successfully fabricated in a 90-nm technology and integrates 140 million transistors.
  • Weiwei SHEN, Yibo FAN, Xiaoyang ZENG
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 441-446
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    In this paper, a high-throughput deblocking filter is presented for the H.264/AVC standard, catering to video applications with 4K × 2K (4096 × 2304) ultra-high-definition resolution. To increase parallelism without simply increasing the area, we propose a luma-chroma parallel method. This work also reduces the number of processing cycles, the amount of external memory traffic, and the working frequency by using triple four-stage pipeline filters and a luma-chroma interlaced sequence. Furthermore, it eliminates most unnecessary off-chip memory bandwidth with a highly reusable memory scheme and adopts a “slide window” buffer scheme. As a result, our design can support 4K × 2K at 30 fps at a working frequency of only 70.8 MHz.
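As a rough sanity check on the figures quoted in this abstract, the sketch below derives the cycle budget per macroblock implied by 4096 × 2304 at 30 fps and a 70.8 MHz clock. It assumes the standard 16 × 16 H.264 macroblock; the function name and the resulting budget are illustrative and are not stated by the authors.

```python
# Cycle budget per macroblock implied by the abstract's figures.
# Assumption: 16x16 macroblocks, as defined by H.264/AVC.
MB_SIZE = 16


def cycles_per_macroblock(width, height, fps, clock_hz):
    """Return (macroblocks per frame, clock cycles available per macroblock)."""
    mbs_per_frame = (width // MB_SIZE) * (height // MB_SIZE)
    mbs_per_second = mbs_per_frame * fps
    return mbs_per_frame, clock_hz / mbs_per_second


mbs_per_frame, cycles = cycles_per_macroblock(4096, 2304, 30, 70.8e6)
print(mbs_per_frame, round(cycles, 2))  # 36864 64.02
```

Under these assumptions the filter has roughly 64 clock cycles to process each macroblock, which gives a sense of how much parallelism the pipeline must sustain.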
  • Yibo FAN, Jialiang LIU, Dexue ZHANG, Xiaoyang ZENG, Xinhua CHEN
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 447-455
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    Fidelity Range Extension (FRExt) (i.e., High Profile) was added to the H.264/AVC recommendation in the second version. One of the features included in FRExt is the Adaptive Block-size Transform (ABT). In order to conform to FRExt, a Fractional Motion Estimation (FME) architecture is proposed to support the 8×8/4×4 adaptive Hadamard Transform (8×8/4×4 AHT). The 8×8/4×4 AHT circuit contributes to higher throughput and encoding performance. In order to increase the utilization of the SATD (Sum of Absolute Transformed Difference) Generator (SG) per unit time, the proposed architecture employs two 8-pel interpolators (IPs) that time-share one SG. These two IPs work in turn to provide data continuously to the SG, which increases the data throughput and significantly reduces the cycles needed to process one macroblock. Furthermore, this architecture also exploits the linearity of the Hadamard Transform to generate the quarter-pel SATD. This method helps to shorten the long datapath in the second step of the two-iteration FME algorithm. Finally, experimental results show that this architecture can be used in applications with different performance requirements by adjusting the supported modes and operating frequency. It can support real-time encoding of seven-mode 4K×2K@24fps or six-mode 4K×2K@30fps video sequences.
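The linearity that this abstract relies on can be illustrated with a minimal sketch: the Hadamard transform of a sum of two blocks equals the sum of their transforms, so transformed half-pel data can in principle be combined to obtain quarter-pel results. The 4×4 matrix, helper names, and sample blocks below are illustrative assumptions; the sketch ignores the rounding of real H.264 interpolation and does not reproduce the authors' datapath.

```python
# 4x4 Hadamard matrix (symmetric, so H4 equals its own transpose).
H4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def matmul(a, b):
    """Multiply two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def hadamard4x4(block):
    """2-D Hadamard transform H4 * block * H4."""
    return matmul(matmul(H4, block), H4)


def satd(block):
    """Sum of absolute transformed differences of a 4x4 residual block."""
    return sum(abs(v) for row in hadamard4x4(block) for v in row)


def add(a, b):
    """Element-wise sum of two 4x4 blocks."""
    return [[a[i][j] + b[i][j] for j in range(4)] for i in range(4)]


# Hypothetical residual blocks used only to demonstrate linearity.
res_a = [[3, -1, 0, 2], [1, 4, -2, 0], [0, 0, 5, -3], [2, 1, -1, 1]]
res_b = [[-2, 0, 1, 1], [0, -3, 2, 4], [1, 1, 0, -1], [3, -2, 0, 2]]

# Transform of the sum equals the sum of the transforms.
linear = hadamard4x4(add(res_a, res_b)) == add(hadamard4x4(res_a), hadamard4x4(res_b))
print(linear)  # True
```

Note that SATD itself is not linear because of the absolute value; only the transform step is, which is why the combination has to happen before the absolute sum.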
  • Kazuhiro NAKAMURA, Ryo SHIMAZAKI, Masatoshi YAMAMOTO, Kazuyoshi TAKAGI ...
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 456-467
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    This paper presents a memory-efficient VLSI architecture for output probability computations (OPCs) of continuous hidden Markov models (HMMs) and likelihood score computations (LSCs). These computations are the most time consuming part of HMM-based isolated word recognition systems. We demonstrate multiple fast store-based block parallel processing (MultipleFastStoreBPP) for OPCs and LSCs and present a VLSI architecture that supports it. Compared with conventional fast store-based block parallel processing (FastStoreBPP) and stream-based block parallel processing (StreamBPP) architectures, the proposed architecture requires fewer registers and less processing time. The processing elements (PEs) used in the FastStoreBPP and StreamBPP architectures are identical to those used in the MultipleFastStoreBPP architecture. From a VLSI architectural viewpoint, a comparison shows that the proposed architecture is an improvement over the others, through efficient use of PEs and registers for storing input feature vectors.
  • Mitsuru SHIOZAKI, Kota FURUHASHI, Takahiko MURAYAMA, Akitaka FUKUSHIMA ...
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 468-477
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    Silicon Physical Unclonable Functions (PUFs) have been proposed to exploit inherent characteristics caused by process variations, such as transistor size and threshold voltage, and to provide inexpensive, tamper-resistant devices for IC identification, authentication, and key generation. We have focused on the arbiter-PUF, which utilizes the relative delay-time difference between equivalent paths. The conventional arbiter-PUF suffers from low uniqueness caused by non-uniformity in response generation. To enhance the uniqueness, a novel arbiter-based PUF utilizing the Response Generation according to the Delay Time Measurement (RG-DTM) scheme has been proposed. In the conventional arbiter-PUF, the response 0 or 1 is assigned according to a single threshold of the relative delay-time difference. In the RG-DTM PUF, by contrast, the response 0 or 1 is assigned according to multiple thresholds of the relative delay-time difference. The conventional and RG-DTM PUFs were designed and fabricated in 0.18 µm CMOS technology. The Hamming distances (HDs) between different chips, which indicate the uniqueness, were calculated from 256-bit responses to identical challenges on each chip. The ideal distribution of HDs, which indicates high uniqueness, is achieved in the RG-DTM PUF using 16 thresholds of relative delay-time differences. The generative stability, which is the fluctuation of responses in the same environment, and the environmental stability, which is the change in responses across different environments, were also evaluated. Although there is a trade-off between high uniqueness and high stability, the experimental data show that the RG-DTM PUF has a far smaller false-matching probability in identification than the conventional PUF.
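The uniqueness metric described in this abstract, the Hamming distance between responses of different chips to the same challenges, can be sketched as follows. The 8-bit responses are hypothetical values chosen for brevity (the paper uses 256-bit responses), and the function names are illustrative.

```python
from itertools import combinations


def hamming_distance(a, b):
    """Number of differing bits between two responses stored as integers."""
    return bin(a ^ b).count("1")


def inter_chip_distances(responses):
    """Hamming distances for every pair of chips."""
    return [hamming_distance(a, b) for a, b in combinations(responses, 2)]


# Hypothetical 8-bit responses from three chips to identical challenges.
responses = [0b10110010, 0b01110001, 0b10011100]
distances = inter_chip_distances(responses)
print(distances)  # [4, 4, 6]
```

For unbiased, independent response bits the distances cluster around half the response length, which is the sense in which a distribution centered at 50% indicates high uniqueness.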
  • Changsheng ZHOU, Yuebin HUANG, Shuangqu HUANG, Yun CHEN, Xiaoyang ZENG
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 478-486
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    Based on the Turbo-Decoding Message-Passing (TDMP) and Normalized Min-Sum (NMS) algorithms, an area-efficient LDPC decoder that supports both structured and unstructured LDPC codes is proposed in this paper. We introduce a solution to the memory access conflict problem caused by the TDMP algorithm. We also arrange the main timing schedule carefully to handle the operations of our solution while avoiding significant additional hardware cost. To reduce the number of memory bits needed, the extrinsic message storage strategy is also optimized. In addition, the extrinsic message recovery and accumulation operations are merged. To verify our architecture, an LDPC decoder that supports both the China Multimedia Mobile Broadcasting (CMMB) and Digital Terrestrial/Television Multimedia Broadcasting (DTMB) standards is developed using the SMIC 0.13 µm standard CMOS process. The core area is 4.75 mm² and the maximum operating clock frequency is 200 MHz. The estimated power consumption is 48.4 mW at 25 MHz for CMMB and 130.9 mW at 50 MHz for DTMB with 5 iterations and a 1.2 V supply.
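For readers unfamiliar with the Normalized Min-Sum algorithm named in this abstract, a minimal sketch of its check-node update follows. Each outgoing message takes the product of the signs and the minimum magnitude of the other incoming messages, scaled by a normalization factor. The factor `alpha = 0.75` and the function name are assumptions for illustration; the abstract does not give the value used in the decoder, and this sketch does not model the TDMP schedule or the memory architecture.

```python
def nms_check_node(llrs, alpha=0.75):
    """Normalized min-sum check-node update.

    llrs  : incoming variable-to-check messages (log-likelihood ratios)
    alpha : normalization factor (assumed value, not from the paper)
    Returns one outgoing message per edge, each computed from all
    incoming messages except the one on that edge.
    """
    out = []
    for i in range(len(llrs)):
        others = llrs[:i] + llrs[i + 1:]
        sign = 1.0
        for v in others:
            if v < 0:
                sign = -sign
        out.append(alpha * sign * min(abs(v) for v in others))
    return out


messages = nms_check_node([2.0, -4.0, 1.0])
print(messages)  # [-0.75, 0.75, -1.5]
```

Hardware implementations typically avoid the per-edge loop by tracking only the two smallest magnitudes and the overall sign, which gives the same result.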
  • Hirofumi IWATO, Keishi SAKANUSHI, Yoshinori TAKEUCHI, Masaharu IMAI
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 487-494
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    To measure the detrusor pressure for diagnosing lower urinary tract symptoms, we designed a small-area, low-power System on a Chip (SoC). The SoC must be small and low power because it is encapsulated in tiny air-tight capsules that are simultaneously inserted in the urinary bladder and rectum for several days. Since the SoC is also required to be programmable, we designed an Application Specific Instruction set Processor (ASIP) for pressure measurement and wireless communication, and implemented almost all required functions on the ASIP. The SoC was fabricated using a 0.18 µm CMOS mixed-signal process, and the chip size is 2.5 × 2.5 mm². Evaluation results show that the power consumption of the SoC is 93.5 µW and that it can operate the capsule for seven days with a tiny battery.
  • Shouyi YIN, Yang HU, Zhen ZHANG, Leibo LIU, Shaojun WEI
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 495-505
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    A hybrid wired/wireless on-chip network is a promising communication architecture for multi-/many-core SoCs. For application-specific SoC design, it is important to design a dedicated on-chip network architecture according to the characteristics of the application. In this paper, we propose a heuristic wireless link allocation algorithm for creating a hybrid on-chip network architecture. The algorithm can eliminate the performance bottleneck by replacing multi-hop wired paths with high-bandwidth single-hop long-range wireless links. The simulation results show that the hybrid on-chip network designed by our algorithm significantly improves performance in terms of both communication delay and energy consumption.
  • Naohiro HAMADA, Hiroshi SAITO
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 506-515
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    In this paper, we propose a synthesis method for asynchronous circuits with bundled-data implementation. The proposed method iteratively applies behavioral synthesis and floorplanning to obtain a near-optimum circuit in terms of latency under given design constraints. To improve latency, behavioral synthesis and floorplanning are carried out so that the delay of the control circuit and the addition of delay elements needed to satisfy timing constraints are both minimized. We evaluate the effectiveness of the proposed method in terms of latency, area, and the number of timing violations while synthesizing several benchmarks. Experimental results show that the proposed method synthesizes faster circuits than those synthesized without it. The proposed method is also effective in reducing the number of timing violations.
  • Jung-Lin YANG, Shin-Nung LU, Pei-Hsuan YU
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 516-522
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    Developing a rapid prototyping environment utilizing hardware description languages (HDLs) and conventional FPGAs can help overcome the difficulties caused by the complexity of asynchronous digital systems and recent advances in VLSI technology. We propose a design flow and an FPGA template for implementing generalized C-element (gC) style asynchronous controllers. Utilizing conventional FPGA synthesis tools, self-timed bundled-data function modules can be realized with some effort on timing validation. The proposed design flow with its FPGA-based realization approach is a very effective design methodology for rapid prototyping and functionality validation. This work could be useful for early-stage performance estimation, power reduction exploration, circuit design training, and many other applications regarding asynchronous circuits. In this paper, the proposed FPGA-based asynchronous circuit design flow, a hands-on design tutorial, a generalized C-element template, and a list of synthesized benchmark circuits are documented and discussed in detail.
  • Yohei NAKATA, Hiroshi KAWAGUCHI, Masahiko YOSHIMOTO
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 523-533
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    As process technology is scaled down, a typical system on a chip (SoC) becomes denser. In scaled process technology, process variation becomes greater and increasingly affects the SoC circuits. Moreover, the process variation strongly affects network-on-chips (NoCs) that have a synchronous network across the chip. Therefore, its network frequency is degraded. We propose a process-variation-adaptive NoC with a variation-adaptive variable-cycle router (VAVCR). The proposed VAVCR can configure its cycle latency adaptively on a processor core basis, corresponding to the process variation. It can increase the network frequency, which is limited by the process variation in a conventional router. Furthermore, we propose a variable-cycle pipeline adaptive routing (VCPAR) method with VAVCR; the proposed VCPAR can reduce packet latency and has tolerance to network congestion. The total execution time reduction of the proposed VAVCR with VCPAR is 15.7%, on average, for five task graphs.
  • Wei ZHONG, Takeshi YOSHIMURA, Bei YU, Song CHEN, Sheqin DONG, Satoshi ...
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 534-545
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    Network-on-Chips (NoCs) have been proposed as a solution for addressing the global communication challenges in System-on-Chip (SoC) architectures that are implemented in nanoscale technologies. For the use of NoCs to be feasible in today's industrial designs, a custom-tailored, power-efficient NoC topology that satisfies the application characteristics is required. In this work, we present a design methodology that automates the synthesis of such application-specific NoC topologies. We present a method that integrates partitioning into the floorplanning phase to explore optimal clustering of cores during floorplanning with minimized link and switch power consumption. Based on the size of applications, we also present an Integer Linear Programming method and a heuristic method to place switches and network interfaces on the floorplan. Then, a power- and timing-aware path allocation algorithm is carried out to determine the connectivity across different switches. We perform experiments on several SoC benchmarks and present a comparison with the latest work. For small applications, the NoC topologies synthesized by our method show large improvements in power consumption (27.54%), hop-count (4%) and running time (66%) on average. For large applications, the synthesized topologies show large improvements in power consumption (31.77%), hop-count (29%) and running time (94.18%) on average.
  • Benjamin DEVLIN, Makoto IKEDA, Kunihiro ASADA
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 546-554
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    A 65nm self synchronous field programmable gate array (SSFPGA) which uses autonomous gate-level power gating with minimal control circuitry overhead for energy minimum operation is presented. The use of self synchronous signaling allows the FPGA to operate at voltages down to 370mV without any parameter tuning. We show both 2.6x total energy reduction and 6.4x performance improvement at the same time for energy minimum operation compared to the non-power gated SSFPGA, and compared to the latest research 1.8x improvement in power-delay product (PDP) and 2x performance improvement. When compared to a synchronous FPGA in a similar process we are able to show up to 84.6x PDP improvement. We also show energy minimum operation for maximum throughput on the power gated SSFPGA is achieved at 0.6V, 27fJ/operation at 264MHz.
  • Akira KOTABE, Kiyoo ITOH, Riichiro TAKEMURA
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 555-563
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    It is shown that it is feasible to apply 0.5-V 6-T SRAM cells in a 25-nm high-speed 1-Gb e-SRAM. In particular, for coping with rapidly reduced voltage margin as VDD is reduced, a boosted word-voltage scheme is first proposed. Second, Vt variations are reduced with repair techniques and nanoscale FD-MOSFETs to further widen the voltage margin. Third, a worst case design is developed, for the first time, to evaluate the cell. This design features a dynamic margin analysis and takes subthreshold current, temperature, and Vt variations and their combination in the cell into account. Fourth, the proposed scheme is evaluated by applying the worst-case design and a 25-nm planar FD-SOI MOSFET. It is consequently found that the scheme provides a wide margin and high speed even at 0.5V. A 0.5-V high-speed 25-nm 1-Gb SRAM is thus feasible. Finally, to further improve the scheme, it is shown that it is necessary to use FinFETs and suppress and compensate process, voltage, and temperature variations in a chip and wafer.
  • Kousuke MIYAJI, Kentaro HONDA, Shuhei TANAKAMARU, Shinji MIYANO, Ken T ...
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 564-571
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    Three types of electron injection scheme, namely a both-side injection scheme and self-repair one-side injection schemes Type A (injected once) and Type B (injected twice), are proposed and analyzed comprehensively for 65 nm technology node 6T- and 8T-SRAM cells to find the optimum injection scheme and cell architecture. It is found that the read speed degrades by as much as 6.3 times in the 6T-SRAM with locally injected electrons. However, the read speed of the 8T-SRAM cell does not degrade because the read port is separated from the write pass-gate transistors. Furthermore, the self-repair one-side injection scheme is the most suitable for resolving the conflict between half-select disturb and write characteristics. The worst-case cell characteristics of the Type A and Type B self-repair one-side injection schemes were found to be the same. In the self-repair one-side injection 8T-SRAM, the disturb margin increases by 141% without write margin or read speed degradation. The proposed schemes have no process or area penalty compared with the standard CMOS process.
  • Shusuke YOSHIMOTO, Masaharu TERADA, Shunsuke OKUMURA, Toshikazu SUZUKI ...
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 572-578
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    This paper presents a novel disturb mitigation scheme which achieves low-energy operation for a deep sub-micron 8T SRAM macro. The classic write-back scheme with a dedicated read port overcame both half-select and read-disturb problems. Moreover, it improved the yield, particularly in the low-voltage range. The conventional scheme, however, consumed more power because of charging and discharging all write bitlines in a sub-block. Our proposed scheme reduces the power overhead of the write-back scheme using a floating write bitline technique and a low-swing bitline driver (LSBD). The floating bitline and the LSBD respectively consist of a precharge-less CMOS equalizer (transmission gate) and an nMOS write-back driver. The voltage on the floating write bitline is at an intermediate voltage between the ground and the supply voltage before a write cycle. The write target cells are written by normal CMOS drivers, whereas the write bitlines in half-selected columns are driven by the LSBDs in the write cycle, which suppresses the write bitline voltage to VDD - Vtn and therefore saves the active power in the half-selected columns (where Vtn is a threshold voltage of an nMOS). In addition, the proposed scheme reduces a leakage current from the write bitline because of the floating write bitline. The active leakage is reduced by 33% at the FF corner, 125°C. The active energy in the write operation is reduced by 37% at the FF corner. In other process corners, more writing power reduction can be expected because it depends on the Vtn in the LSBD. We fabricated a 512-Kb 8T SRAM test chip that operates at a single 0.5-V supply voltage. The test chip with the proposed scheme respectively achieves 1.52-µW/MHz writing energy and 72.8-µW leakage power, which are 59.4% and 26.0% better than those of the conventional write-back scheme. The total energy is 12.9 µW/MHz (12.9 pJ/access) at a supply voltage of 0.5V and operating frequency of 6.25MHz in a 50%-read/50%-write operation.
  • Shunsuke OKUMURA, Hidehiro FUJIWARA, Kosuke YAMAGUCHI, Shusuke YOSHIMO ...
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 579-585
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    We propose a novel substrate-bias control scheme for an FD-SOI SRAM that suppresses inter-die variability. The proposed circuits detect inter-die threshold-voltage variation automatically and then maximize the read/write margins of memory cells by supplying the substrate bias. We confirmed that a 486-kb 6T SRAM operates at 0.42 V, in which an FS corner can be compensated by as much as 0.14 V or more.
  • Takuya SAWADA, Taku TOSHIKAWA, Kumpei YOSHIKAWA, Hidehiro TAKATA, Koji ...
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 586-593
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    The susceptibility of a static random access memory (SRAM) core against static and dynamic variation of power supply voltage is evaluated, by using on-chip diagnosis structures of memory built-in self testing (MBIST) and on-chip voltage waveform monitoring (OCM). The SRAM core of interest in this paper is a synthesizable version applicable to general systems-on-a-chip (SoC) design, and fabricated in a 90nm CMOS technology. RF power injection to power supply networks is quantified by OCM. The number of resultant erroneous bits as well as their distribution in the cell array is given by MBIST. The frequency-dependent sensitivity reflects the highly capacitive nature of densely integrated SRAM cells.
  • Akira KOTABE, Riichiro TAKEMURA, Yoshimitsu YANAGAWA, Tomonori SEKIGUC ...
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 594-599
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    A small-sized leakage-controlled gated sense amplifier (SA) and relevant circuits are proposed for 0.5-V multi-gigabit DRAM arrays. The proposed SA consists of a high-VT PMOS amplifier and a low-VT NMOS amplifier which is composed of high-VT NMOSs and a low-VT cross-coupled NMOS, and achieves 46% area reduction compared to a conventional SA with a low-VT CMOS preamplifier. Separation of the proposed SA and a data-line pair achieves a sensing time of 6ns and a writing time of 0.6ns. Momentarily overdriving the PMOS amplifier achieves a restoring time of 13ns. The gate level control of the high-VT NMOSs and the gate level compensation circuit for PVT variations reduce the leakage current of the proposed SA to 2% of that without the control, and its effectiveness was confirmed using a 50-nm test chip.
  • Satoru AKIYAMA, Riichiro TAKEMURA, Tomonori SEKIGUCHI, Akira KOTABE, K ...
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 600-608
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    A gated sense amplifier (GSA) consisting of a low-Vt gated preamplifier (LGA) and a high-Vt sense amplifier (SA) is proposed. The gating scheme of the LGA enables quick amplification of an initial cell signal voltage (vS0) because of its low Vt and prevents the cell signal from degrading due to interference noise between data lines. As for a conventional sense amplifier (CSA), this new type of noise causes sensing error, and the noise-generation mechanism was clarified for the first time by analysis of vS0. The high-Vt SA holds the amplified signal and keeps subthreshold current low. Moreover, the gating scheme of the low-Vt MOSFETs in the LGA drives the I/O line quickly. The GSA thus simultaneously achieves fast sensing, low-leakage data holding, and fast I/O driving, even for sub-1-V mid-point sensing. The GSA is promising for future sub-1-V gigabit dynamic random-access memory (DRAM) because of reduced variations in the threshold voltage of MOSFETs; thus, the offset voltage of the LGA is reduced. The effectiveness of the GSA was verified with a 70-nm 512-Mbit DRAM chip. It demonstrated row access time (tRCD) of 16.4ns and read access (tAA) of 14.3ns at array voltage of 0.9V.
  • Kousuke MIYAJI, Ryoji YAJIMA, Teruyoshi HATANAKA, Mitsue TAKAHASHI, Sh ...
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 609-616
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    An initialize and weak-program erasing scheme is proposed to achieve a high-performance and high-reliability Ferroelectric (Fe-) NAND flash solid-state drive (SSD). Bit-by-bit erase VTH control is achieved by the proposed erasing scheme, and history effects in Fe-NAND are also suppressed. History effects are changes in the future erase VTH shift characteristics caused by the past program voltage. The proposed erasing scheme decreases the VTH shift variation due to history effects from ±40% to ±2%, and the erase VTH distribution width is reduced from over 0.4 V to 0.045 V. As a result, the read and VPASS disturbance decrease by 42% and 37%, respectively. The proposed erasing scheme is immune to VTH variations and voltage stress. It also suppresses the power and bandwidth degradation of the SSD.
  • Hyoungjun NA, Tetsuo ENDOH
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 617-626
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    In this paper, a theoretical analysis of current-controlled (CC-) MOS current mode logic (MCML) is reported. Furthermore, the circuit performance of the CC-MCML with auto-detection of threshold voltage (Vth) fluctuation is evaluated. By detecting the Vth fluctuations of the transistors, the proposed CC-MCML automatically suppresses the degradation of circuit performance that these fluctuations induce. When a Vth fluctuation of ±0.1 V occurs in the circuit, the cutoff frequency of the circuit is increased from 0 Hz to 3.5 GHz by using the proposed CC-MCML with auto-detection of Vth fluctuation.
  • Tetsuya IIZUKA, Kunihiro ASADA
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 627-634
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    This paper proposes an all-digital process variability monitor based on a shared structure of a buffer ring and a ring oscillator. The proposed circuit monitors the PMOS and NMOS process variabilities independently according to the count of a single pulse that propagates on the ring during the buffer ring mode and the oscillation period during the ring oscillator mode. Using this shared-ring structure, we reduce the occupied area by about 40% compared with the conventional circuit, without loss of process variability monitoring properties. The proposed shared-ring circuit has been fabricated in a 65 nm CMOS process, and the measurement results with two different wafer lots show the feasibility of the proposed process variability monitoring scheme.
  • Hiroki YABE, Makoto IKEDA
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 635-642
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    We present a 3-D range map acquisition system using a Gray-encoded time-multiplexing structured pattern. In this method, the only information needed to reconstruct a 3-D range map is whether each pixel is bright or not for the projected structured patterns. A dedicated image sensor to capture the pattern consists of a pixel-parallel 1-bit A/D converter, an in-pixel pattern address memory, and a column-parallel digital pattern address readout circuit. The in-pixel memory and digital bit-parallel pattern address readout eliminate unnecessary readout of pattern data to enhance 3-D acquisition speed. We fabricated the image sensor in 0.18 µm CMOS and demonstrated 3-D range map acquisition at up to 122 range maps per second for 7 patterns, with an average error of 3.2 mm under the condition of 10% pattern recognition error.
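The Gray-coded pattern addressing mentioned in this abstract can be sketched as follows: each projector stripe address is Gray-encoded, one bit is shown per projected pattern, and a pixel recovers its address from the bright/dark sequence it observes. With 7 patterns this distinguishes 128 addresses. The function names and the notion of a stripe "column" are illustrative assumptions, not the sensor's actual interface.

```python
def to_gray(n):
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def from_gray(g):
    """Invert the Gray code by folding in successive right shifts."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def pattern_bits(column, n_patterns=7):
    """Bright (1) / dark (0) value seen at a stripe column for each pattern, MSB first."""
    g = to_gray(column)
    return [(g >> (n_patterns - 1 - k)) & 1 for k in range(n_patterns)]


def column_from_bits(bits):
    """Recover the stripe column from the observed bright/dark sequence."""
    g = 0
    for b in bits:
        g = (g << 1) | b
    return from_gray(g)


bits = pattern_bits(5)
print(bits, column_from_bits(bits))  # [0, 0, 0, 0, 1, 1, 1] 5
```

Adjacent addresses differ in exactly one bit, so a misread at a stripe boundary shifts the decoded address by at most one position, which is the usual reason for preferring Gray coding over plain binary in structured light.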
    Download PDF (2451K)
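    The Gray-encoded decoding step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' sensor logic: it assumes the standard reflected-binary Gray code, an MSB-first pattern order, and that the decoded value is a stripe index. The function names and the `encode_stripe` helper are hypothetical.

```python
from typing import List, Sequence


def binary_to_gray(value: int) -> int:
    """Standard reflected-binary Gray code of a non-negative integer."""
    return value ^ (value >> 1)


def gray_to_binary(gray: int) -> int:
    """Invert the Gray code by XOR-ing together all right shifts of the input."""
    binary = 0
    while gray:
        binary ^= gray
        gray >>= 1
    return binary


def decode_pixel(bright: Sequence[bool]) -> int:
    """Map per-pattern bright/dark observations to a stripe index.

    Assumption: observations are ordered MSB first, one per projected pattern.
    """
    gray = 0
    for is_bright in bright:
        gray = (gray << 1) | int(bool(is_bright))
    return gray_to_binary(gray)


def encode_stripe(index: int, num_patterns: int = 7) -> List[bool]:
    """Hypothetical helper: the bright/dark sequence a pixel in stripe `index` sees."""
    gray = binary_to_gray(index)
    return [bool((gray >> bit) & 1) for bit in range(num_patterns - 1, -1, -1)]
```

    With 7 patterns this distinguishes 2^7 = 128 stripes, and neighbouring stripes differ in exactly one pattern, which is the usual reason for preferring Gray code over plain binary in structured-light systems.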
  • Jinmyoung KIM, Toru NAKURA, Hidehiro TAKATA, Koichiro ISHIBASHI, Makot ...
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 643-650
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    Switched parasitic capacitors of sleep blocks with a tri-mode power gating structure are implemented to reduce on-chip resonant supply noise in a 1.2V, 65nm standard CMOS process. The tri-mode power gating structure makes it possible to store charge in the parasitic capacitance of the power-gated blocks. The proposed method achieves 53.1% and 57.9% noise reduction for wake-up noise and 130MHz periodic supply noise, respectively. It also realizes noise cancelling without a discharging time before using the parasitic capacitors of sleep blocks, and shows an 8.4x boost of the effective capacitance value with 2.1% chip area overhead. The proposed method can therefore save chip area while reducing resonant supply noise more effectively.
    Download PDF (2365K)
  • Kazuo ONO, Yoshimitsu YANAGAWA, Akira KOTABE, Riichiro TAKEMURA, Tatsu ...
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 651-660
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    A charge-integration read scheme has been developed for a solid-nanopore DNA sequencer that determines a genome by direct electrical measurement of the transverse tunneling current in single-stranded DNA. The magnitude of the current was simulated with a first-principles molecular dynamics method. It was found that the magnitude is as small as the sub-picoampere range, and that the signals from the four bases have wide distributions that overlap between bases. The distribution is believed to originate from translational and rotational motion of DNA in a nanopore with a frequency of over 10⁵Hz. A sequencing scheme is presented to distinguish the distributed signals. The scheme makes the widely distributed signals converge through time integration by accumulating charge on the capacitance of the nanopore device and read circuits. We estimated that an integration time of 1.4ms is sufficient to obtain a signal difference of over 10mV for distinguishing between each DNA base. Moreover, the time is shortened if paired bases, such as A-T and C-G in double-stranded DNA, can be measured simultaneously with two nanopores. Circuit simulations, which included the capacitance of a nanopore calculated with a device simulator, successfully distinguished between DNA bases in less than 2.0ms. The speed is roughly six orders of magnitude faster than that of a conventional DNA sequencer. It is possible to determine the human genome in one day if 100 nanopores are operated in parallel.
    Download PDF (3559K)
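    The one-day estimate in the abstract above can be checked with a short back-of-the-envelope calculation. The genome length of about 3 × 10⁹ bases is an assumption not stated in the abstract, and the function name is hypothetical. The sketch also ignores any overhead other than the per-base read time.

```python
def sequencing_time_hours(num_bases: float, read_time_s: float, num_pores: int) -> float:
    """Total read time in hours when bases are split evenly across parallel nanopores."""
    if num_pores <= 0:
        raise ValueError("num_pores must be positive")
    return num_bases * read_time_s / num_pores / 3600.0


# Assumed genome length (not from the abstract): roughly 3e9 bases.
HUMAN_GENOME_BASES = 3.0e9

# 2.0 ms per base is the upper bound quoted in the abstract; 100 pores in parallel.
hours = sequencing_time_hours(HUMAN_GENOME_BASES, 2.0e-3, 100)
```

    Under these assumptions the total comes to roughly 17 hours, which is consistent with the "one day" figure.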
  • Tetsuya IIZUKA, Satoshi MIURA, Ryota YAMAMOTO, Yutaka CHIBA, Shunichi ...
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 661-667
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    This paper proposes a sub-ps resolution TDC utilizing a differential pulse-shrinking buffer ring. This scheme uses two differentially-operated pulse-shrinking inverters and the TDC resolution is finely controlled by the transistor size ratio between them. The proposed TDC realizes 9bit, 580fs resolution in a 0.18µm CMOS technology with 0.04mm² area, and achieves DNL and INL of +0.8/-0.8LSB and +4.3/-4.0LSB, respectively, without linearity calibration. The power dissipation at 1.5MS/s ranges from 10.8 to 12.6mW depending on the input time intervals.
    Download PDF (4055K)
  • Andrzej RADECKI, Hayun CHUNG, Yoichi YOSHIDA, Noriyuki MIURA, Tsunaaki ...
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 668-676
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    Wafer-level testing is a well-established solution for detecting manufacturing errors and removing non-functional devices early in the fabrication process. Recently this technique has been facing a number of challenges, resulting from increased complexity of devices under test, larger number and higher density of pads or bumps, application of mechanically fragile materials, such as low-k dielectrics, and ever-developing packaging technologies. Most of these difficulties originate from the use of mechanical probes, as they limit testing speed, impose performance limitations and add reliability issues. Earlier work focused on relaxing these constraints by removing mechanical probes for data transmission and DC signal measurement and replacing them with non-contact interfaces. In this paper we extend this concept by adding the capability of transferring power wirelessly, enabling non-contact wafer-level testing. In addition to further improvements in performance and reliability, this solution enables new testing scenarios such as probing wafers from their backside. The proposed system achieves 6W/25mm² power transfer density over a distance of up to 0.32mm, making it suitable for non-contact wafer-level testing of medium-performance CMOS integrated circuits.
    Download PDF (4430K)
  • Toru SAI, Yasuhiro SUGIMOTO
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 677-685
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    By using a quadratic compensation slope, a CMOS current-mode buck DC-DC converter with constant frequency characteristics over wide input and output voltage ranges has been developed. The use of a quadratic slope instead of a conventional linear slope makes both the damping factor in the transfer function and the frequency bandwidth of the current feedback loop independent of the converter's output voltage settings. When the coefficient of the quadratic slope is chosen to be dependent on the input voltage settings, the damping factor in the transfer function and the frequency bandwidth of the current feedback loop both become independent of the input voltage settings. Thus, both the input and output voltage dependences in the current feedback loop are eliminated, the frequency characteristics become constant, and the frequency bandwidth is maximized. To verify the effectiveness of a quadratic compensation slope with a coefficient that is dependent on the input voltage in a buck DC-DC converter, we fabricated a test chip using a 0.18µm high-voltage CMOS process. The evaluation results show that the frequency characteristics of both the total feedback loop and the current feedback loop are constant even when the input and output voltages are changed from 2.5V to 7V and from 0.5V to 5.6V, respectively, using a 3MHz clock.
    Download PDF (2001K)
  • Shin-ichi O'UCHI, Kazuhiko ENDO, Takashi MATSUKAWA, Yongxun LIU, Tadas ...
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 686-695
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    This paper demonstrates a FinFET operational amplifier (opamp), which is suitable for integration with digital circuits in a scaled low-standby-power (LSTP) technology and operates at extremely low voltage. The opamp consists of an adaptive threshold-voltage (Vt) differential pair and a low-voltage source follower using independent-double-gate- (IDG-) FinFETs. These two components enable the opamp to extend the common-mode voltage range (CMR) below the nominal Vt even if the supply voltage is less than 1.0V. The opamp was implemented in our FinFET technology co-integrating common-DG- (CDG-) and IDG-FinFETs. More than 40-dB DC gain and 1-MHz gain-bandwidth product in the 500-mV-wide input CMR at the supply voltage of 0.7V were estimated with SPICE simulation. The fabricated chip successfully demonstrated the 0.7-V operation with the 480-mV-wide CMR, even though the nominal Vt was 400mV.
    Download PDF (4017K)
  • Bo LIU, Bo YANG, Shigetoshi NAKATAKE
    Article type: PAPER
    2012 Volume E95.C Issue 4 Pages 696-705
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    Current sources are essential components in analog circuit design, and their mismatch significantly degrades circuit performance. This paper addresses the mismatch model of CMOS current sources and, unlike conventional modeling, focuses on the layout- and λ-dependency of the process variation, where λ is the output conductance parameter. To clarify which variation parameter influences the mismatch, we implemented a test chip in a 90nm process technology, from which we can collect characteristic variation data for MOSFETs of various layouts. The test chip also includes D/A converters to check the differential non-linearity (DNL) caused by the mismatch of current sources when they operate as a DAC. By identifying the variation and the circuit-level errors in the measured DNLs, we reveal that our model accounts for the current variation more accurately than the conventional mismatch model.
    Download PDF (1940K)
  • Amir FATHI, Sarkis AZIZIAN, Khayrollah HADIDI, Abdollah KHOEI
    Article type: BRIEF PAPER
    2012 Volume E95.C Issue 4 Pages 706-709
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    This paper presents the design of a novel high-speed Booth encoder-decoder in a 0.35µm CMOS technology. By focusing on the transistor-level implementation of the new architecture and employing a newly designed truth table, the gate-level delay of the whole system is reduced to one logic gate plus one transistor delay, which is the main advantage of the proposed circuit. Simulation results indicate the high-speed performance of the designed circuit and show the low power dissipation of the implemented architecture, which makes this work suitable for extensive use in high-speed arithmetic blocks.
    Download PDF (272K)
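    For readers unfamiliar with Booth recoding, the following is a minimal behavioural sketch of the widely used radix-4 (modified Booth) scheme. The abstract above does not state which radix or truth table the authors use, so this illustrates the general technique only, not the proposed circuit. The function name is hypothetical.

```python
from typing import List


def booth_radix4_digits(value: int, width: int) -> List[int]:
    """Recode a two's-complement integer into radix-4 Booth digits in {-2,...,2}.

    Digits are returned least significant first, so that
    sum(d * 4**i for i, d in enumerate(digits)) == value.
    """
    if width <= 0 or width % 2:
        raise ValueError("width must be a positive even number of bits")
    if not -(1 << (width - 1)) <= value < (1 << (width - 1)):
        raise ValueError("value does not fit in the given width")

    unsigned = value & ((1 << width) - 1)  # two's-complement bit pattern
    digits = []
    previous_bit = 0  # implicit bit to the right of the LSB
    for i in range(0, width, 2):
        low = (unsigned >> i) & 1
        high = (unsigned >> (i + 1)) & 1
        # Each overlapping 3-bit group (high, low, previous) selects one digit.
        digits.append(-2 * high + low + previous_bit)
        previous_bit = high
    return digits
```

    Each digit selects a partial product of 0, ±1× or ±2× the multiplicand, which halves the number of partial products compared with a bit-by-bit multiplier.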
  • Amir FATHI, Sarkis AZIZIAN, Khayrollah HADIDI, Abdollah KHOEI
    Article type: BRIEF PAPER
    2012 Volume E95.C Issue 4 Pages 710-712
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    A novel high-speed 4-2 compressor using static and pass-transistor logic has been designed in a 0.35µm CMOS technology. In order to reduce gate-level delay and increase the speed, some changes are made to the truth table of the conventional 4-2 compressor, which led to the simplification of the logic functions for all parameters. Therefore, power dissipation is decreased. In addition, because of the similar paths from all inputs to the outputs, the delays are the same, so there is no need for extra buffers in low-latency paths to equalize the delays.
    Download PDF (489K)
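    As background for the abstract above, the arithmetic contract of a conventional 4-2 compressor can be sketched behaviourally. This follows the textbook construction from two cascaded full adders and is not the authors' modified truth table. The function names are hypothetical.

```python
from typing import Tuple


def full_adder(a: int, b: int, c: int) -> Tuple[int, int]:
    """Return (sum, carry) for three input bits."""
    total = a + b + c
    return total & 1, total >> 1


def compressor_4_2(x1: int, x2: int, x3: int, x4: int, cin: int) -> Tuple[int, int, int]:
    """Conventional 4-2 compressor built from two cascaded full adders.

    Returns (sum, carry, cout) such that
    x1 + x2 + x3 + x4 + cin == sum + 2 * (carry + cout).
    """
    intermediate, cout = full_adder(x1, x2, x3)   # cout does not depend on cin
    total, carry = full_adder(intermediate, x4, cin)
    return total, carry, cout
```

    Because `cout` is computed without `cin`, compressors placed side by side do not form a ripple carry chain, which is why the cell is popular in multiplier reduction trees.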
  • Shoichi OSHIMA, Mamoru UGAJIN, Mitsuru HARADA
    Article type: BRIEF PAPER
    2012 Volume E95.C Issue 4 Pages 713-716
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    A new low-power feedback structure for a power amplifier (PA) reduces signal distortion while keeping the power efficiency of the PA high. The feedback structure injects the envelope of the third-order harmonics into the input signal. In adopting this method for a class-A amplifier, we obtain over 10% higher efficiency while maintaining the same adjacent channel power ratio (ACPR). The power consumption of the additional circuit is 200µW.
    Download PDF (686K)
Regular Section
  • Michinari SHIMODA, Toyonori MATSUDA, Kazunori MATSUO, Yoshitada IYAMA
    Article type: PAPER
    Subject area: Electromagnetic Theory
    2012 Volume E95.C Issue 4 Pages 717-724
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    The cause-and-effect relation between plasmon-resonance absorption and the surface wave in a sinusoidal metal grating is investigated. By introducing an equivalent impedance model, analogous to an equivalent circuit in circuit theory and formulated as an impedance boundary value problem on a fictitious surface over the grating, we estimate the surface wave from the eigen field of the model by using the resonance property of the scattered field. Through numerical examples, we illustrate that absorption in the grating occurs under the condition that the surface wave is excited along the model, and that under this condition the real part of the surface impedance is negative on about half of the fictitious surface.
    Download PDF (815K)
  • Jun SHIBAYAMA, Keisuke WATANABE, Ryoji ANDO, Junji YAMAUCHI, Hisamatsu ...
    Article type: PAPER
    Subject area: Electromagnetic Theory
    2012 Volume E95.C Issue 4 Pages 725-732
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    A Drude-critical points (D-CP) model for considering metal dispersion is newly incorporated into the frequency-dependent FDTD method using the simple trapezoidal recursive convolution (TRC) technique. Numerical accuracy is investigated through the analysis of pulse propagation in a metal (aluminum) cladding waveguide. The TRC technique with a single convolution integral is found to provide higher accuracy, when compared with the recursive convolution counterpart. The methodology is also extended to the unconditionally stable FDTD based on the locally one-dimensional scheme for efficient frequency-dependent calculations.
    Download PDF (1664K)
  • Alexander EDWARD, Pak Kwong CHAN
    Article type: PAPER
    Subject area: Electronic Circuits
    2012 Volume E95.C Issue 4 Pages 733-743
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    This paper presents the analysis and design of a new ultra-low voltage analog front end (AFE) dedicated to strain sensor applications. The AFE, designed in a 0.18µm CMOS process, features a chopper-stabilized instrumentation amplifier (IA), a balanced active MOSFET-C 2nd order low pass filter (LPF), a clock generator and a voltage booster which operate at a supply voltage (Vdd) of 0.6V. The designed IA achieves 30dB of closed-loop gain, 101dB of common-mode rejection ratio (CMRR) at 50Hz, 80dB of power-supply rejection ratio (PSRR) at 50Hz, thermal noise floor of 53.4 nV/√Hz, current consumption of 14µA, and noise efficiency factor (NEF) of 9.7. The high CMRR and rail-to-rail output swing capability are attributed to a new low voltage realization of the active-bootstrapped technique using a pseudo-differential gain-boosting operational transconductance amplifier (OTA) and the proposed current-driven bulk (CDB) biasing technique. An output capacitor-less low-dropout regulator (LDO), with a new fast start-up LPF technique, is used to regulate this 0.6V supply from a 0.8-1.0V energy harvesting power source. It achieves power supply rejection (PSR) of 42dB at a frequency of 1MHz. A cascode compensated pseudo differential amplifier is used as the filter's building block for low power design. The filter's single-ended-to-balanced converter is implemented using a new low voltage amplifier with two-stage common-mode cancellation. The overall AFE was simulated to have 65.6dB of signal-to-noise ratio (SNR), total harmonic distortion (THD) of less than 0.9% for a 100Hz sinusoidal maximum input signal, bandwidth of 2kHz, and power consumption of 51.2µW. Spectre RF simulations were performed to validate the design using BSIM3V3 transistor models provided for the GLOBALFOUNDRIES 0.18µm CMOS process.
    Download PDF (1179K)
  • Mohammad Reza RESHADINEZHAD, Mohammad Hossein MOAIYERI, Kaivan NAVI
    Article type: PAPER
    Subject area: Electronic Circuits
    2012 Volume E95.C Issue 4 Pages 744-751
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    The reduction in the gate length of current devices to 65nm causes their I-V characteristics to depart from those of traditional MOSFETs. As a result, the manufacturing of new efficient nanoscale devices is inevitable. The fundamental properties of metallic and semiconducting carbon nanotubes (CNTs) make them alternatives to conventional silicon-based devices. In this paper, an ultra high-speed and energy-efficient full adder is proposed, using nanoscale Carbon Nanotube Field Effect Transistors (CNFETs). Extensive simulation results using HSPICE are reported to show that the proposed adder consumes less power and is faster than previous adders.
    Download PDF (1280K)
  • Hideo SAKAI, Shinichi O'UCHI, Takashi MATSUKAWA, Kazuhiko ENDO, Yongxu ...
    Article type: PAPER
    Subject area: Semiconductor Materials and Devices
    2012 Volume E95.C Issue 4 Pages 752-760
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    This paper presents a precise characterization of the high-frequency characteristics of the intrinsic channel of a FinFET. For de-embedding the parasitics attached to the source, drain and gate terminals, it proposes special calibration patterns which can place the reference surface just beside the intrinsic part of the FinFET. It compares the measured S-parameter data up to 40GHz with device simulation and shows good matching. The experimental data of the through pattern also confirm the accuracy of the de-embedded parasitics and the extracted intrinsic part of the FinFET.
    Download PDF (2249K)
  • Yu SUGITA, Yoshifumi TAKASAKI, Keiji KURODA, Yuzo YOSHIKUNI
    Article type: BRIEF PAPER
    Subject area: Optoelectronics
    2012 Volume E95.C Issue 4 Pages 761-764
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    A Fourier domain optical coherence tomography system for obtaining a two-dimensional image is constructed. Imaging characteristics of the OCT system in a transverse direction are experimentally investigated. Angle dependence of reflection intensity from a smooth surface is clearly observed and analyzed with consideration of spatial mode coupling to a fiber.
    Download PDF (272K)
  • Zhisheng LI, Johan BAUWELINCK, Guy TORFS, Xin YIN, Jan VANDEWEGE
    Article type: BRIEF PAPER
    Subject area: Electronic Circuits
    2012 Volume E95.C Issue 4 Pages 765-767
    Published: April 01, 2012
    Released on J-STAGE: April 01, 2012
    JOURNAL RESTRICTED ACCESS
    This paper presents a new common-mode stabilization method for a CMOS differential cascode Class-E power amplifier with LC-tank based driver stage. The stabilization method is based on the identification of the poles and zeros of the closed-loop transfer function at a critical node. By adding a series resistor at the common-gate node of the cascode transistor, the right-half-plane poles are moved to the left half plane, improving the common-mode stability. The simulation results show that the new method is an effective way to stabilize the PA.
    Download PDF (219K)