In this paper, a hamburger architecture with a 3D stacked reconfigurable memory is proposed for a 4K motion estimation (ME) processor. By positioning the memory dies on both the top and bottom sides of the processor die, the proposed hamburger architecture can reduce the usage of signal through-silicon vias (TSVs), and balance the power delivery network and the clock tree of the entire system. This results in a 1/3 reduction in the usage of signal TSVs. Moreover, a stacked reconfigurable memory architecture is proposed to reduce the fabrication complexity and further reduce the number of signal TSVs by more than 1/2. The reduction of signal TSVs in the entire design is 71.24%. Finally, we address unique issues that occur in electronic design automation (EDA) tools during 3D large-scale integration (LSI) designs. As a result, a 4K ME processor is implemented as a 7-die stacked 3D system-on-chip design. The proposed design can support real-time 3840 × 2160 @ 120 fps encoding at 130 MHz with less than 540 mW.
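The figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below uses only the stated numbers; it assumes that the first TSV reduction is exactly 1/3 and that the two reductions compose multiplicatively, neither of which the abstract states explicitly. All variable names are illustrative.

```python
# Figures quoted in the abstract.
WIDTH, HEIGHT, FPS = 3840, 2160, 120
CLOCK_HZ = 130e6
POWER_W = 0.540           # stated as an upper bound ("less than 540 mW")
TOTAL_TSV_REDUCTION = 0.7124

# Pixel throughput required for real-time encoding.
pixels_per_second = WIDTH * HEIGHT * FPS
pixels_per_cycle = pixels_per_second / CLOCK_HZ

# Upper bound on energy per pixel, in nanojoules.
energy_per_pixel_nj = POWER_W / pixels_per_second * 1e9

# ASSUMPTION: the hamburger layout removes exactly 1/3 of the signal TSVs,
# and the reconfigurable memory acts multiplicatively on what remains.
remaining_after_hamburger = 1 - 1 / 3
remaining_total = 1 - TOTAL_TSV_REDUCTION
implied_memory_reduction = 1 - remaining_total / remaining_after_hamburger

print(f"pixels per clock cycle:   {pixels_per_cycle:.2f}")
print(f"energy per pixel (bound): {energy_per_pixel_nj:.3f} nJ")
print(f"implied second reduction: {implied_memory_reduction:.1%}")
```

Under these assumptions the implied second reduction comes out above one half, which is consistent with the "more than 1/2" wording.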
An 8-issue superscalar core generally requires a 24-port RAM for the register file. The area and energy consumption of a multiported RAM increase in proportion to the square of the number of ports. A register cache can reduce the area and energy consumption of the register file. However, earlier register cache systems suffer from lower IPC caused by register cache misses. Thus, we proposed the Non-Latency-Oriented Register Cache System (NORCS) to solve the IPC problem with a modified pipeline. We evaluated NORCS mainly from the viewpoint of microarchitecture in the original article, and showed that NORCS maintains almost the same IPC as conventional register files. Researchers at NVIDIA adopted the same idea for their GPUs. However, the evaluation was not sufficient from the viewpoint of LSI design. In the original article, we used CACTI to evaluate the area and energy consumption. CACTI is a design space exploration tool for cache design, and adopts some rough approximations. Therefore, this paper presents a design of NORCS with FreePDK45, an open source process design kit for 45 nm technology. We performed manual layout of the memory cells and arrays of NORCS, and executed SPICE simulation with RC parasitics extracted from the layout. The results show that, compared with a full-port register file, an 8-entry NORCS achieves a 75.2% and 48.2% reduction in area and energy consumption, respectively. The results also include the latency, which we did not present in our original article. The critical-path latencies are 307 ps and 318 ps for an 8-entry NORCS and a conventional multiported register file, respectively, when the same two cycles are allocated to register file read.
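The port-count argument can be illustrated with a short sketch. It assumes two source reads and one destination write per instruction, which reproduces the 24 ports quoted for an 8-issue core, and applies the square-law scaling stated in the abstract. The function names and the reduced port counts are hypothetical, and the sketch does not attempt to reproduce the measured 75.2% figure.

```python
def register_file_ports(issue_width, reads_per_inst=2, writes_per_inst=1):
    """Ports needed if every issued instruction reads and writes in one cycle.

    ASSUMPTION: two source operands and one destination per instruction.
    """
    return issue_width * (reads_per_inst + writes_per_inst)


def relative_area(ports, baseline_ports):
    """Area (or energy) relative to a baseline, using the square-law model."""
    return (ports / baseline_ports) ** 2


full_ports = register_file_ports(8)

# Hypothetical reduced port counts, for illustration only.
for ports in (full_ports, 12, 6):
    print(f"{ports:2d} ports -> {relative_area(ports, full_ports):.4f} of full-port area")
```

Under the square-law model, halving the port count cuts the array area to a quarter, which is why moving most ports onto a small register cache is attractive.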
Timing fault detection techniques address the problems caused by increased variations on a chip, especially with dynamic voltage and frequency scaling (DVFS). The Razor flip-flop (FF) is a timing fault detection technique that employs double sampling by the main and shadow FFs. In order for the Razor FF to correctly detect a timing fault, not the main FF but the shadow FF must sample the correct value. The application of Razor FFs to static logic relaxes the timing constraints; however, the naive application of Razor FFs to dynamic precharged logic such as SRAM read circuits is not effective. This is because the SRAM precharge cannot start before the shadow FF samples the value; otherwise, the transition of the bitline of the SRAM stops and the value sampled by the shadow FF will be incorrect. Therefore, the detect period cannot overlap the precharge period. This paper proposes a novel application of Razor FFs to SRAM read circuits. Our proposal employs a conditional precharge according to the value of a bitline sampled by the main FF. This enables the detect period to overlap the precharge period, thereby relaxing the timing constraints. The additional circuit required by this method is simple and only needed around the sense amplifier, and there is no need for a clock delayed from the system clock. Consequently, the area overhead of the proposed circuit is negligible. This paper presents SPICE simulations of the proposed circuit. Our proposal reduces the minimum cycle time by 51.5% at a supply voltage of 1.1 V and the minimum voltage by 31.8% at a cycle time of 412.5 ps.
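The double-sampling idea behind the Razor FF can be sketched behaviorally. The model below is a simplification, not the proposed circuit: a signal switches from an old to a new value at some arrival time, the main FF samples at the clock edge, and the shadow FF samples one detect window later. The function names and all timing values (in ps) are hypothetical.

```python
def sample(old, new, arrival, t):
    """Value seen by a flip-flop sampling at time t."""
    return new if arrival <= t else old


def razor_sample(old, new, arrival, cycle, detect_window):
    """Return (main, shadow, fault_detected) for one Razor FF.

    The main FF samples at the end of the cycle; the shadow FF samples
    detect_window later. A mismatch flags a timing fault.
    """
    main = sample(old, new, arrival, cycle)
    shadow = sample(old, new, arrival, cycle + detect_window)
    return main, shadow, main != shadow


# Hypothetical timings in picoseconds.
print(razor_sample(0, 1, arrival=350, cycle=400, detect_window=100))  # on time
print(razor_sample(0, 1, arrival=450, cycle=400, detect_window=100))  # late, detected
print(razor_sample(0, 1, arrival=550, cycle=400, detect_window=100))  # too late, missed
```

The third case shows why the shadow FF must sample the correct value: if the signal settles after the detect window, both FFs agree on the stale value and the fault goes unnoticed.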
An all-digital fully-synthesizable PVT-tolerant clock data recovery (CDR) architecture for wireline chip-to-chip interconnects is presented. The proposed architecture enables the co-synthesis of the CDR with the digital core. By eliminating the resource-hungry manual layout and interfacing steps, which are necessary for conventional CDR topologies, the design process and the time-to-market can be drastically improved. In addition, the proposed CDR architecture allows the majority of the sub-systems to be reused, which enables easy migration to different process nodes. The proposed CDR is also equipped with a self-calibration scheme for ensuring tolerance over PVT. The proposed fully-synthesizable CDR was implemented in 28 nm FDSOI. The system achieves a maximum data rate of 10.06 Gbps while consuming 16.1 mW from a 1 V power supply.
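A common figure of merit for such links is energy per bit, which follows directly from the two numbers quoted above. The abstract does not report this value itself; the sketch simply divides the stated power by the stated maximum data rate.

```python
# Figures quoted in the abstract.
DATA_RATE_BPS = 10.06e9   # maximum data rate, bits per second
POWER_W = 16.1e-3         # power consumption at that rate

# Energy per bit in picojoules (derived, not reported in the abstract).
energy_per_bit_pj = POWER_W / DATA_RATE_BPS * 1e12

print(f"energy per bit: {energy_per_bit_pj:.2f} pJ/bit")
```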
We report on the fabrication of a magnetic metallic contaminant detector using multi-channel high-Tc RF-SQUIDs (superconducting quantum interference devices) for large packaged food. For food safety, finding small metallic contaminants is an important issue for a food manufacturer. Hence, a detection method for small contaminants is required. Some detection systems for food inspection using high-Tc SQUIDs have been reported to date. The system described here differs from previous systems in the package size it can inspect, which is larger at 150 mm in height × 300 mm in width. For inspection of large food packages, improvement of the signal-to-noise ratio (SNR) is an important issue because the signal intensity is inversely proportional to the cube of the distance between the SQUID sensor and the object. Therefore, a digital filter was introduced and its parameters were optimized. As a result, a steel ball as small as 0.5 mm in diameter at a stand-off distance of 167 mm was successfully detected with an SNR of more than 3.3.
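The inverse-cube dependence explains why large packages are hard to inspect. The sketch below compares the signal at the 167 mm stand-off distance quoted above with the signal at shorter distances. The reference distances are hypothetical and chosen only for illustration; the function name is likewise an assumption.

```python
def relative_signal(distance_mm, reference_mm):
    """Signal at distance_mm relative to the signal at reference_mm,
    assuming intensity falls off as 1 / distance**3."""
    return (reference_mm / distance_mm) ** 3


STAND_OFF_MM = 167  # stand-off distance quoted in the abstract

# Hypothetical shorter stand-off distances for comparison.
for reference_mm in (50, 100):
    ratio = relative_signal(STAND_OFF_MM, reference_mm)
    print(f"signal at {STAND_OFF_MM} mm is {ratio:.1%} of the signal at {reference_mm} mm")
```

Doubling the distance reduces the signal by a factor of eight, so noise reduction through filtering becomes the main lever once the sensor cannot be moved closer.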
In this review, we present recent advances relating to superconducting nanowire single-photon detectors (SSPDs or SNSPDs) and their broad range of applications. During a period exceeding ten years, the system performance of SSPDs has been drastically improved, and lately excellent detection efficiencies have been realized in practical systems for a wide range of target photon wavelengths. Owing to their advantages such as high system detection efficiency, low dark count rate, and excellent timing jitter, SSPDs have found application in various research fields such as quantum information, quantum optics, optical communication, and also in the life sciences. We summarize the photon detection principle and the current performance status of practical SSPD systems. In addition, we introduce application examples in which SSPDs have been applied.
A superconducting transition edge sensor (TES) coupled with a heavy metal absorber is a promising microcalorimeter for gamma-ray (γ-ray) spectroscopy with ultra-high energy resolution and high detection efficiency. It is very useful for the non-destructive inspection of nuclear materials. The high resolving power for γ-ray peaks makes it possible to precisely identify multiple nuclides such as Plutonium (Pu) and Actinides with high efficiency and safety. For this purpose, we have developed a TES coupled with a tin absorber. We propose a new device structure using a gold bump post which connects the tin absorber to the thermometer of the superconducting Ir/Au bilayer. The high thermal conductivity of the gold bump post realized strong thermal coupling between the thermometer and the γ-ray absorber, and it brought the benefit of large pulse height and fast decay time. Our TES achieved a good energy resolution of 84 eV FWHM at 59.5 keV. Using this TES device, we also succeeded in demonstrating nuclear material measurements. In the measurement of a Pu sample, we detected sharp γ-ray peaks from 239Pu and 240Pu, and in the measurement of a fission product (FP) sample, we observed fluorescence X-ray peaks emitted by the elements contained in the FP. The TES could resolve the fine structures of each fluorescence X-ray line, such as Kα1 and Kα2. In addition, we developed a TES coupled with a tantalum absorber, which is expected to have higher absorption efficiency for γ-rays. This device achieved a best energy resolution of 465 eV at 662 keV.
In this paper, we describe the fabrication of low leakage Superconductor/Insulator/Superconductor (SIS) junctions with a Nb/Al/AlOx/Al/Nb structure. In other words, an extra Al layer was added onto the top of the insulator in a conventional Nb/Al/AlOx/Nb junction. We measured the current and voltage (IV) characteristics of both the Nb/Al/AlOx/Al/Nb and Nb/Al/AlOx/Nb junctions at the temperature of liquid helium, and found that the sub-gap leakage current in the Nb/Al/AlOx/Al/Nb junctions was much lower than that of the Nb/Al/AlOx/Nb junctions. Our analysis of the IV characteristics indicates that the quality of the AlOx insulator used in the Nb/Al/AlOx/Al/Nb junction was close to ideal, while the insulator used in the Nb/Al/AlOx/Nb junction had possible defects. According to the scanning transmission electron microscope (STEM) images and energy-dispersive X-ray spectroscopy (EDX) analyses, it was evident that the Nb atoms diffused into the bottom electrode of the Nb/Al/AlOx/Nb junction, while a smaller number diffused into the bottom electrode of the Nb/Al/AlOx/Al/Nb junction. Therefore, we conclude that the extra Al layer effectively acted as a buffer layer that prevented the Nb atoms from diffusing into the insulator and bottom electrode. The presence of the top Al layer is expected to favorably improve the quality of junctions with a very high current density, and support the extension of the RF and IF bandwidths of SIS mixers.
Antenna-coupled kinetic inductance detectors (KIDs) have recently shown great promise as microwave detection systems with a large number of channels. However, this technique still has difficulties in eliminating the radiation loss of the resonator signals. To solve this problem, we propose a design in which the absorption area connected to an antenna is located on the ground side of a coplanar waveguide. Thereby, radiation loss due to leakage from the resonator to the antenna can be considerably reduced. This simple design also enables the use of a contact aligner for fabrication. We have developed KIDs with this design, named ground-side absorption (GSA)-KIDs, and demonstrated that they have higher quality factors than those of the existing KIDs, while maintaining a good total sensitivity.
An equivalent-circuit model is an effective tool for the analysis and design of metamaterials. This paper describes a systematic and theoretical method for the circuit modeling of meta-atoms. We focus on the structures of wired metallic spheres and propose a method for deriving a sophisticated equivalent circuit that has the same topology as the wires using the partial element equivalent circuit (PEEC) method. Our model contains the effect of external electromagnetic coupling: excitation by an external field modeled by voltage sources and radiation modeled by the radiation resistances for each mode. The equivalent-circuit model provides the characteristics of meta-atoms such as the resonant frequencies and the resonant modes induced by the current distribution in the wires by an external excitation. Although the model is obtained by a very coarse discretization, it provides a good agreement with an electromagnetic simulation.
In this paper, a non-isolated bidirectional DC-DC converter with zero voltage switching and constant switching frequency is proposed. Unlike active clamp bidirectional converters, only one auxiliary switch is used to create the soft switching condition in both directions, which reduces conduction losses and the complexity of the circuit. The proposed converter is controlled by pulse width modulation and the switches are gated complementarily; thus, the implementation of the control circuit is simple. Low switching losses, high efficiency, and high power density are the advantages of this converter. The simulation and experimental results of the converter verify the theoretical analysis. Based on an implemented 80 W prototype of the proposed converter, the measured efficiency is 96.5%.
The major task in compact modeling for high power devices is to predict the switching waveform accurately because it determines the energy loss of circuits. Device capacitance mainly determines the switching characteristics, which makes accurate capacitance modeling indispensable. This paper presents a newly developed compact model, HiSIM-GaN [Hiroshima University STARC IGFET Model for Gallium-Nitride-based High Electron Mobility Transistors (GaN-HEMTs)], which focuses on the accurate modeling of the field plate (FP), which is introduced to delocalize the electric-field peak that occurs at the electrode edge. We demonstrate that the proposed model reproduces capacitance measurements of a GaN-HEMT accurately without fitting parameters. Furthermore, the influence of the field plate on the performance of the studied circuit is analyzed.
A comprehensive model is presented for estimating the bit error rate (BER) of write disturbance in a resistive memory composed of a cross-point array. While a datum is written into the selected address, the non-selected addresses are biased by the word-line (WL) and bit-line (BL). The datum stored in a non-selected address will be disturbed if the bias is large enough. The current flowing through the non-selected address must be calculated in order to estimate the BER of the write disturbance. Since it takes a long time to calculate the current flowing in a large-scale cross-point array, several simplified circuits have been utilized to decrease the calculation time. However, these simplified circuits are applicable to the selected address, not to the non-selected one. In this paper, new simplified circuits are proposed for calculating the current flowing through the non-selected address. Using both the proposed and the conventional simplified circuits, the trade-off between the write disturbance and the write error is discussed. Furthermore, an error correcting code (ECC) is introduced to improve the trade-off and to provide a low-cost memory chip matching current production lines.
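The benefit of ECC for a given raw BER can be illustrated with the standard binomial estimate of codeword failure. This is a generic sketch, not the model proposed in the paper: it assumes independent bit errors and a code that corrects up to t errors per n-bit codeword. The codeword length, correction capability, and raw BER values below are hypothetical.

```python
from math import comb


def codeword_failure_prob(raw_ber, n_bits, t_correctable):
    """Probability that an n-bit codeword has more than t bit errors.

    ASSUMPTION: bit errors are independent with probability raw_ber.
    """
    return sum(
        comb(n_bits, k) * raw_ber**k * (1 - raw_ber) ** (n_bits - k)
        for k in range(t_correctable + 1, n_bits + 1)
    )


# Hypothetical example: 72-bit codeword, with and without single-error correction.
for raw_ber in (1e-3, 1e-4, 1e-5):
    no_ecc = codeword_failure_prob(raw_ber, 72, 0)
    with_ecc = codeword_failure_prob(raw_ber, 72, 1)
    print(f"raw BER {raw_ber:.0e}: no ECC {no_ecc:.2e}, t=1 ECC {with_ecc:.2e}")
```

Because the failure probability with correction scales roughly with a higher power of the raw BER, a modest ECC lets the array tolerate a larger disturbance-induced BER, which is the sense in which ECC can improve the trade-off.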
A wideband noise-cancelling receiver front-end is proposed in this brief. As a basic architecture, a low-noise transconductance amplifier, a passive mixer, and a transimpedance amplifier are employed to compose the wideband receiver. To achieve wideband input matching for the transconductor, a global feedback method is adopted. Since the wideband receiver has to minimize linearity degradation if a large blocker signal exists out-of-band, a linearization technique is applied for the transconductor circuit. The linearization cancels third-order intermodulation distortion components and increases linearity; however, the additional circuits used in linearization generate excessive noise. A noise-cancelling architecture that employs an auxiliary path cancels noise signals generated in the main path. The designed receiver front-end is fabricated using a 65-nm CMOS process. The receiver operates in the frequency range of 25 MHz-2 GHz with a gain of 49.7 dB. The in-band input-referred third-order intercept point is improved by 12.3 dB when the linearization is activated, demonstrating the effectiveness of the linearization technique.
This work reports a novel power-rail electrostatic discharge (ESD) clamp circuit with parasitic bipolar-junction-transistor (BJT) and channel parallel shunt paths. The parallel shunt paths are formed by delivering a tiny fraction of the drain voltage to the gate terminal of the clamp device in ESD events. Under such a mechanism, the proposed circuit achieves enhanced robustness over those of both gate-grounded NMOS (ggNMOS) and the referenced gate-coupled NMOS (gcNMOS). In addition, the proposed circuit achieves improved fast power-up immunity over that of the referenced gcNMOS. All investigated designs are fabricated in a 65-nm CMOS process. Transmission-line-pulsing (TLP) and human-body-model (HBM) test results have both confirmed the performance enhancements of the proposed circuit. Finally, this work also shows that the achieved performance enhancements remain valid with other trigger circuits.