We describe recent progress on a Nb nine-layer fabrication process for large-scale single flux quantum (SFQ) circuits. A device fabricated in this process is composed of an active layer including Josephson junctions (JJs) at the top, passive transmission line (PTL) layers in the middle, and a DC power layer at the bottom. We describe the process conditions and the fabrication equipment. We use both diagnostic chips and shift register (SR) chips to improve the fabrication process. The diagnostic chip was designed to evaluate the characteristics of basic elements such as junctions, contacts, resistors, and wiring, as well as their defects. The SR chip was designed to evaluate defects as a function of the size of the SFQ circuits. A long-term evaluation of the diagnostic and SR chips showed a fairly good correlation between the defects found on the diagnostic chips and the yields of the SRs. We obtained a yield of 100% for SRs including 70,000 JJs. These results show that considerable progress has been made in reducing the number of defects and improving reliability.
Single flux quantum (SFQ) logic is expected to be a next-generation high-speed and low-power technology in the field of logic circuits. However, CMOS, the dominant technology for conventional processors, cannot simply be replaced with SFQ technology because feedback loops and conditional branches are difficult to implement in SFQ circuits. This paper investigates the applicability of a reconfigurable data-path (RDP) accelerator based on SFQ circuits. The authors introduce detailed specifications of the SFQ-RDP architecture and compare its performance and power/performance ratio with those of a graphics-processing unit (GPU). The results show up to 1600 times higher efficiency in terms of Flops/W (floating-point operations per second/Watt) for some high-performance computing application programs.
Superconducting Single-Flux-Quantum (SFQ) devices have attracted much attention as alternative devices for digital circuits because of their high switching speed and low power consumption. For large-scale circuit design, the role of a computer-aided design environment is significant. As the characteristics of SFQ devices differ from those of conventional devices, a new design environment is required. In this paper, we propose a new timing-aware circuit description method that can be used for SFQ circuit design. Based on this description and the dedicated algorithms we have been developing for SFQ logic circuit design, we propose an integrated design flow for SFQ logic circuits. We designed a circuit using our design tools and the proposed design flow, and demonstrated its correct operation.
We describe a large-scale integrated circuit (LSI) design of rapid single-flux-quantum (RSFQ) circuits and demonstrate several reconfigurable data-path (RDP) processor prototypes based on the ISTEC Advanced Process (ADP2). The ADP2 LSIs are made up of nine Nb layers and Nb/AlOx/Nb Josephson junctions with a critical current density of 10kA/cm², allowing higher operating frequencies and integration. To realize truly large-scale RSFQ circuits, careful design is necessary, with several compromises in the device structure, logic gates, and interconnects, balancing the competing demands of integration density, design flexibility, and fabrication yield. We summarize numerical and experimental results related to the development of a cell-based design in the ADP2, which features a unit cell size reduced to a 30-µm square and up to four strip line tracks in the unit cell underneath the logic gates. The ADP LSIs can achieve ∼10 times the device density and double the operating frequency with the same power consumption per junction as conventional LSIs fabricated using the Nb four-layer process. We report the design and test results of RDP processor prototypes using the ADP2 cell library. The RDP processors are composed of many arrays of floating-point units (FPUs) and switch networks, and serve as accelerators in a high-performance computing system. The prototypes are composed of two-dimensional arrays of several arithmetic logic units instead of FPUs. The experimental results include a successful demonstration of full operation and reconfiguration in a 2×2 RDP prototype made up of 11.5k junctions at 45GHz after precise timing design. Partial operation of a 4×4 RDP prototype made up of 28.5k junctions is also demonstrated, indicating the scalability of our timing design.
We report the successful operation of a low-power arithmetic logic unit (ALU) based on a low-voltage rapid single-flux-quantum (LV-RSFQ) logic circuit, in which a dc bias current is fed to the circuits from lowered constant-voltage sources through small resistors. Both the static and dynamic energy consumptions are reduced because of the reduction in the amplitudes of voltage pulses across the Josephson junctions, with a trade-off of slightly slower switching speeds. The designed bias voltage was set to 0.25mV, which is one-tenth that of our standard RSFQ circuit design. We investigated several issues related to such low-voltage operation, including margins and timing design. To achieve successful operation, we tuned the circuit parameters in the logic gate design and carefully controlled the timing by considering the interference of pulse signals. We show test results for the low-voltage ALU in on-chip high-speed testing. The circuit was fabricated using the AIST Nb/AlOx/Nb Advanced Process with a critical current density of 10kA/cm². We verified that arithmetic and logical operations were correctly implemented, obtained dc bias margins of 18% at a target clock frequency of 20GHz, and achieved a maximum clock frequency of 28GHz with a power consumption of 28µW. These experimental results indicate an energy efficiency 3.6 times that of the standard RSFQ circuit design.
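As a rough illustration of the figures quoted above, the average energy dissipated per clock cycle can be estimated as E = P / f. The sketch below assumes that the quoted 28µW applies at the 28GHz maximum clock frequency, which the abstract does not state explicitly; the function name is hypothetical.

```python
def energy_per_cycle_joules(power_watts: float, clock_hz: float) -> float:
    """Average energy per clock cycle, E = P / f."""
    if clock_hz <= 0:
        raise ValueError("clock frequency must be positive")
    return power_watts / clock_hz


# Figures from the abstract; pairing them is an assumption (see lead-in).
energy = energy_per_cycle_joules(28e-6, 28e9)
print(f"{energy * 1e15:.2f} fJ per clock cycle")
```

Under that assumption the ALU as a whole dissipates on the order of 1 fJ per clock cycle.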
We propose an improved design of a neuron circuit, using coupled-SQUID gates, for a superconducting neural network. An activation function with step-like input vs. output characteristics is desirable for a neuron circuit used to solve a combinatorial optimization problem. To obtain such characteristics, the proposed neuron circuit is composed of two coupled-SQUID gates connected in cascade. The designed neuron circuit was fabricated using a 2.5kA/cm² Nb/AlOx/Nb process, and the operation of a fabricated neuron circuit is experimentally demonstrated. The performance of a neural network using the proposed neuron circuits is also estimated by numerical dynamic simulations.
A promising application of a single-flux-quantum (SFQ) circuit is read-out circuitry for a multi-channel superconductive sensor array. In such applications, the SFQ read-out circuit is expected to operate outside a magnetic shield. We investigated an SFQ circuit structure that is tolerant to an external magnetic field, using the AIST 2.5kA/cm² Nb standard 2 process, which has four Nb wiring layers including the ground plane. By covering the entire circuit with an upper Nb wiring layer called the control (CTL) layer, the influence of the external magnetic field on the SFQ circuit operation can be avoided. We experimentally evaluated the sheet inductance of the wiring layer underneath the CTL shielding layer to design a magnetic-field-tolerant SFQ circuit. We implemented and measured test circuits comprising toggle flip-flops (TFFs) to evaluate their magnetic field tolerances. The operating margin and maximum operating frequency of the designed TFF did not deteriorate with increases in the magnetic field applied to the test circuit, whereas the operating margin of the conventional TFF was reduced by applying the magnetic field. We have also demonstrated the high-speed operation of the designed TFF in an unshielded environment at a frequency of up to 120GHz with a wide operating margin.
We have been developing a superconducting time-of-flight mass spectrometry (TOF-MS) system, which utilizes a superconductive strip ion detector (SSID) and a single-flux-quantum (SFQ) multi-stop time-to-digital converter (TDC). The SFQ multi-stop TDC can measure the time intervals between multiple input signals and directly convert them into binary data. In this study, we designed and implemented 24-bit SFQ multi-stop TDCs with a 3×24-bit FIFO buffer using the AIST Nb standard process (STP2); their time resolution and dynamic range are 100ps and 1.6ms, respectively. The timing jitter of the TDC was investigated by comparing two types of TDCs: one uses an on-chip SFQ clock generator (CG) and the other uses a microwave oscillator at room temperature. We confirmed the correct operation of both TDCs and evaluated their timing jitter. The experimentally obtained timing jitter is about 40ns and 700ps for the TDCs with and without the on-chip SFQ CG, respectively, for a measured time interval of 50µs, and it increases linearly with the measured time interval.
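The quoted resolution and dynamic range can be cross-checked with a short calculation. The sketch below assumes the TDC behaves like a plain binary counter whose least significant bit equals the 100ps time resolution, so that the full-scale range is 2^24 × 100ps; the abstract does not spell out this relation, and the function name is hypothetical.

```python
def tdc_full_scale_seconds(counter_bits: int, resolution_s: float) -> float:
    """Full-scale range of a binary-counter TDC: 2**bits * resolution."""
    if counter_bits < 1:
        raise ValueError("counter must have at least one bit")
    return (2 ** counter_bits) * resolution_s


# 24-bit counter and 100 ps resolution, as quoted in the abstract.
full_scale = tdc_full_scale_seconds(24, 100e-12)
print(f"{full_scale * 1e3:.2f} ms")
```

Under this assumption the full-scale range comes out at roughly 1.68ms, which is consistent with the 1.6ms dynamic range quoted above.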
Recently, we proposed a new data-path architecture, named a large-scale reconfigurable data-path (LSRDP), based on single-flux-quantum (SFQ) circuits, to establish a fundamental technology for future high-end computers. In this architecture, a large number of SFQ floating-point units (FPUs) are used as core components, and their high performance and low power consumption are essential. In this research, we implemented an SFQ half-precision bit-serial floating-point multiplier (FPM) with a target clock frequency of 50GHz, using the AIST 10kA/cm² Nb process. The FPM was designed based on a systolic-array architecture. It contains 11,066 Josephson junctions, including on-chip high-speed test circuits. The size and power consumption of the FPM are 6.66mm × 1.92mm and 2.83mW, respectively. Its correct operation was confirmed by on-chip high-speed tests at maximum frequencies of 93.4GHz for the exponent part and 72.0GHz for the significand part.
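For orientation, the quoted size, junction count, and power can be turned into per-junction figures. The sketch below is simple arithmetic on the numbers in the abstract; it assumes the junctions and the power are spread evenly over the chip, which is an idealization, and the variable names are hypothetical.

```python
# Figures quoted in the abstract.
junction_count = 11_066
width_mm, height_mm = 6.66, 1.92
power_watts = 2.83e-3

# Derived averages (assumes uniform distribution over the circuit area).
area_mm2 = width_mm * height_mm
junctions_per_mm2 = junction_count / area_mm2
power_per_junction_watts = power_watts / junction_count

print(f"area: {area_mm2:.2f} mm^2")
print(f"density: {junctions_per_mm2:.0f} junctions/mm^2")
print(f"power per junction: {power_per_junction_watts * 1e6:.3f} uW")
```

These averages include the on-chip test circuits, since the abstract counts them in the junction total.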
We present the design and operation of a 6-bit quasi-triangle voltage waveform generator comprising three circuit blocks: an improved variable Pulse Number Multiplier (variable-PNM), a Code Generator (CG), and a Double-Flux-Quantum Amplifier (DFQA). They are integrated into a single chip using a niobium Josephson junction technology. While the multiplication factor of our previous m-bit variable-PNM was limited to between 2^(m-1) and 2^m, that of the improved one is extended to between 1 and 2^m. Correct operation of the 6-bit variable-PNM is confirmed in low-speed testing with respect to the codes from the CG, and generation of a 6-bit, 0.20mVpp quasi-triangle voltage waveform is demonstrated with the 10-fold DFQA in high-speed testing.
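To make the change in multiplication range concrete, the sketch below evaluates both ranges for the 6-bit case. It reads the bounds in the abstract as powers of two, 2^(m-1) and 2^m, which is an interpretation of the flattened superscripts in the source; the function name is hypothetical and does not model the circuit itself.

```python
from typing import Tuple


def multiplication_range(bits: int, improved: bool) -> Tuple[int, int]:
    """Return (lowest, highest) multiplication factor of an m-bit variable-PNM.

    Assumes the previous design spans 2**(m-1)..2**m and the improved
    design spans 1..2**m, as read from the abstract.
    """
    if bits < 1:
        raise ValueError("bits must be at least 1")
    lowest = 1 if improved else 2 ** (bits - 1)
    return lowest, 2 ** bits


previous_range = multiplication_range(6, improved=False)
improved_range = multiplication_range(6, improved=True)
print(previous_range, improved_range)
```

For m = 6 this gives factors from 32 to 64 for the previous design and from 1 to 64 for the improved one.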
This paper is concerned with a method to reduce the computation time of the Discrete Ray Tracing Method (DRTM), which was proposed to numerically analyze electromagnetic fields above Random Rough Surfaces (RRSs). The essence of the DRTM is first to search for rays between source and receiver and then to compute electric fields based on the traced rays. The DRTM discretizes not only the RRSs but also the ray tracing procedure. In order to reduce the computation time for ray searching, the authors propose to replace the conventional algorithm, which discretizes RRSs with equal intervals, with a new one that discretizes them with unequal intervals according to their profiles. The authors also use an approximation of the Fresnel function, which reduces the field computation time. The authors discuss the reduction rate for the computation time of the DRTM from the numerical viewpoints of ray searching and field computation. Finally, this paper shows how much computation time is saved by the new method.
A digital background calibration technique using signal-dependent dithering is proposed to correct the nonlinear errors that result from capacitor mismatches and finite opamp gain in a pipelined analog-to-digital converter (ADC). Large magnitude dithers are used to measure and correct both errors simultaneously in the background. In the proposed calibration system, the 2.5-bit capacitor-flip-over multiplying digital-to-analog converter (MDAC) stage is modified for the injection of large magnitude dithering by adding six additional comparators, so that only three correction parameters in every stage subjected to correction are measured and extracted by a simple calibration algorithm with a multibit first stage. Behavioral simulation results show that, using the proposed calibration technique, the signal-to-noise-and-distortion ratio improves from 63.3 to 79.3dB and the spurious-free dynamic range increases from 63.9 to 96.4dB after calibrating the first two stages, in a 14-bit 100-MS/s pipelined ADC with σ=0.2% capacitor mismatches and 60dB nonideal opamp gain. The time needed to calibrate the first two stages is around 1.34 seconds for the modeled ADC.
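The reported signal-to-noise-and-distortion ratios can be expressed as an effective number of bits (ENOB) using the standard relation ENOB = (SNDR − 1.76dB) / 6.02dB, and the calibration time can be expressed as a number of conversions. The sketch below is a convenience calculation on the quoted figures, not part of the proposed calibration algorithm; the function name is hypothetical, and the sample count assumes the ADC converts continuously at 100 MS/s during calibration.

```python
def enob_from_sndr(sndr_db: float) -> float:
    """Effective number of bits from SNDR in dB: (SNDR - 1.76) / 6.02."""
    return (sndr_db - 1.76) / 6.02


# SNDR before and after calibration, as quoted in the abstract.
enob_before = enob_from_sndr(63.3)
enob_after = enob_from_sndr(79.3)
print(f"ENOB before: {enob_before:.2f} bits, after: {enob_after:.2f} bits")

# Conversions elapsed during calibration (assumes continuous 100 MS/s sampling).
calibration_samples = round(1.34 * 100e6)
print(f"calibration samples: {calibration_samples}")
```

On these figures the calibration raises the ENOB from about 10.2 bits to about 12.9 bits.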
In order to analyze the impact of threshold voltage (Vth) fluctuation induced by random telegraph noise (RTN) on LSI circuit design, we measured a 40-nm 6-Tr-SRAM TEG that enables evaluation of individual bit-line currents. The RTN phenomenon was successfully measured, and we identified the transfer MOSFET as the most sensitive MOSFET in an SRAM bit-cell. The proposed word line boosting technique, which applies slight extra stress to the transfer MOSFET, improves the detection probability of fail-bit cells caused by RTN by about 30%.
A reduction in the intensity deviation of a nine-channel optical frequency comb block (OFCB) is demonstrated, by adopting an asymmetric differential drive method for an InP-based dual drive Mach-Zehnder modulator. The generation of a tailored OFCB with an intensity deviation of less than 0.8dB is confirmed by using the modulator.