Mobile computing devices employ multiple task-specific IP cores to improve performance and energy efficiency. Because main memory is shared by all CPU cores and IPs, it becomes a critical bottleneck: a growing number of multimedia IPs consume significant memory bandwidth, whereas improvements in memory bandwidth lag behind. We propose an IP-aware cache management scheme for the last-level cache that reduces the load on main memory while satisfying the QoS requirements of frame-based multimedia applications. Evaluation with CPU applications and 4K/60 fps video streaming applications shows that the proposed scheme reduces DRAM bandwidth usage by 28.62% on average compared with LRU and completely avoids frame drops.
This article proposes an efficient background timing-skew calibration algorithm for time-interleaved analog-to-digital converters (TIADCs). The algorithm detects sampling-time mismatches by estimating the skew-related errors against a reference channel, and aligns the sampling edge of each sub-ADC with that of the reference channel using analog variable-delay lines in a negative feedback loop. Compared with conventional background calibration methods, which rely on complex algorithms or impose severe input restrictions, the proposed technique detects timing skews with negligible hardware consisting of simple digital blocks and is applicable to a wide range of inputs, including completely random signals. Detailed theoretical analysis and simulation results show that the algorithm is insensitive to non-idealities in practical circuits, such as mismatches between channels and clock jitter, which verifies the practicability and robustness of the method.
This study presents an 8 Gbps low-power source-series terminated (SST) transmitter for high-speed serial links. The proposed transmitter consists of a novel hybrid 20:2 multiplexer followed by a three-tap feed-forward equalizer (FFE) and a shunt-path SST driver. In addition, a high-precision impedance calibration circuit with slice unit replication is proposed to match the characteristic impedance of the channel; its maximum calibration error is 0.002%. The transmitter is fabricated in 55-nm CMOS technology and occupies an area of 0.024 mm². Measurement results show that the transmitter achieves a data rate of 8 Gbps while maintaining good performance. It has an output swing of 510 mV with −6 dB post-tap equalization and consumes 15.1 mW from a 1.2 V supply.
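The role of the post tap in a three-tap FFE can be illustrated with a small behavioral model. This is only a sketch: the tap weights below are hypothetical values chosen so that the post-tap de-emphasis comes out near −6 dB, and they are not taken from the transmitter described above. The names `TAPS`, `ffe`, and `post_tap_deemphasis_db` are likewise illustrative.

```python
import math

# Hypothetical tap weights (pre, main, post); NOT taken from the abstract.
TAPS = (0.0, 0.75, -0.25)

def ffe(bits, taps=TAPS):
    """Behavioral three-tap FFE: weighted sum of next, current and previous symbol."""
    pre, main, post = taps
    symbols = [1.0 if b else -1.0 for b in bits]  # map bits to +/-1 levels
    out = []
    for n, cur in enumerate(symbols):
        # At the sequence edges, assume the line holds the same symbol.
        nxt = symbols[n + 1] if n + 1 < len(symbols) else cur
        prev = symbols[n - 1] if n > 0 else cur
        out.append(pre * nxt + main * cur + post * prev)
    return out

def post_tap_deemphasis_db(taps=TAPS):
    """Ratio of the repeated-bit level to the post-transition level, in dB."""
    _, main, post = taps
    steady = main + post       # current bit equals previous bit
    transition = main - post   # current bit differs from previous bit
    return 20 * math.log10(abs(steady) / abs(transition))

if __name__ == "__main__":
    print(ffe([0, 1, 1, 1]))
    print(round(post_tap_deemphasis_db(), 2), "dB")
```

With these assumed weights, the bit right after a transition is driven at full amplitude and repeated bits at half amplitude, which is what a −6 dB post-tap setting means in this simplified model.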
In this paper, we propose an improved design method for critical path replicas (CPRs) in wide-voltage designs. The timing accuracy of the CPR over a wide operating-voltage range is improved by applying load matching and transistor-level static timing analysis (TSTA). We applied the proposed method to 100 critical paths of the ISCAS’95 benchmark circuits. Simulation results in an SMIC 55 nm process show that a CPR designed with the proposed method can operate from 0.3 V to 1.2 V with a delay error (DE) of only 0.25%.
A three-dimensional (3-D) frequency selective surface (FSS) based on multiple square coaxial waveguides (SCWs) is proposed, which realizes a dual-bandpass response with close band spacing. In the unit cell of the proposed 3-D FSS, two concentric SCW propagation paths and one parallel-plate waveguide (PPW) propagation path are constructed using three square metallic tubes. Each SCW propagation path intrinsically generates one transmission pole through the square slot resonance, and the PPW propagation path produces another transmission pole through the half-wavelength resonance. In addition, two transmission zeros are introduced by the cancellation of out-of-phase signals between different paths, which improves the frequency selectivity. After the design parameters are properly adjusted, two desired passbands are obtained around 4.4 and 4.845 GHz, giving a band ratio of only 1.1. The operating principle of the proposed 3-D FSS is investigated through the electric-field distributions, an equivalent circuit model, and a parametric study. A prototype of the proposed 3-D FSS is fabricated and measured. The measured results show that the proposed design achieves a stable response at incidence angles up to 45° for both TE and TM polarizations.
In this letter, a combinational-logic-reduced belief propagation (BP) decoder for polar codes is designed in 55 nm CMOS technology. The BP decoding algorithm for polar codes is first introduced, and the architectures of conventional BP decoders are then analyzed. Finally, the hardware implementation with the proposed multiplexed processing element architecture is presented. Synthesis results show that the consumption of hardware resources is reduced by 36%. The architecture and circuit techniques reduce the power to 398 mW for an energy efficiency of 292 pJ/b. The throughput is improved to 4.36 Gbps by applying the G-matrix early stopping criterion.
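As commonly described for BP polar decoders, the G-matrix early stopping criterion ends the iterations once the current hard-decision estimates of the information vector and the codeword are consistent with the generator matrix. The following is a minimal software sketch of that check, assuming the generator matrix is the plain Kronecker power of the 2 × 2 polar kernel without bit-reversal. The function names and the caller-supplied `bp_iteration` callback are hypothetical and do not describe the decoder hardware in the abstract above.

```python
def generator_matrix(n_stages):
    """Kronecker power of the polar kernel F = [[1, 0], [1, 1]] (no bit-reversal)."""
    F = [[1, 0], [1, 1]]
    G = [[1]]
    for _ in range(n_stages):
        # Kronecker product G (x) F, built row by row.
        G = [[g * f for g in g_row for f in f_row] for g_row in G for f_row in F]
    return G

def g_matrix_check(u_hat, x_hat, G):
    """True when x_hat equals u_hat * G over GF(2)."""
    n = len(G)
    encoded = [sum(u_hat[i] * G[i][j] for i in range(n)) % 2 for j in range(n)]
    return encoded == list(x_hat)

def decode_with_early_stop(bp_iteration, max_iters, G):
    """Run BP iterations (max_iters >= 1) until the G-matrix check passes.

    bp_iteration is a hypothetical callback that performs one BP iteration
    and returns the current hard decisions (u_hat, x_hat).
    """
    for it in range(1, max_iters + 1):
        u_hat, x_hat = bp_iteration()
        if g_matrix_check(u_hat, x_hat, G):
            return u_hat, it          # converged early
    return u_hat, max_iters           # iteration budget exhausted
```

Stopping as soon as the check passes saves the remaining iterations, which is how an early stopping rule can raise average throughput.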
A mm-wave 5-bit digital attenuator with low root-mean-square (RMS) amplitude error and low phase variation is presented in 65 nm CMOS. The attenuator combines a PI/T-type topology with embedded switches and a PI-type topology with single-pole double-throw (SPDT) switches to alleviate the insertion-loss issue of the conventional PI/T-type topology with embedded switches in the mm-wave frequency band, and achieves a high attenuation range while maintaining a compact chip size. An amplitude/phase calibration technique is proposed to reduce the RMS amplitude error and phase variation and to improve circuit robustness. The presented attenuator has been integrated in a Ka-band phased-array transmit front-end module and achieves 15.5 dB attenuation coverage with a step of 0.5 dB. The RMS amplitude error and RMS phase variation are 0.13–0.48 dB and 1.24–2.08° across 25–35 GHz, respectively. In particular, the attenuator achieves an RMS amplitude error of 0.13–0.25 dB when the operating frequency is limited to 26.9–31.4 GHz. The core chip size is 434 µm × 360 µm.
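The relation between the 5-bit control word, the 0.5 dB step, and the 15.5 dB coverage quoted above can be checked with a few lines. The RMS amplitude error function uses one common definition (root mean square of measured minus nominal relative attenuation over all states); the abstract does not state which definition the authors use, so treat it as an assumption. All names are illustrative.

```python
import math

BITS = 5        # from the abstract
STEP_DB = 0.5   # from the abstract

def nominal_attenuation_db(code):
    """Nominal relative attenuation for a control code in 0 .. 2**BITS - 1."""
    return code * STEP_DB

def rms_amplitude_error_db(measured_db):
    """RMS of (measured - nominal) over all states; an assumed definition.

    measured_db[code] is the measured attenuation relative to the reference state.
    """
    errors = [m - nominal_attenuation_db(c) for c, m in enumerate(measured_db)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

if __name__ == "__main__":
    states = [nominal_attenuation_db(c) for c in range(2 ** BITS)]
    print(len(states), "states, maximum", max(states), "dB")
```

A 5-bit word gives 32 states, and the largest code, 31, multiplied by 0.5 dB yields the 15.5 dB range.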
The Pair Hidden Markov Model (Pair-HMM) forward algorithm is gaining popularity in biological research tools. We propose a novel non-cooperative structure for a Pair-HMM forward algorithm accelerator on a Field Programmable Gate Array (FPGA). The structure employs a task-level parallel scheme, in which each non-cooperative Processing Element (PE) completes the Pair-HMM forward algorithm independently. A three-layer tree topology improves scalability across different FPGAs. Compared with previous work, our structure reduces the idle cycles that occur in the systolic array structure and the PE ring structure. Compared with the PE ring, our implementation on Arria 10 achieves a 1.19× speedup.
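For readers unfamiliar with the computation that each PE carries out, here is a minimal software sketch of a Pair-HMM forward recursion. It follows the textbook three-state formulation (match, insert-in-x, insert-in-y) rather than the exact variant used by any particular tool or by the accelerator above, and every transition and emission parameter is a placeholder assumption.

```python
def pair_hmm_forward(x, y, delta=0.1, eps=0.3, tau=0.05, p_match=0.9):
    """Total probability of sequences x and y under a three-state pair HMM.

    delta: gap-open, eps: gap-extend, tau: termination probability.
    All parameter values are illustrative assumptions.
    """
    n, m = len(x), len(y)
    q = 0.25  # background emission probability for a single nucleotide

    def p(a, b):
        # Joint emission of an aligned pair; sums to 1 over all 16 pairs.
        return 0.25 * (p_match if a == b else (1.0 - p_match) / 3.0)

    fM = [[0.0] * (m + 1) for _ in range(n + 1)]
    fX = [[0.0] * (m + 1) for _ in range(n + 1)]
    fY = [[0.0] * (m + 1) for _ in range(n + 1)]
    fM[0][0] = 1.0  # start in the match state before any symbol is emitted

    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            if i > 0 and j > 0:
                fM[i][j] = p(x[i - 1], y[j - 1]) * (
                    (1 - 2 * delta - tau) * fM[i - 1][j - 1]
                    + (1 - eps - tau) * (fX[i - 1][j - 1] + fY[i - 1][j - 1])
                )
            if i > 0:
                fX[i][j] = q * (delta * fM[i - 1][j] + eps * fX[i - 1][j])
            if j > 0:
                fY[i][j] = q * (delta * fM[i][j - 1] + eps * fY[i][j - 1])

    return tau * (fM[n][m] + fX[n][m] + fY[n][m])

if __name__ == "__main__":
    # Each (x, y) pair is an independent task, the unit of task-level parallelism.
    tasks = [("ACGT", "ACGT"), ("ACGT", "AGGT"), ("ACG", "ACGT")]
    for x, y in tasks:
        print(x, y, pair_hmm_forward(x, y))
```

Because each call depends only on its own pair of sequences, separate pairs can be evaluated with no communication between workers, which is the property a task-level parallel scheme relies on.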
In this letter, a compact, low-cost ultra-wideband elliptic antenna is presented. The proposed antenna is printed on a 1.6 mm thick FR4 substrate with a size of 21 × 27 × 1.6 mm³. It is composed of a radiating patch formed by four ellipses and a reduced ground plane with three rectangular slots. Measurements show that the antenna operates over a wide impedance bandwidth of 16.26 GHz (3.12 GHz to 19.38 GHz) with a maximum gain of 6 dBi and an omnidirectional radiation pattern. Simulation and experimental results are presented and discussed.
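The quoted band edges can be turned into the usual bandwidth figures with a short calculation. The fractional bandwidth below uses the standard definition, bandwidth divided by the arithmetic center frequency; this figure is not reported in the abstract and is derived here only from the two band edges it gives. Function names are illustrative.

```python
def absolute_bandwidth_ghz(f_low, f_high):
    """Width of the operating band in GHz."""
    return f_high - f_low

def fractional_bandwidth(f_low, f_high):
    """Bandwidth relative to the arithmetic center frequency (standard definition)."""
    return 2.0 * (f_high - f_low) / (f_high + f_low)

if __name__ == "__main__":
    f_low, f_high = 3.12, 19.38  # band edges in GHz, from the abstract
    print(round(absolute_bandwidth_ghz(f_low, f_high), 2), "GHz")
    print(round(100 * fractional_bandwidth(f_low, f_high), 1), "% fractional bandwidth")
```

The band edges reproduce the stated 16.26 GHz and correspond to a fractional bandwidth of roughly 145% around an 11.25 GHz center.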