A blind-oversampling receiver with adaptive oversample-level decision-feedback equalization (DFE) is presented for use in global on-chip serial links. Blind oversampling is adopted to achieve reliable channel data reception without receiver synchronization, and the adaptive oversample-level DFE is used to reduce data-dependent jitter and ease oversampled data recovery regardless of PVT variations. Test results in a 0.13-µm CMOS process indicate that the proposed approach achieves up to a 37.5% improvement in data rate compared with conventional approaches, and verify operation at a 2.80-Gb/s data rate with 2.31-pJ/bit energy over a 10-mm-long lossy global on-chip interconnect.
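As a rough illustration of blind-oversampling data recovery in general, the sketch below picks a sampling phase from the oversampled stream itself, with no synchronized receiver clock. It is a minimal textbook-style sketch and not the receiver described above: the function name `recover_bits`, the edge-histogram phase selection, and the 4x oversampling ratio are all assumptions, and the oversample-level DFE is not modeled.

```python
def recover_bits(samples, osr):
    """Recover one bit per symbol from a blindly oversampled 0/1 stream.

    samples: list of 0/1 values taken at `osr` samples per bit, with an
             unknown phase relative to the transmitter (assumed input form).
    """
    # Histogram the positions of data transitions, modulo the oversampling ratio.
    edge_hist = [0] * osr
    for i in range(1, len(samples)):
        if samples[i] != samples[i - 1]:
            edge_hist[i % osr] += 1
    # The most frequent transition phase marks the bit boundary.
    edge_phase = max(range(osr), key=lambda p: edge_hist[p])
    # Sample half a bit away from the boundary, near the centre of the eye.
    pick = (edge_phase + osr // 2) % osr
    return [samples[i] for i in range(pick, len(samples), osr)]


bits = [1, 0, 1, 1, 0, 0, 1, 0]
oversampled = [b for b in bits for _ in range(4)]  # 4 samples per bit
aligned = recover_bits(oversampled, 4)
skewed = recover_bits(oversampled[1:], 4)          # unknown one-sample phase offset
```

In this idealized, noise-free case both calls return the transmitted bits. Jitter would smear the edge histogram across neighbouring phases, which is why reducing data-dependent jitter makes this kind of recovery easier.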
Because work-function variation (WFV) in high-k/metal-gate technology increases significantly at sub-30-nm nodes, a simple but reasonable model for quantitatively estimating the WFV is required. In this study, a Monte Carlo simulation that statistically generates grain sizes following two different probability distributions (Gaussian and Rayleigh) is proposed and performed. The grain shapes generated with the Rayleigh distribution are significantly closer than those generated with the Gaussian distribution to the real grain shapes in a TiN metal gate. Consequently, the WFV estimated using the Rayleigh distribution agrees well with previously reported results.
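The grain-size sampling step can be sketched as follows. This is only an illustration under stated assumptions: the 10-nm mean grain size, the Gaussian standard deviation of 30% of the mean, and the rejection of non-positive Gaussian draws are hypothetical choices, and the mapping from grains to work function is not modeled.

```python
import math
import random


def sample_grain_sizes(n, mean_size_nm, distribution, rng):
    """Draw n grain sizes (nm) whose distribution has the requested mean."""
    if distribution == "rayleigh":
        # Rayleigh mean = sigma * sqrt(pi / 2); choose sigma to hit the mean.
        sigma = mean_size_nm / math.sqrt(math.pi / 2.0)
        # Inverse-CDF sampling: x = sigma * sqrt(-2 ln(1 - u)), u in [0, 1).
        return [sigma * math.sqrt(-2.0 * math.log(1.0 - rng.random()))
                for _ in range(n)]
    if distribution == "gaussian":
        std = 0.3 * mean_size_nm  # assumed spread, not taken from the study
        sizes = []
        while len(sizes) < n:
            x = rng.gauss(mean_size_nm, std)
            if x > 0.0:  # a grain size must be positive; reject other draws
                sizes.append(x)
        return sizes
    raise ValueError("distribution must be 'rayleigh' or 'gaussian'")


rng = random.Random(0)
rayleigh_sizes = sample_grain_sizes(20000, 10.0, "rayleigh", rng)
gaussian_sizes = sample_grain_sizes(20000, 10.0, "gaussian", rng)
rayleigh_mean = sum(rayleigh_sizes) / len(rayleigh_sizes)
gaussian_mean = sum(gaussian_sizes) / len(gaussian_sizes)
```

The Rayleigh distribution is defined only for non-negative values and is skewed toward small sizes, whereas the Gaussian is symmetric and needs truncation to stay physical.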
Low utilization of SIMD units and memory bandwidth is the main performance bottleneck for sparse matrix-vector multiplication (SpMV) on SIMD processors, and SpMV is one of the most important kernels in many scientific and engineering applications. This paper proposes a hybrid optimization method to break this bottleneck. The method comprises a new compressed sparse matrix format, a block SpMV algorithm, and a vector write buffer. Experimental results show that the hybrid optimization method achieves an average speedup of 2.09 over the CSR vector kernel across all the matrices, with a maximum speedup of 3.24.
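For reference, the standard CSR (compressed sparse row) layout that the baseline kernel operates on can be sketched as below. This is a plain scalar version written for clarity; it is not the CSR vector kernel itself (which assigns a group of SIMD lanes to each row) and not the new compressed format proposed in the paper, whose details are not given here. The array names `values`, `col_idx`, and `row_ptr` are conventional but chosen for this example.

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Nonzeros of this row occupy values[row_ptr[row]:row_ptr[row + 1]].
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # Indirect (gather) access into x: irregular and hard to vectorize.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# A = [[4, 0, 1],
#      [0, 2, 0],
#      [3, 0, 5]]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]
y = spmv_csr(values, col_idx, row_ptr, x)  # [7.0, 4.0, 18.0]
```

The variable row lengths and the gather through `col_idx` are what leave SIMD lanes idle and memory bandwidth underused in a straightforward vectorization.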
This work presents a method to enhance the linearity of a ramp generator for ADC testing. Using a parasitic model of the capacitance, we develop an extraction method to estimate the parasitic components that affect the linearity of integration in the ramp generator. A negative impedance converter then provides an adjustable negative impedance to compensate for the unwanted parasitic components. The resulting impedance is almost purely capacitive, which enables highly linear current integration in the ramp generator. Simulation results show that the proposed method can generate a highly linear ramp signal from a constant current source. The differential and integral nonlinearity of the ramp signal are reduced to almost one-twentieth of their uncompensated values.
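To see why parasitics bend the ramp and why a negative impedance can straighten it, the sketch below integrates a constant current into a capacitor shunted by a parasitic conductance. The single parallel-conductance model, the component values, and the ideal (exactly cancelling) negative conductance are all assumptions made for illustration; the paper's extracted parasitic model is not reproduced here.

```python
import math


def ramp_voltage(t, current, cap, conductance):
    """Solve C dV/dt = I - G V with V(0) = 0 for a constant current I."""
    if conductance == 0.0:
        return current * t / cap  # pure capacitance: ideal linear ramp
    # V(t) = (I / G) * (1 - exp(-G t / C)), written with expm1 for accuracy.
    return -(current / conductance) * math.expm1(-conductance * t / cap)


def endpoint_nonlinearity(current, cap, conductance, duration, steps=1000):
    """Worst deviation from the end-point line, as a fraction of full scale."""
    v = [ramp_voltage(duration * k / steps, current, cap, conductance)
         for k in range(steps + 1)]
    full_scale = v[-1] - v[0]
    worst = max(abs(v[k] - (v[0] + full_scale * k / steps))
                for k in range(steps + 1))
    return worst / full_scale


I_SRC = 1e-6    # 1 uA current source (assumed)
C_INT = 10e-12  # 10 pF integration capacitor (assumed)
G_PAR = 1e-7    # parasitic shunt conductance, i.e. 10 MOhm (assumed)
G_NIC = -1e-7   # negative conductance from an idealized converter (assumed)
T_RAMP = 10e-6  # 10 us ramp duration (assumed)

uncompensated = endpoint_nonlinearity(I_SRC, C_INT, G_PAR, T_RAMP)
compensated = endpoint_nonlinearity(I_SRC, C_INT, G_PAR + G_NIC, T_RAMP)
```

With these assumed values the uncompensated ramp deviates from a straight line by a little over 1% of full scale, while exact cancellation of the shunt conductance leaves only floating-point rounding error. A real converter cancels the parasitics only approximately, so some residual nonlinearity remains.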
High Efficiency Video Coding (HEVC) is the video coding standard currently under development as the successor to H.264/AVC. In this paper, a fully pipelined 2-D IDCT/IDST VLSI architecture compatible with the HEVC standard is presented for the first time. The proposed architecture supports adaptive-block-size IDCT from 4×4 to 32×32 pixels, as well as IDST, while keeping hardware utilization at nearly 100%. Synthesis results in SMIC 65-nm 1P9M technology show that the architecture achieves a maximum operating frequency of 480 MHz at a hardware cost of about 115.8K gates. Experimental results show that the proposed architecture can process real-time HEVC IDCT/IDST for 4K×2K (4096×2048) video sequences at 30 fps with an average clock frequency of 171 MHz. Consequently, it offers a cost-effective solution for future UHDTV applications.
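A 2-D IDCT is usually computed separably, as two passes of a 1-D transform, and the sketch below shows that decomposition. It uses a floating-point orthonormal DCT-II basis purely for illustration; HEVC specifies integer approximations of the transform, and neither those coefficients, the IDST, nor the pipelined architecture described above are modeled. All function names are chosen for this example.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis C, with frequency index k as the row."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def dct_2d(block):
    """Forward transform Y = C X C^T (used here only to create test input)."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct_2d(coeffs):
    """Inverse transform X = C^T Y C, as two 1-D passes."""
    c = dct_matrix(len(coeffs))
    first_pass = matmul(transpose(c), coeffs)  # 1-D inverse along columns
    return matmul(first_pass, c)               # 1-D inverse along rows


block = [[52, 55, 61, 66],
         [70, 61, 64, 73],
         [63, 59, 55, 90],
         [67, 61, 68, 104]]
recovered = idct_2d(dct_2d(block))
max_error = max(abs(recovered[r][c] - block[r][c])
                for r in range(4) for c in range(4))
```

Because the same 1-D routine serves both passes and every block size from 4 to 32, a hardware design can share one datapath across sizes, which is one common route to high utilization.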
A high-resolution stochastic time-to-digital converter (STDC) using an edge-interchange scheme is described. The proposed STDC provides higher resolution while consuming less power than previous STDCs of the same resolution. The limitation on the input phase difference caused by the arbiter and the edge-interchange circuit is analyzed. Simulation results show that the proposed design achieves a resolution of up to 0.3 ps while consuming only 1.7 mW. Furthermore, the higher the target resolution, the greater the power saving obtained by using the edge-interchange circuit.
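The basic principle of a stochastic TDC, in which many nominally identical arbiters with random offsets vote on which edge arrived first, can be sketched as follows. This is a generic behavioural model under stated assumptions: Gaussian arbiter offsets with a known standard deviation of 10 ps, noise-free decisions, and an unrealistically large arbiter count chosen so the estimate is statistically tight. The edge-interchange scheme itself is not modeled.

```python
import random
from statistics import NormalDist


def make_arbiter_offsets(n, sigma_ps, rng):
    """Random input-referred offset (ps) of each arbiter, fixed per chip."""
    return [rng.gauss(0.0, sigma_ps) for _ in range(n)]


def stdc_count(dt_ps, offsets_ps):
    """Number of arbiters deciding that the first edge led the second."""
    return sum(1 for off in offsets_ps if dt_ps > off)


def stdc_estimate(count, n, sigma_ps):
    """Map the vote count back to a time difference via the Gaussian CDF."""
    # Clamp away from 0 and 1, where the inverse CDF is undefined.
    p = min(max(count / n, 0.5 / n), 1.0 - 0.5 / n)
    return sigma_ps * NormalDist().inv_cdf(p)


rng = random.Random(1)
offsets = make_arbiter_offsets(20000, 10.0, rng)
count = stdc_count(2.0, offsets)  # true time difference: 2 ps
estimate_ps = stdc_estimate(count, len(offsets), 10.0)
```

The estimate is only accurate while the time difference stays within a few standard deviations of the offset distribution, which is one reason the usable input phase range of such a converter is limited.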