2015 Volume 12 Issue 10 Pages 20150314
The traditional post-TnL vertex cache (abbr. ‘post-VC’) in embedded GPUs (EGPUs) with only one vertex or unified shader does not fit to multi-shader EGPUs for two reasons. As multiple shaders run in parallelism, (a) the out-of-order vertex processing may raise the post-VC inconsistency that leads to cache the error data, and (b) it is very hard to detect in time which vertices are saved in the post-VC in the stage of vertex fetching, resulting in the low performance. In this paper, we propose a modified post-VC including a decoupling cache and a vertex batch in-order commit controller, which can guarantee that the data SRAM and index tag can be updated in-order according to the same replacement policy in the different stages of vertex processing. The function of the proposed post-VC is verified on a FPGA-based platform. Experimental results show that it increases the performance by an average of 172% and 80.6% compared to the EGPU without/with the traditional post-VC respectively at a little expense.