Recently proposed pipelined multithreading (PMT) techniques have shown great applicability to parallelizing general programs on multi-core processors. However, their potential performance is limited by large inter-core communication overheads, which become a bottleneck. This paper addresses this problem and presents a novel clustered pipelined multithreading (CPMT) technique that constructs efficient pipeline parallelism on commodity multi-core processors. The technique incorporates a clustered communication mechanism that greatly reduces average communication overheads (ACOs) using a software-only approach. We quantitatively demonstrate that the performance of CPMT improves as the ACOs are reduced, and we characterize its performance. We also describe the stage decomposition procedure and provide a stage execution framework that can execute multiple stages within one procedure. The effectiveness of the CPMT technique is evaluated on commodity AMD Phenom four-core processors. Experimental results show that CPMT achieves speedups ranging from 116.8% to 219.8% on some typical loops extracted from SPEC CPU 2000 benchmark programs.
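As a rough illustration of why clustering can lower the average communication overhead, the following minimal Python sketch runs a two-stage pipeline in which the first stage groups several values into one queue transfer. It assumes that "clustered communication" means amortizing a fixed per-transfer cost over a group of values; the abstract does not specify the mechanism, so the names `run_pipeline` and `cluster_size` and the stage bodies are hypothetical. Python threads do not run in parallel on multiple cores, so the sketch only counts transfers and does not measure speedup.

```python
import queue
import threading

SENTINEL = None  # marks the end of the stream between stages


def stage_one(items, out_q, cluster_size, transfers):
    """First stage: do some work, then send results in clusters."""
    cluster = []
    for item in items:
        cluster.append(item * 2)  # placeholder for stage-one work
        if len(cluster) == cluster_size:
            out_q.put(cluster)  # one inter-stage transfer for many values
            transfers[0] += 1
            cluster = []
    if cluster:  # flush a final, partially filled cluster
        out_q.put(cluster)
        transfers[0] += 1
    out_q.put(SENTINEL)


def stage_two(in_q, results):
    """Second stage: receive clusters and process each value."""
    while True:
        cluster = in_q.get()
        if cluster is SENTINEL:
            break
        for value in cluster:
            results.append(value + 1)  # placeholder for stage-two work


def run_pipeline(items, cluster_size):
    """Run both stages and return (results, number of data transfers)."""
    q = queue.Queue()
    results = []
    transfers = [0]
    t1 = threading.Thread(target=stage_one, args=(items, q, cluster_size, transfers))
    t2 = threading.Thread(target=stage_two, args=(q, results))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return results, transfers[0]


if __name__ == "__main__":
    for size in (1, 4):
        out, n = run_pipeline(range(10), size)
        print(f"cluster_size={size}: {n} transfers for {len(out)} values")
```

With ten values, a cluster size of 1 needs ten transfers while a cluster size of 4 needs three, so the fixed cost of each transfer is spread over more values. The results are identical in both cases because a single producer and a single consumer preserve order.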
Hypervolume is one of the best-known and most frequently used measures for comparing solution sets obtained by evolutionary multiobjective optimization (EMO) algorithms. It evaluates both the convergence of solutions to the Pareto front and their diversity. The main difficulty in using the hypervolume is that its computational load increases exponentially with the number of objectives. In this paper, we propose approximating the hypervolume of an obtained non-dominated solution set using a number of achievement scalarizing functions with uniformly distributed weight vectors. Each scalarizing function measures the distance from the reference point of the hypervolume to the attainment surface of the obtained non-dominated solution set along its own search direction. We examine the effect of the number of weight vectors on the approximation accuracy and the computational load. Through computational experiments, we show that the approximation accuracy improves as the number of weight vectors increases. We also show that our approach requires much less computational load than existing hypervolume calculation for many-objective problems.
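The distance-based idea can be sketched in a few lines of Python. The sketch below is not the paper's exact procedure and rests on several assumptions that the abstract does not state: objectives are maximized, every solution dominates the reference point, the directions are unit vectors sampled uniformly on the positive part of the sphere (rather than whatever weight-vector scheme the paper uses), and the distances are combined with the standard polar-coordinate volume identity. All function names are hypothetical.

```python
import math
import random


def distance_to_attainment_surface(points, ref, direction):
    """Distance from `ref` to the attainment surface along `direction`.

    Assumes maximization, points that dominate `ref`, and strictly
    positive direction components. The inner min is an achievement-style
    scalarizing function; the outer max picks the best solution for it.
    """
    return max(
        min((p_i - r_i) / w_i for p_i, r_i, w_i in zip(p, ref, direction))
        for p in points
    )


def sample_directions(n_dirs, n_obj, seed=0):
    """Unit vectors spread uniformly over the positive orthant of the sphere."""
    rng = random.Random(seed)
    dirs = []
    for _ in range(n_dirs):
        # Absolute values of Gaussians give a uniform direction in the
        # positive orthant; the tiny offset avoids division by zero.
        v = [abs(rng.gauss(0.0, 1.0)) + 1e-12 for _ in range(n_obj)]
        norm = math.sqrt(sum(x * x for x in v))
        dirs.append([x / norm for x in v])
    return dirs


def approx_hypervolume(points, ref, n_dirs=2000, seed=0):
    """Estimate the hypervolume from distances along sampled directions."""
    m = len(ref)
    dirs = sample_directions(n_dirs, m, seed)
    mean_dm = sum(
        distance_to_attainment_surface(points, ref, w) ** m for w in dirs
    ) / n_dirs
    # Polar-coordinate constant: volume of the unit ball restricted to
    # the positive orthant (pi/4 for m = 2, pi/6 for m = 3).
    const = math.pi ** (m / 2) / (m * 2 ** (m - 1) * math.gamma(m / 2))
    return const * mean_dm


if __name__ == "__main__":
    front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
    # The exact hypervolume of this front with reference point (0, 0) is 6.
    print(approx_hypervolume(front, (0.0, 0.0)))
```

The cost grows linearly with the number of directions and the number of solutions, and the estimate tightens as more directions are used, which mirrors the accuracy versus computational load trade-off described above.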