Abstract
Execution performance is critical for large-scale, data-intensive workflows. This paper proposes DISWOP, a novel scheduling algorithm for data-intensive workflow optimization. It consists of three main steps: workflow process generation, task and resource mapping, and task clustering. To evaluate the effectiveness and efficiency of DISWOP, a comparative evaluation of different workflows is conducted on a prototype workflow platform. The results show that DISWOP speeds up workflow execution by a factor of about 1.6 to 2.3, depending on the task scale.