Abstract
In this paper, we propose an MPI-based framework and library for geospatial vector data processing that performs efficient load balancing in heterogeneous distributed systems and hides MPI programming from the user. Our experimental results show that the proposed framework is up to 1.63 times faster than MR4C, which relies on Hadoop YARN for load balancing. In addition, geospatial vector data processing written with our library, which hides the MPI programming, requires fewer code steps than the equivalent MR4C implementation.