IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
Reducing I/O Cost in OLAP Query Processing with MapReduce
Woo-Lam KANGHyeon-Gyu KIMYoon-Joon LEE
Author information
JOURNAL FREE ACCESS

2015 Volume E98.D Issue 2 Pages 444-447

Details
Abstract
This paper presents a method to reduce I/O cost in MapReduce when online analytical processing (OLAP) queries are used for data analysis. The proposed method consists of two basic ideas. First, to reduce network transmission cost, mappers are organized to receive only data necessary to perform a map task, not an entire set of input data. Second, to reduce storage consumption, only record IDs are stored for checkpointing, not the raw records. Experiments conducted with TPC-H benchmark show that the proposed method is about 40% faster than Hive, the well-known data warehouse solution for MapReduce, while reducing the size of data stored for checkpoining to about 80%.
Content from these authors
© 2015 The Institute of Electronics, Information and Communication Engineers
Previous article Next article
feedback
Top