In this paper, we introduce our software for processing huge amounts of data, written in the Java language.
Our software provides aggregation methods and a predictive simulation method that run on a Hadoop cluster.
The Apache Hadoop software library is a framework for the parallel and distributed processing of huge data sets on clusters of computers. MapReduce is a programming model designed for processing large volumes of data in parallel by dividing the work into a set of independent tasks. Apache Hadoop MapReduce is an implementation of MapReduce for Apache Hadoop.
With the increase in data size, we need to analyze huge amounts of data, an analysis that was not feasible until now. It becomes possible with big data technologies such as Apache Hadoop, and Apache Hadoop MapReduce applications can be developed in the Java language. Our software performs parallel and distributed processing of such huge amounts of data. First, we show a simple example of a MapReduce application to explain the MapReduce model and the key-value store. Next, we introduce the methods of our software, namely aggregation and predictive simulation, applied to data sets of the Joint Association Study Group of Management Science.
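As a rough illustration of the kind of MapReduce application referred to above (not the paper's own code), the following minimal sketch in Java shows the standard Hadoop word-count pattern: the mapper emits (word, 1) key-value pairs, and the reducer aggregates the values for each key. Class and job names here are illustrative assumptions.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits a (word, 1) key-value pair for every word in an input line.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer: sums the counts collected for each word key.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

The independent map tasks run in parallel across the cluster nodes, and the framework groups the emitted key-value pairs by key before the reduce phase, which is what allows such a job to scale to huge data sets.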