p. 55-58
Correct data collection and reasonably timely data processing are essential in Big Data analysis, and interpreting the analyzed results is a further challenge. Although many sophisticated data mining techniques are available, they cannot be applied directly to process mining because the input data formats differ. Data mining techniques focus on the relations between attributes without considering the process and format data as row-based instances, whereas process mining finds flow patterns among instances and formats data as column-based instances in the MXML/XES format. In the present study, we used a sequential dataset to enable the use of more advanced statistical methods and to open process analysis to many sophisticated data mining techniques. We experimented on artificial data to calculate the probability distribution of activities using machine learning and the probability density function (PDF).
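The idea of turning an event log into a sequential dataset and deriving an activity probability distribution can be illustrated with a minimal sketch. It is not the method used in the study: the toy log, the field layout (case ID, activity, timestamp), and the function names `to_sequences` and `activity_distribution` are all assumptions made for illustration, and plain relative frequencies stand in for the machine learning and PDF estimation mentioned above.

```python
from collections import Counter, defaultdict

# Hypothetical toy event log: (case_id, activity, timestamp) rows.
# Real logs would typically be parsed from MXML/XES files instead.
EVENT_LOG = [
    ("c1", "register", 1),
    ("c1", "review", 2),
    ("c1", "approve", 3),
    ("c2", "register", 1),
    ("c2", "review", 2),
    ("c2", "reject", 4),
    ("c3", "register", 2),
    ("c3", "approve", 5),
]


def to_sequences(event_log):
    """Group events by case and order each case's activities by timestamp."""
    by_case = defaultdict(list)
    for case_id, activity, timestamp in event_log:
        by_case[case_id].append((timestamp, activity))
    return {
        case_id: [activity for _, activity in sorted(events)]
        for case_id, events in by_case.items()
    }


def activity_distribution(sequences):
    """Estimate each activity's probability as its relative frequency."""
    counts = Counter(a for seq in sequences.values() for a in seq)
    total = sum(counts.values())
    return {activity: n / total for activity, n in counts.items()}


sequences = to_sequences(EVENT_LOG)
distribution = activity_distribution(sequences)

for activity, probability in sorted(distribution.items()):
    print(f"{activity}: {probability:.3f}")
```

Once each case is a plain ordered sequence, it can be passed to general-purpose statistical or data mining tools that do not understand MXML/XES, which is the motivation stated above.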