Journal of Information Processing
Online ISSN : 1882-6652
ISSN-L : 1882-6652
Recommendation of Imputing Value for Sensor Data based on Programming by Example
Hiroko Nagashima, Yuka Kato
JOURNAL FREE ACCESS

2020 Volume 28 Pages 102-111

Abstract

Large volumes of data are typically used during analyses. Data preprocessing, which involves detecting outliers, handling missing data, data formatting, integration, and normalization, is essential for achieving accurate results. Many tools and methods are available for reducing preprocessing time. However, most analysts face difficulties when using them. This paper proposes a method for handling outliers and missing data, called Automated PRE-Processing for Sensor Data (APREP-S). To reduce the resources needed for analysis, we combine programming by example with machine learning via Bayesian inference: human knowledge is input to APREP-S as an example, and a proper proportion is calculated by Bayesian inference. We also use k-Shape to calculate the rate of similarity of time-series data. In the evaluation, we use temperature and humidity sensor data and compare four methods by the sum of squared errors between the original data and the output of each method: (1) APREP-S, (2) the mean of the entire data, (3) the mean of the data around the imputation target, and (4) spline interpolation. The results verify that APREP-S is a more suitable preprocessing method for humidity data than for temperature data. We consider the reason to be that humidity data have more changing points.
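
The evaluation criterion described in the abstract can be illustrated with a minimal sketch. The code below is not APREP-S and is not taken from the paper; it only shows, under stated assumptions, how two of the baseline imputations (the mean of the entire data and the mean of the data around the target) might be scored by the sum of squared errors against the original values. The function names, the `window` parameter, and the sample readings are all hypothetical.

```python
from statistics import mean


def sse(original, imputed, missing_idx):
    """Sum of squared errors over the positions that were imputed."""
    return sum((original[i] - imputed[i]) ** 2 for i in missing_idx)


def impute_global_mean(series):
    """Baseline: fill every gap with the mean of all observed values."""
    fill = mean(v for v in series if v is not None)
    return [fill if v is None else v for v in series]


def impute_local_mean(series, window=2):
    """Baseline: fill each gap with the mean of observed values
    within `window` steps of the gap (window size is an assumption)."""
    global_fill = mean(v for v in series if v is not None)
    out = list(series)
    for i, v in enumerate(series):
        if v is None:
            lo, hi = max(0, i - window), min(len(series), i + window + 1)
            neighbors = [x for x in series[lo:hi] if x is not None]
            # Fall back to the global mean if no neighbor was observed.
            out[i] = mean(neighbors) if neighbors else global_fill
    return out


# Hypothetical humidity readings (%); not data from the paper.
original = [40.0, 42.0, 45.0, 50.0, 48.0, 44.0, 41.0, 40.0]
missing_idx = [3, 5]
# Simulate missing data by hiding the values at the chosen positions.
observed = [None if i in missing_idx else v for i, v in enumerate(original)]

sse_global = sse(original, impute_global_mean(observed), missing_idx)
sse_local = sse(original, impute_local_mean(observed), missing_idx)
print(f"global mean SSE: {sse_global:.2f}, local mean SSE: {sse_local:.2f}")
```

A lower sum of squared errors means the imputed values lie closer to the values that were hidden. Spline interpolation and APREP-S itself could be scored with the same `sse` helper.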

© 2020 by the Information Processing Society of Japan