Recently, we can easily have huge database with the development of computer network. Accordingly, it becomes difficult for users to extract knowledge from the database. In this paper, we focus on data mining, especially classification. In the real-world data mining, missing value problem is happened, for example, speech containing noises, facial occlusions, and so on. When the test sample have missing values, classification systems can not classify that. In previous studies, various imputation methods have been developed. Previous imputation methods were developed to solve the missing value problem with lots of explanatory variable, even if some explanatory variables are ineffective for imputation. It has been said that using lots of variable deteriorates in learning efficiency, thus we believe that imputation methods should be developed considering relations among explanatory variables. Moreover, it is effective considering not only relations among explanatory variables but also between the test sample and each of the training sample. Therefore we propose the imputation method by using Bayesian network with weighted learning. Through the experiments, we could confirm that the proposed method imputed missing values with approximate values, and a classification system successfully classified the test sample, in which missing values were imputed by the proposed method, in comparison with some conventional methods.
抄録全体を表示