Learning a Classifier from Unlabeled Data Using Ensemble Nearest-Neighbor Distances

松本 瑞季; 鷲尾 隆

doi:10.11517/pjsai.JSAI2020.0_4J2GS201

Abstract

Most instances in big data are unlabeled, and the number of available labeled instances is often so limited that semi-supervised learning approaches cannot be applied effectively. This is one of the main obstacles to the effective use of big data. In this study, we propose a novel approach that efficiently learns an accurate binary classifier from only two given unlabeled data sets. The approach classifies a given instance based on the ensemble difference between its nearest-neighbor distances in the two unlabeled data sets. Owing to its mathematical foundation, it provides consistent classification results within constant computation time. Numerical experiments show that the approach achieves accuracies close to the upper bounds determined by the Bayes error rates.
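The abstract does not give the algorithm in detail, so the following is only a minimal sketch of one way to read "ensemble difference between nearest-neighbor distances". It assumes that the two unlabeled sets have different class proportions, that each ensemble member draws a small random subsample from each set, and that the sign of the averaged distance difference decides the class. The function names, `n_members`, `subsample_size`, and the zero threshold are all illustrative assumptions, not the authors' specification.

```python
import numpy as np


def ensemble_nn_distance_score(x, set_a, set_b, n_members=50,
                               subsample_size=16, seed=0):
    """Average over an ensemble of (NN distance in B) - (NN distance in A).

    Assumption: each member uses a fresh random subsample of each
    unlabeled set; this is a hypothetical reading of the abstract.
    """
    rng = np.random.default_rng(seed)
    size_a = min(subsample_size, len(set_a))
    size_b = min(subsample_size, len(set_b))
    diffs = np.empty(n_members)
    for m in range(n_members):
        # Draw small subsamples without replacement from each unlabeled set.
        sub_a = set_a[rng.choice(len(set_a), size=size_a, replace=False)]
        sub_b = set_b[rng.choice(len(set_b), size=size_b, replace=False)]
        # Euclidean nearest-neighbor distance from x to each subsample.
        d_a = np.min(np.linalg.norm(sub_a - x, axis=1))
        d_b = np.min(np.linalg.norm(sub_b - x, axis=1))
        diffs[m] = d_b - d_a
    return diffs.mean()


def classify(x, set_a, set_b, **kwargs):
    """Return +1 if x looks more like the class over-represented in set_a,
    otherwise -1 (assumed decision rule: threshold the score at zero)."""
    score = ensemble_nn_distance_score(x, set_a, set_b, **kwargs)
    return 1 if score > 0 else -1
```

In this sketch the per-query cost depends only on `n_members` and `subsample_size`, not on the sizes of the full data sets, which is one plausible interpretation of the constant computation time mentioned in the abstract.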
