Total Quality Science
Online ISSN : 2189-3195
ISSN-L : 2189-3195
An Extension of Semi-supervised Boosting to Multi-valued Classification Problems
Yuta Sakai, Kazuki Yasui, Kenta Mikawa, Masayuki Goto

2021 Volume 6 Issue 2 Pages 60-69

Abstract

Generally, statistical learning with sufficient training data enables highly accurate classification. However, it is sometimes difficult to collect enough training data to construct an accurate classifier. In particular, classification problems require a correct label for each feature vector. Therefore, semi-supervised learning, which uses not only labeled training data but also a large amount of unlabeled data to acquire an accurate classifier, has recently received attention. In a semi-supervised learning setting, if the distribution of the labeled data is biased within each category, it is difficult to estimate the correct labels for the unlabeled data. Consequently, classification accuracy is degraded by the incorrect labeling. SemiBoost is a semi-supervised learning method that avoids this problem and achieves high performance. However, it is a binary classification method and cannot be extended directly to multi-valued classification problems. In this paper, we propose a method that extends SemiBoost to multi-valued classification by incorporating the concept of the error-correcting output code (ECOC) method. The proposed method enables more accurate labeling of unlabeled data. To verify its effectiveness, we conducted simulation experiments using data from the UCI Machine Learning Repository. The experimental results showed that the proposed method is effective for biased data. In addition, the classification results obtained when the ratio of biased data was varied are shown and discussed.
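
To make the general ECOC idea mentioned in the abstract concrete, the following is a minimal Python sketch of how a binary classifier can be lifted to a multi-valued one: each class is assigned a codeword of +1/-1 bits, one binary classifier is trained per bit, and a new point is assigned to the class whose codeword is closest in Hamming distance to the classifier outputs. This is not the authors' method. The binary learner here is a nearest-centroid classifier used purely as a hypothetical stand-in (not SemiBoost), the code matrix is an assumed one-vs-rest design, and the toy data and all function names are illustrative assumptions.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
BinaryClassifier = Callable[[Vector], int]  # returns +1 or -1


def train_centroid_classifier(X: List[Vector], y: List[int]) -> BinaryClassifier:
    """Hypothetical stand-in binary learner: nearest class centroid.

    Assumes labels are +1/-1 and that both labels occur in y.
    """
    def centroid(label: int) -> List[float]:
        rows = [x for x, t in zip(X, y) if t == label]
        return [sum(col) / len(rows) for col in zip(*rows)]

    pos, neg = centroid(+1), centroid(-1)

    def predict(x: Vector) -> int:
        d_pos = sum((a - b) ** 2 for a, b in zip(x, pos))
        d_neg = sum((a - b) ** 2 for a, b in zip(x, neg))
        return 1 if d_pos <= d_neg else -1

    return predict


def ecoc_fit(X: List[Vector], labels: List[int],
             code_matrix: List[List[int]]) -> List[BinaryClassifier]:
    """Train one binary classifier per column (bit) of the code matrix."""
    n_bits = len(code_matrix[0])
    classifiers = []
    for bit in range(n_bits):
        # Relabel every sample with the bit its class has in this column.
        y_bin = [code_matrix[c][bit] for c in labels]
        classifiers.append(train_centroid_classifier(X, y_bin))
    return classifiers


def ecoc_predict(x: Vector, classifiers: List[BinaryClassifier],
                 code_matrix: List[List[int]]) -> int:
    """Return the class whose codeword is nearest in Hamming distance."""
    outputs = [clf(x) for clf in classifiers]
    distances = [sum(o != b for o, b in zip(outputs, row)) for row in code_matrix]
    return distances.index(min(distances))


# Assumed one-vs-rest code matrix for three classes (rows = class codewords).
code_matrix = [
    [+1, -1, -1],
    [-1, +1, -1],
    [-1, -1, +1],
]

# Toy data invented for illustration: three well-separated 2-D clusters.
X = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0), (10, 1)]
labels = [0, 0, 1, 1, 2, 2]

classifiers = ecoc_fit(X, labels, code_matrix)
predictions = [ecoc_predict(p, classifiers, code_matrix)
               for p in [(0.2, 0.3), (5, 5.2), (9.5, 0.5)]]
print(predictions)
```

In a semi-supervised variant, the stand-in learner would be replaced by a binary method that also consumes unlabeled data; how the paper does this with SemiBoost is described in the full text, not in this sketch.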

© 2021 The Japanese Society for Quality Control