Journal of Japan Society for Fuzzy Theory and Intelligent Informatics
Online ISSN : 1881-7203
Print ISSN : 1347-7986
ISSN-L : 1347-7986
Original Papers
Fuzzy c-Means Classifier for Large Scale Data
Hidetomo ICHIHASHI, Akira NOTSU, Katsuhiro HONDA

2010 Volume 22 Issue 6 Pages 792-803

Abstract
This paper discusses the application of the fuzzy c-means (FCM) based classifier to large-scale data sets. The first type of large-scale data set is one that contains a huge number of samples (patterns). The number of samples can be reduced by sampling, but doing so may degrade the classifier's accuracy on the test set, and it worsens the accuracy on the available data. The FCM classifier uses covariance matrices whose size does not increase with the number of training samples, and its training time is proportional to the number of samples. In a comparison with the support vector machine (SVM) classifier, which is known as one of the highest-performing classifiers, the paper shows that the FCM classifier nearly attains the accuracy of the SVM and surpasses it in both training time and testing time. If the feature dimension of the samples is relatively small, or if it can be reduced by principal component analysis (PCA), the training of the FCM classifier converges in a short time. However, if the feature dimension is too large, the covariance matrices cannot be stored in computer memory and the computation becomes infeasible. The paper therefore proposes a modified algorithm to cope with high-dimensional feature data. As an example, a subset of the COREL image database is used to compare its performance with that of the approach that compresses the data set with PCA.
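The two scaling claims in the abstract can be illustrated with a minimal sketch. It is not the authors' FCM classifier: it only shows that a covariance matrix has a fixed d x d shape however many samples are used, and that storing one dense matrix per cluster grows with the square of the feature dimension. The function names, the cluster count of 4, the dimensions 10 and 100000, and the 8-byte value size are all assumptions chosen for illustration.

```python
import random


def covariance_matrix(samples):
    """Return the d x d sample covariance matrix of a list of d-dimensional rows."""
    n = len(samples)
    d = len(samples[0])
    mean = [sum(row[j] for row in samples) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for row in samples:
        centered = [row[j] - mean[j] for j in range(d)]
        for j in range(d):
            for k in range(d):
                cov[j][k] += centered[j] * centered[k]
    # Unbiased estimate: divide the accumulated sums by n - 1.
    return [[value / (n - 1) for value in cov_row] for cov_row in cov]


def covariance_storage_bytes(dim, n_clusters, bytes_per_value=8):
    """Bytes needed for one dense dim x dim matrix per cluster (hypothetical helper)."""
    return n_clusters * dim * dim * bytes_per_value


random.seed(0)
# Same feature dimension (3), very different sample counts.
small = [[random.gauss(0, 1) for _ in range(3)] for _ in range(100)]
large = [[random.gauss(0, 1) for _ in range(3)] for _ in range(10000)]

cov_small = covariance_matrix(small)
cov_large = covariance_matrix(large)
shape_small = (len(cov_small), len(cov_small[0]))
shape_large = (len(cov_large), len(cov_large[0]))
print(shape_small, shape_large)  # both are (3, 3)

# Storage depends on the feature dimension, not on the sample count.
print(covariance_storage_bytes(10, 4))      # 3200 bytes
print(covariance_storage_bytes(100000, 4))  # 320000000000 bytes, about 320 GB
```

Under these assumed numbers, the loop over samples is a single pass, which is consistent with training cost growing linearly in the number of samples, while the quadratic storage term is what makes a very large feature dimension impractical without dimension reduction or a modified algorithm.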
© 2010 Japan Society for Fuzzy Theory and Intelligent Informatics