Performance Comparison of a Fuzzy c-Means Classifier with Many Parameters by Number of Training Samples

市橋 秀友; 本多 克宏; 野津 亮

doi:10.3156/jsoft.23.254

Abstract

The fuzzy c-means based classifier (FCMC) is a classifier built on a clustering approach. Its classification accuracy on the training set can easily be improved by increasing the number of clusters. The accuracy on the test set (i.e., the generalization capability), however, does not necessarily improve as the number of clusters grows. Especially when the number of training samples is relatively small, the classifier not only overfits the data but also estimates the covariance matrices and cluster centers poorly, because each cluster then contains few samples. Hence, the test set accuracy deteriorates. The performance of FCMC with 2 clusters per class on fewer than 1000 training samples has already been reported. This paper reports the scaling behavior of FCMC by testing it with training sets of various sizes. The number of clusters in FCMC is increased up to 8. This number is not very large, but the FCMC considered in this paper has many parameters. LibSVM is a widely known, state-of-the-art SVM tool for large data sets. The test set classification accuracy, training time, and testing time (i.e., detection time) of FCMC are compared with those of LibSVM as the number of training samples is varied.
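
The comparison protocol described in the abstract (test accuracy, training time, and testing time recorded for several training set sizes) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the synthetic two-class data, the function names `make_data` and `scaling_experiment`, and the `NearestCentroid` stand-in classifier, which is neither FCMC nor LibSVM. It only shows how the three quantities might be collected per training size.

```python
import random
import time


def make_data(n, seed):
    """Hypothetical synthetic data: two 2-D Gaussian blobs, labels alternate 0/1."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 3.0
        data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return data


class NearestCentroid:
    """Stand-in classifier (NOT FCMC or LibSVM): one centroid per class."""

    def fit(self, samples):
        sums, counts = {}, {}
        for (x, y), label in samples:
            sx, sy = sums.get(label, (0.0, 0.0))
            sums[label] = (sx + x, sy + y)
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {
            c: (s[0] / counts[c], s[1] / counts[c]) for c, s in sums.items()
        }
        return self

    def predict(self, point):
        x, y = point
        # Return the class whose centroid is closest in squared Euclidean distance.
        return min(
            self.centroids,
            key=lambda c: (x - self.centroids[c][0]) ** 2
            + (y - self.centroids[c][1]) ** 2,
        )


def scaling_experiment(train_sizes, n_test=500, seed=0):
    """Record accuracy, training time, and testing time for each training size."""
    test = make_data(n_test, seed + 1)
    pool = make_data(max(train_sizes), seed)
    rows = []
    for n in train_sizes:
        start = time.perf_counter()
        model = NearestCentroid().fit(pool[:n])
        train_time = time.perf_counter() - start

        start = time.perf_counter()
        correct = sum(model.predict(p) == label for p, label in test)
        test_time = time.perf_counter() - start

        rows.append(
            {
                "n_train": n,
                "accuracy": correct / n_test,
                "train_time": train_time,
                "test_time": test_time,
            }
        )
    return rows


if __name__ == "__main__":
    for row in scaling_experiment([10, 100, 1000]):
        print(row)
```

Replacing the stand-in with an actual FCMC or SVM implementation would leave the timing and accuracy bookkeeping unchanged; the same held-out test set is reused for every training size so that the accuracies are comparable.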
