Proceedings of the Fuzzy System Symposium
25th Fuzzy System Symposium
Session ID : 2E1-02

Text Document Classification by Fuzzy PCA-guided Robust k-Means
*Katsuhiro Honda, Tomohiro Matsui, Akira Notsu, Hidetomo Ichihashi
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

Text document classification is a fundamental technique for text analysis tasks such as e-mail filtering and patent retrieval. In this research, fuzzy PCA-based robust k-Means is applied to the extraction of document clusters so that each cluster core contains mutually related documents, ignoring the effect of noise documents. After the documents are quantified by calculating tf-idf weights of frequently used words, fuzzy PCA is performed to construct a connectivity matrix composed of the connectivity degrees among documents. Cluster structures are then intuitively recognized by re-ordering the documents according to their responsibility (the degree of non-noise level).
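
The tf-idf quantification step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which tf-idf variant, tokenizer, or vocabulary size the authors used, so everything below is an assumption: term frequency is normalized by document length, idf is log(N / df), "frequently used words" is read as the words with the highest document frequency, and the function name `tfidf_weights` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(docs, vocab_size=None):
    """Return (vocab, matrix) of tf-idf weights for tokenized documents.

    Assumed variant (not taken from the paper):
    tf = count / document length, idf = log(N / document frequency).
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each word appears.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    # Keep the most frequently used words (highest document frequency);
    # ties are broken alphabetically so the result is deterministic.
    vocab = sorted(df, key=lambda w: (-df[w], w))
    if vocab_size is not None:
        vocab = vocab[:vocab_size]

    matrix = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        row = [(counts[w] / total) * math.log(n_docs / df[w]) for w in vocab]
        matrix.append(row)
    return vocab, matrix


# Hypothetical toy corpus, tokenized by whitespace.
docs = [
    "fuzzy clustering of text documents".split(),
    "robust clustering ignores noise documents".split(),
    "patent retrieval and e-mail filtering".split(),
]
vocab, weights = tfidf_weights(docs, vocab_size=5)
```

Each row of `weights` is the feature vector of one document. A matrix of this kind would be the input to the fuzzy PCA step, which this sketch does not attempt to reproduce.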

© 2009 Japan Society for Fuzzy Theory and Intelligent Informatics