Document map construction and keyword selection based on local PCA

Hideki Wada; Katsuhiro Honda; Akira Notsu; Hidetomo Ichihashi

doi:10.14864/softscis.2008.0.942.0

SCIS & ISIS 2008

Session ID : FR-E3-2

Conference information

Host: Japan Society for Fuzzy Theory and Intelligent Informatics

Keywords: cluster analysis, regression analysis, Kansei information

Abstract

Document map construction is a useful approach to intuitive text mining, in which the mutual relations among text documents, each composed of many keywords, are characterized in a 2-D map. Usually, text documents are first converted into numerical weights, such as tf-idf weights that combine term frequency and inverse document frequency, and then a dimension reduction technique, such as principal component analysis (PCA), is applied to construct low-dimensional plots of the multivariate data. This paper considers a variable selection mechanism based on linear fuzzy clustering for selecting keywords that are useful for characterizing documents, used in conjunction with document clustering for extracting multiple linear sub-structures. In this approach, meaningful keywords are selected in each cluster (linear sub-structure), and the mutual relations among documents are represented in simple linear sub-spaces.
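
The conventional pipeline mentioned in the abstract (tf-idf weighting followed by PCA onto a 2-D map) can be illustrated with a minimal sketch. It covers only that baseline, not the linear fuzzy clustering and keyword selection proposed in the paper. Everything here is an assumption made for illustration: the function names `tfidf_matrix`, `top_components`, and `document_map` are hypothetical, the idf variant `log(N / df)` is one common choice among several, the four toy documents are invented, and PCA is computed by plain power iteration so that the example needs no third-party libraries.

```python
import math
from collections import Counter


def tfidf_matrix(docs):
    """Return (vocab, rows); rows[i][j] is the tf-idf weight of term j in doc i.

    Assumption: raw term count times log(N / df), one common tf-idf variant.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    rows = []
    for toks in tokenized:
        tf = Counter(toks)
        rows.append([tf[t] * math.log(n_docs / df[t]) for t in vocab])
    return vocab, rows


def top_components(X, k=2, iters=500):
    """Top-k principal axes of mean-centered rows X (power iteration)."""
    n, d = len(X), len(X[0])
    # Covariance matrix of the centered data (d x d).
    C = [[sum(X[r][i] * X[r][j] for r in range(n)) / n for j in range(d)]
         for i in range(d)]
    axes = []
    for _ in range(k):
        v = [1.0 + 0.1 * i for i in range(d)]  # deterministic start vector
        for _ in range(iters):
            w = [sum(C[i][j] * v[j] for j in range(d)) for i in range(d)]
            # Keep the iterate orthogonal to the axes already found.
            for a in axes:
                dot = sum(wi * ai for wi, ai in zip(w, a))
                w = [wi - dot * ai for wi, ai in zip(w, a)]
            norm = math.sqrt(sum(wi * wi for wi in w))
            if norm < 1e-12:  # data has fewer than k directions of variance
                break
            v = [wi / norm for wi in w]
        axes.append(v)
    return axes


def document_map(docs):
    """Project tf-idf vectors of docs onto their first two principal axes."""
    vocab, W = tfidf_matrix(docs)
    n, d = len(W), len(vocab)
    mean = [sum(row[j] for row in W) / n for j in range(d)]
    X = [[row[j] - mean[j] for j in range(d)] for row in W]
    axes = top_components(X, k=2)
    coords = [[sum(x * a for x, a in zip(row, axis)) for axis in axes]
              for row in X]
    return vocab, axes, coords


# Toy corpus, invented for this sketch.
docs = [
    "fuzzy clustering of text data",
    "fuzzy clustering of image data",
    "principal component analysis of text data",
    "principal component analysis of image data",
]
vocab, axes, coords = document_map(docs)
for doc, (x, y) in zip(docs, coords):
    print(f"{x:+.3f} {y:+.3f}  {doc}")
```

In this toy corpus the words "of" and "data" occur in every document, so their tf-idf weight is zero and they play no part in the map. A single global PCA like this one yields one linear sub-space for all documents, whereas the approach described in the abstract extracts several local linear sub-structures, one per cluster, and selects keywords within each.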
