Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Ensemble Document Clustering Using Weighted Hypergraph Generated by NMF
HIROYUKI SHINNOU, MINORU SASAKI

2007 Volume 14 Issue 5 Pages 107-122

Abstract
In this paper, we propose a new ensemble clustering method using Non-negative Matrix Factorization (NMF).
NMF is a dimensionality reduction method that is effective for high-dimensional, sparse data such as document data. However, the result of NMF depends on the initial values of its iteration. The standard countermeasure is to generate multiple clustering results from different initial values and then select the result judged best by the NMF decomposition error. This selection does not work well, because the NMF decomposition error does not always reflect the accuracy of the clustering.
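The standard countermeasure described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the widely used multiplicative update rules for the Frobenius-norm objective (the abstract does not name an update rule), and the function names `nmf` and `cluster_labels`, the toy matrix, and the seeds are all hypothetical.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def nmf(x, k, seed, iters=200, eps=1e-9):
    """Approximate x (terms x docs) by u * v^T with non-negative u and v.

    Assumption: multiplicative updates for the Frobenius error.
    The random seed fixes the initial values of u and v.
    """
    rng = random.Random(seed)
    m, n = len(x), len(x[0])
    u = [[rng.random() + eps for _ in range(k)] for _ in range(m)]
    v = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    xt = transpose(x)
    for _ in range(iters):
        # u <- u * (x v) / (u v^T v), element-wise
        xv = matmul(x, v)
        uvtv = matmul(u, matmul(transpose(v), v))
        u = [[u[i][j] * xv[i][j] / (uvtv[i][j] + eps) for j in range(k)]
             for i in range(m)]
        # v <- v * (x^T u) / (v u^T u), element-wise
        xtu = matmul(xt, u)
        vutu = matmul(v, matmul(transpose(u), u))
        v = [[v[i][j] * xtu[i][j] / (vutu[i][j] + eps) for j in range(k)]
             for i in range(n)]
    approx = matmul(u, transpose(v))
    # Decomposition error: Frobenius norm of x - u v^T.
    err = sum((x[i][j] - approx[i][j]) ** 2
              for i in range(m) for j in range(n)) ** 0.5
    return u, v, err

def cluster_labels(v):
    """Assign each document to the cluster with its largest weight in v."""
    return [max(range(len(row)), key=row.__getitem__) for row in v]

# Toy term-document matrix (rows: terms, columns: documents), invented here.
x = [[2.0, 1.0, 0.0, 0.0],
     [1.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, 2.0, 1.0],
     [0.0, 0.0, 1.0, 2.0]]

# One NMF run per random initialization; keep the run with the smallest error.
runs = [nmf(x, k=2, seed=s) for s in range(5)]
best_u, best_v, best_err = min(runs, key=lambda run: run[2])
print(cluster_labels(best_v), round(best_err, 3))
```

The selection step uses only the decomposition error, which is exactly the criterion the abstract says may fail to track clustering accuracy.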
To improve the clustering result of NMF, we propose a new ensemble clustering method. Our method generates multiple clustering results through random initialization of NMF and integrates them through a weighted hypergraph, which can be constructed directly from the result of NMF, instead of the traditional binary hypergraph.
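The difference between the two hypergraphs can be illustrated as follows. This is a sketch under stated assumptions, not the construction given in the paper: it assumes each NMF run yields a non-negative document-by-cluster matrix, that the binary hypergraph marks each document's largest-weight cluster with 1, and that the weighted hypergraph keeps each document's weights normalized to sum to 1 per run. The function names and the toy weights are hypothetical.

```python
def argmax(row):
    """Index of the largest entry in a list."""
    return max(range(len(row)), key=row.__getitem__)

def binary_hypergraph(vs):
    """One 0/1 column per (run, cluster); 1 marks the document's hard cluster."""
    rows = []
    for i in range(len(vs[0])):
        row = []
        for v in vs:
            best = argmax(v[i])
            row.extend(1.0 if j == best else 0.0 for j in range(len(v[i])))
        rows.append(row)
    return rows

def weighted_hypergraph(vs):
    """Same layout, but entries are the document's normalized NMF weights.

    Assumption: per-run row normalization; the abstract does not specify it.
    """
    rows = []
    for i in range(len(vs[0])):
        row = []
        for v in vs:
            total = sum(v[i]) or 1.0  # guard against an all-zero row
            row.extend(w / total for w in v[i])
        rows.append(row)
    return rows

# Two hypothetical NMF runs over two documents and two clusters.
vs = [
    [[3.0, 1.0], [0.5, 1.5]],  # run 0: document-by-cluster weights
    [[0.2, 0.8], [4.0, 1.0]],  # run 1
]
h_binary = binary_hypergraph(vs)
h_weighted = weighted_hypergraph(vs)
print(h_binary)
print(h_weighted)
```

Each row of either matrix is a new representation of one document across all runs. The weighted version retains how strongly a document belongs to each cluster, information the 0/1 version discards. The final step of partitioning these rows into clusters is not described in the abstract and is left out of this sketch.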
In the experiments, we compared k-means, NMF, the ensemble method using the standard hypergraph, and the ensemble method using the weighted hypergraph (our method). Our method achieved the best results.
© The Association for Natural Language Processing