Journal of Japan Society for Fuzzy Theory and Intelligent Informatics
Online ISSN : 1881-7203
Print ISSN : 1347-7986
ISSN-L : 1347-7986
Original Papers
Retrieval of Personal Web Pages Based on Web Page Clustering
Takahiro HAYASHI, Syo KATAHIRA, Atsushi INUZUKA, Rikio ONAI

2006 Volume 18 Issue 2 Pages 161-172

Abstract
This paper proposes and evaluates a method for extracting personal web pages from a large number of unclassified web pages. The method can serve as a content filter for reputation searches. To extract personal pages from unclassified pages, the method focuses on four kinds of text features that appear on personal pages. It quantitatively measures these features for each page and, based on the measurements, divides the pages into multiple groups using k-means clustering. From these groups, the method then identifies the groups that consist of personal web pages. We evaluated the search performance of the method by measuring precision. Experimental results show that the average performance of the method is 2.1 times higher than that of a keyword-based search engine.
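
The pipeline described in the abstract (measure four features per page, cluster with k-means, keep the clusters made up of personal pages) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the four features or say how personal-page groups are identified, so the feature scores, the page names, the farthest-first initialization, and the centroid-threshold selection rule are all hypothetical.

```python
import math


def farthest_first(points, k):
    """Deterministic initialization (an assumption, chosen for reproducibility):
    start from the first point, then repeatedly add the point farthest
    from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Plain k-means: assign each point to its nearest centroid, then
    recompute each centroid as the mean of its members, until stable."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments no longer change
        labels = new_labels
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels, centroids


def personal_clusters(centroids, threshold=0.5):
    """Hypothetical selection rule: treat a cluster as 'personal' when the
    mean of its centroid's feature scores exceeds a threshold."""
    return [
        c for c, centroid in enumerate(centroids)
        if sum(centroid) / len(centroid) > threshold
    ]


# Hypothetical pages, each scored in [0, 1] on four unnamed text features.
pages = {
    "page_a": (0.9, 0.8, 0.7, 0.9),
    "page_b": (0.8, 0.9, 0.8, 0.7),
    "page_c": (0.1, 0.2, 0.1, 0.0),
    "page_d": (0.2, 0.1, 0.0, 0.1),
    "page_e": (0.7, 0.8, 0.9, 0.8),
}

names = list(pages)
labels, centroids = kmeans([pages[n] for n in names], k=2)
selected = personal_clusters(centroids)
retrieved = sorted(n for n, label in zip(names, labels) if label in selected)
print(retrieved)
```

In this toy data the three high-scoring pages fall into one cluster and are the ones retrieved. With real data, the feature scores would need to be on comparable scales before clustering, since k-means relies on Euclidean distance.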
© 2006 Japan Society for Fuzzy Theory and Intelligent Informatics