Host: Japan Society for Fuzzy Theory and Intelligent Informatics
With the recent rapid growth in the number of web pages, it has become very difficult to find the pages we actually want using common web mining techniques that rely only on text information. In this paper, we present an algorithm for discovering suitable web pages using the notion of a web community, computed from an analysis of the link structure of the web and a set of sample pages given as user input. Experiments show that, for many sample topics, our algorithm discovers more suitable web pages closely related to the input than a related method does. We also discuss how to apply the PageRank measure to improve our algorithm so that it can appropriately handle unwanted advertisement pages in the output.
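As an illustration of the PageRank measure mentioned above, the following is a minimal sketch of the standard power-iteration computation on a small link graph. It is not the algorithm of this paper: the abstract does not specify how PageRank is combined with the web community, so the function name `pagerank`, the damping factor 0.85, the iteration count, and the toy graph `toy_links` (including the page labelled `ad`) are all assumptions made for illustration only.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Standard PageRank by power iteration.

    links maps each page to the list of pages it links to.
    All parameter values here are illustrative assumptions.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start from a uniform distribution over pages.
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the teleportation share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # Split this page's rank evenly among its outgoing links.
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[p] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: "ad" links into the group but nothing links to it.
toy_links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "ad": ["a"],
}
ranks = pagerank(toy_links)
lowest = min(ranks, key=ranks.get)
print(lowest, round(ranks[lowest], 4))
```

In this toy graph the page that no other page links to ends up with the lowest score, which suggests one way a link-based score could help push such pages down in an output list. Whether and how the paper uses the score in this manner is not stated in the abstract.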