Journal of Japan Society for Fuzzy Theory and Intelligent Informatics
Online ISSN : 1881-7203
Print ISSN : 1347-7986
ISSN-L : 1347-7986
R&D Papers
Adaptive Classification of Web Documents by Page Type and Evaluation of Prototype System
Daisuke KANEKO, Tsuyoshi TAKAYAMA, Tetsuo IKEDA, Wataru OSANAI
JOURNAL FREE ACCESS

2006 Volume 18 Issue 2 Pages 319-336

Abstract

Research is under way on displaying search results in groups so that search engine users can understand them more easily. Classification-based approaches use fixed hierarchical category labels as category names, while dynamic clustering derives category names from the search results and keywords. However, neither approach satisfies users on the following criteria: semantic validity, meaning that category names and categorizations are easy to understand and not redundant; pertinence, meaning that the web documents grouped under a user-selected category provide information that is effective for solving the user's problem; formal validity, meaning that undesired types of pages are not included; and minimal cross-category redundancy, meaning that needed web documents are not repeated across categories and the target information can be found easily. Based on an analysis of the problems with conventional techniques, this paper proposes an adaptive classification technique that responds to the user's selective input, with six groups of page types as candidate categories. A prototype system based on the proposed technique is evaluated by comparison with Yahoo and Vivisimo, representative public search engines that offer grouped display of results. The prototype system scored up to 36.7% higher in the evaluation than these conventional systems.

© 2006 Japan Society for Fuzzy Theory and Intelligent Informatics