Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Construction of Related Terms Thesauri from the Web
TAKESHI SAKAKI, YUTAKA MATSUO, KOKI UCHIYAMA, MITSURU ISHIZUKA
JOURNAL FREE ACCESS

2007 Volume 14 Issue 2 Pages 3-31

Abstract
This paper describes a method for automatically constructing thesauri of related terms from Web information. We use a Web search engine to obtain word co-occurrence information and propose a new, efficient similarity metric based on the χ² (chi-square) value that addresses problems with existing methods. We also introduce a new method for identifying related terms through word clustering: we cluster the associative network of words using a recent clustering method, the “Newman method”. We evaluate the approach against sets of related terms extracted from a corpus and from an existing thesaurus, and show its effectiveness.
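The abstract does not give the exact form of the chi-square-based similarity, so the following is only a minimal sketch of the general idea, assuming the standard Pearson chi-square statistic on a 2x2 co-occurrence table built from search hit counts. The function name `chi_square_2x2`, its parameters, and all hit counts below are hypothetical illustrations, not the authors' implementation or data.

```python
def chi_square_2x2(n_a, n_b, n_ab, n_total):
    """Pearson chi-square statistic for a 2x2 co-occurrence table.

    n_a     : hit count for word A alone (assumed input)
    n_b     : hit count for word B alone (assumed input)
    n_ab    : hit count for pages containing both A and B
    n_total : assumed total number of indexed pages
    """
    # Observed cell counts of the 2x2 contingency table.
    o11 = n_ab                          # A and B
    o12 = n_a - n_ab                    # A without B
    o21 = n_b - n_ab                    # B without A
    o22 = n_total - n_a - n_b + n_ab    # neither A nor B
    if min(o11, o12, o21, o22) < 0:
        raise ValueError("inconsistent hit counts")

    # Product of the row and column marginals.
    denom = n_a * (n_total - n_a) * n_b * (n_total - n_b)
    if denom == 0:
        return 0.0  # degenerate table: no evidence of association

    # Closed form of the Pearson chi-square for a 2x2 table.
    return n_total * (o11 * o22 - o12 * o21) ** 2 / denom


# Hypothetical hit counts, for illustration only.
independent = chi_square_2x2(n_a=100, n_b=100, n_ab=10, n_total=1000)
associated = chi_square_2x2(n_a=100, n_b=100, n_ab=50, n_total=1000)
print(independent, associated)
```

When two words co-occur exactly as often as chance predicts, the statistic is zero, and it grows as the observed co-occurrence departs from that expectation. Pairwise scores of this kind could serve as edge weights in a word network that is then clustered.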
© The Association for Natural Language Processing