Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Related Term Collection
YASUHIRO SASAKI, SATOSHI SATO, TAKEHITO UTSURO
JOURNAL FREE ACCESS

2006 Volume 13 Issue 3 Pages 151-175

Abstract
This paper proposes the related term collection problem and a solution to it. The related term collection problem is defined as collecting a dozen technical terms that are closely related to a given seed term. To solve this problem, we measure the relatedness between the given seed term and a candidate term with either the Jaccard coefficient or the χ² statistic, computed from the hit counts returned by a Web search engine. These measures also verify that the candidate term is a technical term. We have implemented a related term collection system that consists of two modules. The first module collects candidate terms from the web pages retrieved by a search engine. The second module selects the terms that are closely related to the given term by using one of the two measures above. Experimental results show that the system can collect a dozen terms that are closely related to the given term.
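The abstract does not state the formulas, so the following is only a minimal sketch of how the two measures are commonly computed from search engine hit counts, and it may differ from the authors' implementation. It assumes the standard Jaccard coefficient over page sets and the standard χ² statistic for a 2×2 contingency table. The function names, the `total_pages` parameter (an estimate of the number of pages indexed by the search engine), and all hit counts are hypothetical.

```python
def jaccard(hits_seed, hits_cand, hits_both):
    """Jaccard coefficient of two page sets, estimated from hit counts."""
    union = hits_seed + hits_cand - hits_both
    return hits_both / union if union > 0 else 0.0


def chi_square(hits_seed, hits_cand, hits_both, total_pages):
    """Chi-square statistic of the 2x2 table built from hit counts.

    total_pages is an assumed estimate of the index size; the abstract
    does not say how (or whether) the authors obtain it.
    """
    a = hits_both                  # pages containing both terms
    b = hits_seed - hits_both      # seed term only
    c = hits_cand - hits_both      # candidate term only
    d = total_pages - a - b - c    # neither term
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return total_pages * (a * d - b * c) ** 2 / denom


def rank_candidates(hits_seed, candidates, top_k=12):
    """Return the top_k candidates by Jaccard score, highest first.

    candidates maps a term to (hits_cand, hits_both); this stands in for
    the second module described above and is an illustrative assumption.
    """
    scored = [
        (term, jaccard(hits_seed, hits_cand, hits_both))
        for term, (hits_cand, hits_both) in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

With made-up counts, a candidate that shares 25 of its 50 pages with a seed term of 100 pages ranks above a far more frequent candidate that shares only 10 pages with the seed. Both measures reward co-occurrence relative to each term's overall frequency.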
© The Association for Natural Language Processing