Abstract
This paper describes a method to automatically construct thesauri of related terms from Web information. We use a Web search engine to obtain word co-occurrence information and propose a new, efficient similarity metric based on the chi-square (χ²) value that addresses problems of existing methods. We also introduce a new method to identify related terms using word clustering. We perform word clustering on the resulting associative network of words to identify related terms, using a recent clustering method, the “Newman method”. We evaluate our approach using sets of related terms extracted from a corpus and an existing thesaurus, and show its effectiveness.
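
As an illustration only, the sketch below computes the standard Pearson χ² statistic for a 2×2 co-occurrence table built from page counts, which is one common way to turn co-occurrence information into an association score. The function name, the argument names, and the use of a total page count are assumptions made for this example; the metric proposed in this paper may differ in its details.

```python
def chi_square_2x2(n_ab: int, n_a: int, n_b: int, n_total: int) -> float:
    """Pearson chi-square statistic for a 2x2 word co-occurrence table.

    All arguments are hypothetical page counts:
      n_ab    -- pages containing both words
      n_a     -- pages containing word A
      n_b     -- pages containing word B
      n_total -- assumed total number of pages considered
    """
    # Observed cells of the contingency table.
    both = n_ab
    a_only = n_a - n_ab
    b_only = n_b - n_ab
    neither = n_total - n_a - n_b + n_ab

    # Row and column totals.
    row_a, row_not_a = both + a_only, b_only + neither
    col_b, col_not_b = both + b_only, a_only + neither

    denom = row_a * row_not_a * col_b * col_not_b
    if denom == 0:
        # Degenerate table (a word occurs on no page or on every page).
        return 0.0
    return n_total * (both * neither - a_only * b_only) ** 2 / denom


# Independent words give a score of zero; words that always occur
# together give the maximum score for the table.
independent = chi_square_2x2(n_ab=25, n_a=50, n_b=50, n_total=100)
always_together = chi_square_2x2(n_ab=50, n_a=50, n_b=50, n_total=100)
```

A higher value indicates that the two words co-occur more (or less) often than independence would predict, so in practice such a score is usually combined with a check that the observed co-occurrence exceeds the expected one.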