Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
A Construction of Large-scale Concept-base for Calculation of Degree of Association between Concepts
NORIYUKI OKUMURA, SEIJI TSUCHIYA, HIROKAZU WATABE, TSUKASA KAWAOKA
JOURNAL FREE ACCESS

2007 Volume 14 Issue 5 Pages 41-64

Abstract
Humans associate words with one another in everyday conversation. For example, we naturally associate ‘Tire’, ‘Engine’, ‘Accident’, and so on with ‘Automobile’, and we develop the content of a conversation through such associations. A Concept-base plays a key role in realizing this association mechanism on computers. In a Concept-base, the meaning of each word (concept) is defined by a set of attributes and their weights. A previously proposed construction method extracts concepts (about 40,000 words) and their attributes from the definition texts of electronic dictionaries. However, the number of concepts and attributes that can be extracted from dictionaries is small, and the resulting Concept-base has problems with association accuracy.
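As a rough illustration of the representation described above, a concept can be held as a mapping from attribute words to weights. The sketch below is only an assumption for illustration: the entries and weights are invented, the name `cosine_association` is hypothetical, and cosine similarity is a stand-in for the paper's Degree of Association, whose definition is not given in this abstract.

```python
import math

# Hypothetical toy Concept-base: concept -> {attribute: weight}.
# All entries and weights are invented for illustration only.
concept_base = {
    "automobile": {"tire": 0.5, "engine": 0.4, "accident": 0.1},
    "bicycle": {"tire": 0.6, "pedal": 0.4},
    "flower": {"petal": 0.7, "garden": 0.3},
}


def cosine_association(a, b):
    """Stand-in association score: cosine similarity of two attribute-weight maps.

    This is NOT the paper's Degree of Association; it is a common,
    simple substitute used here only to show how weighted attributes
    can be compared.
    """
    shared = set(a) & set(b)
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Under this stand-in measure, two concepts score above zero only when they share at least one attribute, so the quality of the attribute sets directly limits the quality of the association scores.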
In this paper, we expand a Concept-base constructed from the definition texts of electronic dictionaries by using co-occurrence information from general texts such as electronic newspapers, and we propose a method for constructing a Concept-base on the scale of 120,000 words. To extend the Concept-base, we first obtain basic concepts, together with highly reliable attributes, from the dictionary definition texts of each headword. Next, taking the dictionary-based Concept-base as a starting point, we collect co-occurring words from electronic newspapers as attribute candidates. Improper attributes (noise attributes) are then removed using the Degree of Association of attributes, which improves attribute quality. In addition, each attribute is assigned a weight of the kind commonly used in information retrieval and text mining, by treating the Concept-base as a set of virtual documents. Finally, an experiment on the Degree of Association shows that the Concept-base built by the proposed method is more accurate than one built from dictionaries alone.
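The filtering and weighting steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say which weighting scheme is used, so a plain TF-IDF variant is assumed here, and the function names `filter_noise` and `tfidf_weights`, the `association` callable, and the threshold are all hypothetical.

```python
import math
from collections import Counter


def filter_noise(concept, candidates, association, threshold):
    """Keep candidate attributes whose association score reaches the threshold.

    `association` is a caller-supplied scoring function (hypothetical);
    the paper's Degree of Association is not defined in the abstract.
    """
    return [c for c in candidates if association(concept, c) >= threshold]


def tfidf_weights(virtual_docs):
    """Assumed TF-IDF weighting over virtual documents.

    virtual_docs maps each concept to its list of attribute tokens
    (repeats allowed), i.e. each concept is treated as one document.
    """
    n_docs = len(virtual_docs)
    # Document frequency: number of concepts that contain each attribute.
    df = Counter()
    for attrs in virtual_docs.values():
        df.update(set(attrs))

    weights = {}
    for concept, attrs in virtual_docs.items():
        tf = Counter(attrs)
        total = len(attrs)
        weights[concept] = {
            attr: (count / total) * math.log(n_docs / df[attr])
            for attr, count in tf.items()
        }
    return weights
```

With this assumed scheme, an attribute that appears in every concept's virtual document receives a weight of zero, while an attribute concentrated in a few concepts is weighted more heavily.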
© The Association for Natural Language Processing