Optimal Structure of a Concept Base of Word Meanings Based on Its Ability to Discriminate Similarity

石川 勉; 井澤 潤次朗; 笠原 要 (Kaname Kasahara)

doi:10.11517/jjsai.13.3_470

Abstract

This paper examines, both analytically and experimentally, the optimal structure of a large-scale knowledge base of words that is constructed automatically from machine-readable dictionaries. In this knowledge base, each word is represented by a series of weighted keywords: the keywords are related to the word, and each weight expresses the strength of that relationship. When constructing this kind of knowledge base, it is important to select the keyword set used to represent every word so that the semantic similarity between words can be measured well. Our analysis, which uses a simplified model of the knowledge base based on probability theory, shows that a smaller keyword set made up of higher-level keywords in the conceptual hierarchy becomes optimal as the knowledge base grows, that is, as the total number of words or the average number of keywords per word increases. An experiment with six knowledge bases derived from a previously constructed knowledge base of 40,000 everyday Japanese words confirms that an optimal keyword set exists. This indicates that the analysis is useful for designing knowledge bases in which each word is represented by a vector. In addition, both a subjective evaluation based on human judgment and a newly proposed objective evaluation using a published synonym dictionary show that a set of about 2,000 keywords is optimal for a knowledge base of this size.
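
The weighted-keyword representation described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure the authors use, so cosine similarity is an assumption here, and the toy words, keywords, weights, the `parent_of` hierarchy, and the function names are all hypothetical. The sketch only shows why mapping keywords to higher-level ones can change the measured similarity between words; it does not reproduce the paper's model or results.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse keyword-weight vectors (dicts).

    Assumption: the paper's actual similarity measure is not given in the
    abstract; cosine similarity is used purely for illustration.
    """
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def coarsen(vector, parent_of):
    """Replace each keyword with its higher-level keyword, summing weights.

    Keywords without an entry in parent_of are kept unchanged.
    """
    out = {}
    for keyword, weight in vector.items():
        parent = parent_of.get(keyword, keyword)
        out[parent] = out.get(parent, 0.0) + weight
    return out


# Hypothetical toy knowledge base: word -> {keyword: weight}.
knowledge_base = {
    "dog": {"puppy": 0.6, "bark": 0.4},
    "cat": {"kitten": 0.7, "meow": 0.3},
    "car": {"engine": 0.8, "wheel": 0.2},
}

# Hypothetical conceptual hierarchy: keyword -> higher-level keyword.
parent_of = {
    "puppy": "animal", "kitten": "animal",
    "bark": "sound", "meow": "sound",
    "engine": "machine", "wheel": "machine",
}

# With fine-grained keywords, "dog" and "cat" share no keyword.
fine_dog_cat = cosine_similarity(knowledge_base["dog"], knowledge_base["cat"])

# With the smaller, higher-level keyword set, they overlap strongly,
# while "dog" and "car" still share nothing.
coarse = {word: coarsen(vec, parent_of) for word, vec in knowledge_base.items()}
coarse_dog_cat = cosine_similarity(coarse["dog"], coarse["cat"])
coarse_dog_car = cosine_similarity(coarse["dog"], coarse["car"])

print(fine_dog_cat, coarse_dog_cat, coarse_dog_car)
```

In this toy setting the coarser keyword set makes related words comparable, but merging keywords too aggressively would eventually make unrelated words look alike as well, which is the kind of trade-off that motivates searching for an optimal keyword set.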
