Consideration of New Word Extraction from Large-scale Directory Service

Kazutoshi Takeshita; Yasufumi Takama

doi:10.14864/fss.21.0.22.0

21st Fuzzy System Symposium

Session ID : 7B2-2

Conference information

Host: Japan Society for Fuzzy Theory and Intelligent Informatics

Co-host: International Fuzzy Systems Association, IEEE Computational Intelligence Society Japan Chapter

Consideration of New Word Extraction from Large-scale Directory Service

*Kazutoshi Takeshita, Yasufumi Takama

Keywords: Information Retrieval, Directory Service, Thesaurus, New Word

CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

Documents from many specialized fields are now published on the Web and updated frequently. As a result, many field-specific new words appear that are not listed in dictionaries. When a document containing such words is processed by computer for indexing or information extraction, handling these new words becomes a problem. This paper proposes a method for extracting new words from the category names of a large-scale Web directory service. The method relies on several characteristics of a word: the number of hits it returns in a Google search, the number of categories in the directory service that contain it, and its part-of-speech pattern. Experimental results show that these characteristics are effective for extracting new words.
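As a rough illustration of the three characteristics named in the abstract, the following Python sketch gathers them for one candidate word. It is not the authors' implementation: the abstract does not say how the characteristics are computed or combined, so no decision rule is shown. The names `WordFeatures` and `extract_features`, the substring test for category membership, the precomputed `hit_counts` table standing in for a search engine query, and all sample data are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

# A candidate word as a sequence of (token, part-of-speech) pairs.
# The tagger that would produce these pairs is outside this sketch.
TaggedWord = Sequence[Tuple[str, str]]


@dataclass(frozen=True)
class WordFeatures:
    """Hypothetical container for the three characteristics in the abstract."""
    surface: str
    hit_count: int       # number of search hits for the word
    category_count: int  # number of directory categories containing the word
    pos_pattern: str     # part-of-speech tags joined with "-"


def extract_features(
    tagged: TaggedWord,
    category_names: List[str],
    hit_counts: Dict[str, int],
) -> WordFeatures:
    # Assumption: tokens are joined with a space to form the surface string.
    surface = " ".join(token for token, _ in tagged)
    # Assumption: a category "contains" a word if the word is a substring
    # of the category name.
    category_count = sum(1 for name in category_names if surface in name)
    pos_pattern = "-".join(pos for _, pos in tagged)
    # Assumption: hit counts were collected beforehand; unknown words get 0.
    hit_count = hit_counts.get(surface, 0)
    return WordFeatures(surface, hit_count, category_count, pos_pattern)


# Made-up sample data, for illustration only.
categories = [
    "Computers/Internet/Weblog",
    "Society/Weblog Tools",
    "Arts/Music",
]
features = extract_features([("Weblog", "NOUN")], categories, {"Weblog": 1200})
print(features)
```

Keeping the hit counts in a plain dictionary avoids tying the sketch to any particular search API; a classifier or threshold rule over these features would be a separate step that the abstract does not specify.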

Corresponding author
