Agricultural Information Research
Online ISSN : 1881-5219
Print ISSN : 0916-9482
ISSN-L : 0916-9482
Original Articles
Development and Evaluation of Rich Linguistic Resources for Automatic Indexing of Agricultural Literature
Akane Takezaki, Takashi Hosobami, Daisuke Horyu, Takuji Kiura
JOURNAL FREE ACCESS

2010 Volume 19 Issue 1 Pages 10-15

Abstract

We compiled rich linguistic resources, including a morpheme dictionary and a stop list, for automatic indexing of agricultural literature. Terms from agricultural dictionaries, registered plant variety names, and new terms extracted from records in the Japan Agricultural Science Index (JASI) were incorporated into an agricultural morpheme dictionary. Adding new terms identified by morpheme analysis of JASI records decreased the number of unknown words in subsequent analyses. Morpheme analysis with the general and enriched agricultural morpheme dictionaries combined yielded fewer unknown words than analysis with the general morpheme dictionary alone. Single Roman letters (with the exception of atomic symbols), SI units, reference terms, Indo-Arabic numerals, and other numerals were chosen as stop words. When both the enriched morpheme dictionary and the stop list were used, two-thirds of manually indexed terms corresponded completely or partially to automatically indexed terms. These results suggest that the compiled linguistic resources can improve morpheme analysis and automatic indexing.
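
The stop-word rule and the complete/partial correspondence check described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the word lists are small hypothetical stand-ins (the abstract does not give the actual stop list), the tokens are English placeholders rather than output of Japanese morpheme analysis, and "partial" correspondence is assumed here to mean that one term contains the other as a substring.

```python
import re

# Single-letter element symbols, exempt from the single-letter stop rule.
ATOMIC_SYMBOLS = {"H", "B", "C", "N", "O", "F", "P", "S", "K", "V", "Y", "I", "W", "U"}
# Hypothetical samples only; the paper's real lists are not in the abstract.
SI_UNITS = {"kg", "mol", "cd", "Pa", "Hz"}
REFERENCE_TERMS = {"Fig", "Table"}


def is_stop_word(term):
    """Return True if the term should be dropped before indexing."""
    if re.fullmatch(r"[A-Za-z]", term):
        # Single Roman letters are stop words unless they are atomic symbols.
        return term not in ATOMIC_SYMBOLS
    if term in SI_UNITS or term in REFERENCE_TERMS:
        return True
    # Indo-Arabic numerals (ASCII digits only in this sketch).
    return re.fullmatch(r"[0-9]+", term) is not None


def match_type(manual_term, automatic_terms):
    """Classify how a manually indexed term corresponds to automatic ones."""
    if manual_term in automatic_terms:
        return "complete"
    # Assumption: partial means substring containment in either direction.
    if any(manual_term in a or a in manual_term for a in automatic_terms):
        return "partial"
    return "none"


tokens = ["rice", "N", "x", "kg", "12", "blast"]
index_terms = [t for t in tokens if not is_stop_word(t)]
```

With these placeholder tokens, `index_terms` keeps `rice`, `N`, and `blast`; a manual term such as `rice blast` would then be classified as a partial match against the automatic terms, and `rice` as a complete match.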

© 2010 Japanese Society of Agricultural Informatics