Japan Journal of Medical Informatics
Online ISSN : 2188-8469
Print ISSN : 0289-8055
ISSN-L : 0289-8055
Short Notes
Evaluation of an English-Japanese Thesaurus Based on the Analysis of Biomedical Corpora
S Kaneko, N Fujita
JOURNAL FREE ACCESS

2005 Volume 25 Issue 6 Pages 475-483

Abstract
Life Science Dictionary (LSD) is a versatile database of English and Japanese terms based on quantitative analyses of biomedical corpora. To develop a thesaurus of LSD terms for future application to computer-assisted text mining, we evaluated the frequency of LSD terms in literature-based corpora and mapped the LSD terms to the MeSH tree. Coverage of LSD English terms in a PubMed-based corpus was 80%. Of the 65,000 MeSH tree terms, 20% matched LSD terms; this proportion increased to 40% in the subpopulation of terms that occurred in the English corpus. The MeSH-unmatched LSD terms included abbreviations, verbs, adjectives, adverbs, and MeSH-unclassified terms. These results indicate the need for a new comprehensive thesaurus tree covering complex English-Japanese translations.
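The matching rates reported above can be illustrated with a minimal sketch. It assumes that "matching" means exact, case-insensitive string equality between a MeSH term and an LSD term, which the abstract does not specify; the function name `match_rate` and the toy term lists are hypothetical and are not taken from the study.

```python
def normalize(term):
    """Lowercase and trim a term (assumed matching rule, not the authors')."""
    return term.strip().lower()


def match_rate(mesh_terms, lsd_terms, corpus_terms=None):
    """Return the fraction of MeSH terms that also appear among LSD terms.

    If corpus_terms is given, only MeSH terms that occur in the corpus
    are counted, mirroring the "subpopulation" figure in the abstract.
    """
    mesh = {normalize(t) for t in mesh_terms}
    lsd = {normalize(t) for t in lsd_terms}
    if corpus_terms is not None:
        # Restrict the denominator to MeSH terms seen in the corpus.
        mesh &= {normalize(t) for t in corpus_terms}
    if not mesh:
        return 0.0
    return len(mesh & lsd) / len(mesh)


# Toy data for illustration only; not from the paper.
mesh_terms = ["Neoplasms", "Apoptosis", "Kinetics", "Zea mays", "Hepatocytes"]
lsd_terms = ["apoptosis", "hepatocytes", "inhibit", "significantly"]
corpus_terms = ["apoptosis", "hepatocytes", "neoplasms", "inhibit"]

overall = match_rate(mesh_terms, lsd_terms)
in_corpus = match_rate(mesh_terms, lsd_terms, corpus_terms)
print(f"overall: {overall:.2f}, corpus-restricted: {in_corpus:.2f}")
```

In this toy data, restricting the denominator to MeSH terms that actually occur in the corpus raises the rate, which is the same direction of change the abstract reports. Verbs and adverbs such as "inhibit" and "significantly" remain unmatched because they have no counterpart in the MeSH term list.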
© 2005 Japan Association for Medical Informatics