Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 22, Issue 2
Preface
Paper
  • Shohei Higashiyama, Kazuhiro Seki, Kuniaki Uehara
    2015 Volume 22 Issue 2 Pages 77-105
    Published: June 16, 2015
    Released on J-STAGE: September 16, 2015
    JOURNAL FREE ACCESS
    With the growing number of medical documents written in electronic form, technologies for automatically extracting terms from unstructured text have become increasingly important. In particular, extracting medical terms such as complaints and diagnoses from medical records is crucial because these terms serve as the basis for more application-oriented tasks, including medical case retrieval. For machine-learning-based term extraction, language resources such as lexica and corpora are effective for recognizing expressions that occur rarely or not at all in the training data. However, using lexica through simple word matching has limited effect because medical records contain compound words formed from various combinations of constituent terms. Therefore, this study presents term extraction systems that exploit language resources by acquiring and utilizing beneficial terms and constituents from those resources. Our experimental results on the NTCIR-10 MedNLP test collection, which comprises medical history summaries, show increased precision and recall, indicating the effectiveness of the proposed system. Moreover, compared with existing systems developed for the NTCIR-10 MedNLP task, the proposed system achieved the best performance for complaint and diagnosis recognition, including the classification of extracted terms into modality attributes.
  • Tetsuro Sasada, Shinsuke Mori, Yoko Yamakata, Hirokuni Maeta, Tatsuya ...
    2015 Volume 22 Issue 2 Pages 107-131
    Published: June 16, 2015
    Released on J-STAGE: September 16, 2015
    JOURNAL FREE ACCESS
    In natural language processing (NLP), recognizing important terms after word recognition (word segmentation, part-of-speech tagging, etc.) is of practical value. In general, terms are word sequences, and in many applications they are classified into different types. A well-known example is named entity recognition, which aims to extract information from newspaper articles. It uses seven or eight types (named entity classes), such as person name, organization name, and amount of money. The definition of important terms depends heavily on the NLP task. We chose term extraction from recipes (cooking procedure texts) as our task. We discuss a process for defining terms and types, annotating a corpus, and constructing a practically accurate automatic recognizer of recipe terms. The recognizer can potentially be applied to search functions that are more intelligent than simple keyword matching, and to symbol grounding research, in which videos are matched with language expressions. Against this background, in this study, we discuss the definition of a tag set for recipe terms and the actual annotation work. Furthermore, we present experimental results on the automatic recognition of recipe terms and provide insight into the number of annotations required to reach a given level of accuracy.