Japan Journal of Medical Informatics
Online ISSN : 2188-8469
Print ISSN : 0289-8055
ISSN-L : 0289-8055
Original Article
Classification of Discharge Summaries by Text Mining
Hiroki ONO, Katsuhiko TAKABAYASHI, Takahiro SUZUKI, Hideto YOKOI, Atsushi IMIYA, Youichi SATOMURA

2004 Volume 24 Issue 1 Pages 35-44

Abstract

Objectives: To evaluate whether text mining can select words specific to a diagnosis and distinguish the diseases described in discharge summaries. Materials and methods: We selected 4,317 discharge summaries from Chiba University Hospital covering 13 representative diseases. Diagnosis-related terms were extracted by morphological analysis. The diseases were then compared with one another using a tf×idf vector space model, and the important disease-specific words were selected for each disease. We also applied the vector space model to new cases and displayed the resulting vector as a radar chart. Results: 7,918 words were selected from the cases, and 74% of 390 cases were correctly diagnosed. The maximum-tree problem and dendrogram method showed reasonable relationships among the 13 diseases. Conclusion: These results suggest that text mining is applicable to the automatic classification of medical documents by diagnosis.
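As an illustration of the tf×idf vector space approach named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes each disease is represented by one pooled bag of already-tokenized terms, uses the idf variant log(N/df) + 1, and compares a new case to each disease by cosine similarity; the abstract specifies none of these details. The disease names, tokens, and function names (build_tfidf, cosine, classify) are hypothetical, and the morphological analysis step is taken as already done.

```python
import math
from collections import Counter


def build_tfidf(class_tokens):
    """Build one tf x idf vector per disease.

    class_tokens maps a disease label to the pooled list of terms from
    its summaries. The idf form log(N / df) + 1 is an assumption here.
    """
    n_classes = len(class_tokens)
    tf = {d: Counter(toks) for d, toks in class_tokens.items()}
    df = Counter()
    for counts in tf.values():
        df.update(counts.keys())  # count each term once per disease
    idf = {w: math.log(n_classes / df[w]) + 1.0 for w in df}
    vectors = {
        d: {w: c * idf[w] for w, c in counts.items()}
        for d, counts in tf.items()
    }
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v.get(w, 0.0) for w, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify(tokens, vectors, idf):
    """Score a new case against every disease vector.

    Terms unseen in training are dropped. The returned scores, one per
    disease, are the values one could plot on a radar chart.
    """
    counts = Counter(t for t in tokens if t in idf)
    query = {w: c * idf[w] for w, c in counts.items()}
    scores = {d: cosine(query, vec) for d, vec in vectors.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy data standing in for morphologically analyzed summaries.
training = {
    "asthma": ["wheeze", "inhaler", "dyspnea", "steroid", "cough"],
    "diabetes": ["insulin", "glucose", "hba1c", "diet", "steroid"],
    "gastric_cancer": ["gastrectomy", "endoscopy", "biopsy", "chemotherapy"],
}
vectors, idf = build_tfidf(training)
best, scores = classify(["cough", "wheeze", "steroid", "inhaler"], vectors, idf)
print(best, scores)
```

In this toy case the new summary scores highest against the asthma vector, because "steroid" appears in two diseases and so carries a lower idf weight than the asthma-only terms. Pooling terms per disease keeps the number of comparison vectors equal to the number of diseases, which matches the idea of showing one axis per disease on a radar chart.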

© 2004 Japan Association for Medical Informatics