Joho Chishiki Gakkaishi
Online ISSN : 1881-7661
Print ISSN : 0917-1436
ISSN-L : 0917-1436
Introduction of ComeJisyo and investigation of typographical errors contained in medical information
Kaoru SAGARA

2014 Volume 24 Issue 2 Pages 204-209

Abstract
 The increasingly widespread use of electronic health record systems means that large amounts of medical information are accumulated in text format. To support machine readability and analysis of this text by natural language processing (NLP), we have released ComeJisyoV5-1, which contains 77,760 medical term entries for use in morphological analysis of text.
 Furthermore, to aid NLP of medical text, it is important to know what kinds of typographical errors occur, so that they can be reduced and the terminology can still be interpreted by computer. In an ethically approved investigation of medical information at two facilities, 53 kinds of typographical errors were found and analyzed.
 Japanese text entry is a two-step conversion process: typed Roman alphabetic characters are first converted into kana, and the kana are then exchanged for selected kanji. The results showed that most errors occurred in the second step, the exchange of kana for kanji, where 46 terms were converted into homophones or other same-sounding kanji.
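 The two-step conversion described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the lookup tables, the function names (`romaji_to_kana`, `kana_to_kanji`, `is_homophone_error`) and the example terms are hypothetical and are not taken from ComeJisyo or from the investigated data, and a real input method is far more elaborate. The sketch only shows why the second step is error-prone: one kana reading maps to several kanji candidates, so picking the wrong candidate yields a valid but unintended homophone.

```python
# Toy illustration only: these tables are hypothetical and tiny.
# They are NOT taken from ComeJisyo or from the study's data.

# Step 1 table: Roman alphabetic chunks -> kana.
ROMAJI_TO_KANA = {"ki": "き", "ka": "か", "se": "せ", "n": "ん"}

# Step 2 table: one kana reading -> several same-sounding kanji candidates.
KANA_TO_KANJI = {
    "きかん": ["気管", "器官", "期間"],   # trachea / organ / period
    "かんせん": ["感染", "汗腺", "幹線"],  # infection / sweat gland / trunk line
}


def romaji_to_kana(romaji: str) -> str:
    """Step 1: convert typed Roman characters to kana by greedy longest match."""
    kana = []
    i = 0
    while i < len(romaji):
        for size in (3, 2, 1):
            chunk = romaji[i:i + size]
            if chunk in ROMAJI_TO_KANA:
                kana.append(ROMAJI_TO_KANA[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"no kana for input at position {i}: {romaji[i:]!r}")
    return "".join(kana)


def kana_to_kanji(kana: str, choice: int = 0) -> str:
    """Step 2: exchange kana for the kanji candidate the typist selects."""
    return KANA_TO_KANJI[kana][choice]


def is_homophone_error(intended: str, typed: str) -> bool:
    """True if the two terms differ but share a reading in the toy table."""
    if intended == typed:
        return False
    return any(
        intended in candidates and typed in candidates
        for candidates in KANA_TO_KANJI.values()
    )


if __name__ == "__main__":
    reading = romaji_to_kana("kikan")
    intended = kana_to_kanji(reading, choice=0)  # the term the typist wanted
    typed = kana_to_kanji(reading, choice=1)     # a neighbouring candidate picked by mistake
    print(reading, intended, typed, is_homophone_error(intended, typed))
```

 In this sketch the Roman-to-kana step is deterministic, while the kana-to-kanji step depends on a selection among candidates, which is the step where a homophone can slip in unnoticed.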
© 2014 Japan Society of Information and Knowledge