Advanced Biomedical Engineering
Online ISSN : 2187-5219
ISSN-L : 2187-5219
Development of a New Method to Trace Patient Data Using the National Database in Japan
Tomoya Myojin, Tatsuya Noda, Shinichiro Kubo, Yuichi Nishioka, Tsuneyuki Higashino, Tomoaki Imamura
JOURNAL OPEN ACCESS

2022 Volume 11 Pages 203-217

Abstract

The National Database of Health Insurance Claims and Specific Health Checkups of Japan (NDB) is a comprehensive database containing health insurance claim information. The structure of the NDB complicates long-term cohort studies for two main reasons. First, NDB data are stored on a per-claim basis. Second, the NDB has a billing-focused record structure. Therefore, the objective of this study was to use ID0 to restructure the data to allow for long-term cohort studies, provided that the data volume does not increase and the runtime per data year stays within one month. The NDB uses two primary keys (ID1 and ID2) made from hash values that mask personally identifiable information. ID0 is our recently developed key derived from ID1 and ID2, which improves patient-matching efficiency with excellent long-term tracing performance. Our study used claim data with filing dates between April 2013 and March 2016 to trace hospitalizations of one month or longer, including outpatient care, in three steps. In Step 1, claims were transferred to a CD-record format. As some diagnosis procedure combination (DPC) claim records contain a mixture of overlapping comprehensive and piece-rate data, we sorted and reorganized them. In Step 2, pharmacy and medical outpatient claims were integrated using the ID0 key, the code of the medical institution that issued the prescription, and the prescription issue date. In Step 3, the transferred data were combined and converted from consecutive hospitalization days into sequences based on ID0, the medical institution code, and hospital ward classification. Consequently, the originally extracted comma-separated values (CSV) dataset for three years (approximately 10.5 TB) was reduced to a main database file of approximately 6 TB that was usable for processing. The process took approximately three months. With similar conventional methods, the data size was 30 times larger, and it took more than seven months to process a year's worth of data.
In addition, to demonstrate the application of this method, we conducted a six-year mortality cohort study covering all Japanese citizens. Our technique makes it easy to perform follow-up and longitudinal cohort surveys while accurately tracing patient data in large-scale medical claims databases.
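
The record linkage in Step 2 and the day-to-sequence conversion in Step 3 can be illustrated with a minimal sketch. This is not the authors' implementation: the record layout, the field names (`id0`, `institution`, `prescriber`, `issued`, `visit`, `ward`), and the assumption that an outpatient visit date equals the prescription issue date are all hypothetical, chosen only to show the kind of keyed join and run-length grouping the abstract describes.

```python
from datetime import date, timedelta
from itertools import groupby
from typing import NamedTuple


class InpatientDay(NamedTuple):
    id0: str          # patient key (hypothetical string format)
    institution: str  # medical institution code
    ward: str         # hospital ward classification
    day: date         # one day of hospitalization


class Episode(NamedTuple):
    id0: str
    institution: str
    ward: str
    start: date
    end: date


def link_pharmacy(outpatient, pharmacy):
    """Step 2 sketch: attach pharmacy claims to outpatient claims that share
    (ID0, prescribing institution code, prescription issue date)."""
    index = {}
    for p in pharmacy:
        index.setdefault((p["id0"], p["prescriber"], p["issued"]), []).append(p)
    # Assumption: the outpatient visit date is the prescription issue date.
    return [
        {**o, "pharmacy": index.get((o["id0"], o["institution"], o["visit"]), [])}
        for o in outpatient
    ]


def collapse_days(days):
    """Step 3 sketch: merge consecutive days that share
    (ID0, institution, ward) into start/end sequences."""
    def key(d):
        return (d.id0, d.institution, d.ward)

    episodes = []
    # set() drops duplicate days; sorting orders each group's days by date.
    ordered = sorted(set(days), key=lambda d: (key(d), d.day))
    for k, group in groupby(ordered, key=key):
        start = prev = None
        for d in group:
            if prev is not None and d.day - prev == timedelta(days=1):
                prev = d.day  # extend the current run
                continue
            if start is not None:
                episodes.append(Episode(*k, start, prev))  # close the previous run
            start = prev = d.day
        episodes.append(Episode(*k, start, prev))  # close the final run
    return episodes
```

Storing one row per sequence rather than one row per hospitalization day is one plausible reason a restructured dataset can be smaller than its per-claim source, although the abstract does not attribute the size reduction to any single step.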

© 2022 Japanese Society for Medical and Biological Engineering

Copyright: ©2022 The Author(s). This is an open access article distributed under the terms of the Creative Commons BY 4.0 International (Attribution) License (https://creativecommons.org/licenses/by/4.0/legalcode), which permits the unrestricted distribution, reproduction and use of the article provided the original source and authors are credited.