2015 Volume 38 Issue 1 Pages 53-57
When a claims database is used for research, inaccurate identification of breast cancer from claims data may lead to invalid study results. The aim of this study was to assess the accuracy of definitions for identifying breast cancer cases from a Japanese claims database. The study cohort consisted of women with no prior cancer-related history, drawn from the claims data at a single institution between January 1 and December 31, 2011. We developed 14 definitions for identifying breast cancer from claims data, using combinations of diagnosis codes and treatment procedure codes. We calculated the sensitivity, specificity, and positive predictive value (PPV) of each definition, using cases identified from the standardized hospital-based cancer registry as the reference standard. A total of 50,056 women from the claims database were included in the study cohort. We identified 633 breast cancer cases from the cancer registry. Of the 14 definitions, 12 had a sensitivity above 90%, while the other two had a sensitivity below 40%. The specificities of all definitions were high (≥99%), and the PPVs ranged from 65.8% to 90.7%. We selected as optimal the definition that combined diagnosis codes with cancer treatment codes (surgery, chemotherapy, medication, radiation procedure), which had high sensitivity (90.4%), specificity (99.8%), and PPV (87.3%). Definitions that combine diagnosis codes and procedure codes could be used to accurately identify breast cancer cases from the claims database. Further studies in a multi-institutional setting are planned to confirm our results.
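The three accuracy measures reported above follow from a two-by-two comparison of each claims-based definition against the registry. The sketch below is a minimal illustration of that calculation, not the authors' implementation. The function name `evaluate_definition`, the `Accuracy` container, and the patient IDs are all hypothetical, and the toy cohort is invented for illustration and does not reproduce the study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accuracy:
    sensitivity: float
    specificity: float
    ppv: float


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a measure is undefined
    # (for example, PPV when the definition flags nobody).
    return numerator / denominator if denominator else float("nan")


def evaluate_definition(cohort, flagged, registry_cases) -> Accuracy:
    """Compare one claims-based definition with the registry (reference standard).

    cohort:         IDs of everyone in the study cohort
    flagged:        IDs the claims-based definition labels as breast cancer
    registry_cases: IDs recorded as breast cancer in the cancer registry
    """
    cohort = set(cohort)
    flagged = set(flagged) & cohort        # ignore IDs outside the cohort
    cases = set(registry_cases) & cohort

    tp = len(flagged & cases)              # flagged and in the registry
    fp = len(flagged - cases)              # flagged but not in the registry
    fn = len(cases - flagged)              # in the registry but not flagged
    tn = len(cohort) - tp - fp - fn        # neither flagged nor in the registry

    return Accuracy(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
    )


# Toy data (hypothetical): 20 patients, 4 registry cases, 4 flagged by the definition.
cohort = range(1, 21)
registry_cases = {1, 2, 3, 4}
flagged = {1, 2, 3, 10}

result = evaluate_definition(cohort, flagged, registry_cases)
print(result)
```

With a low prevalence such as the one in this cohort (633 cases among 50,056 women), specificity stays close to 100% even when the number of false positives is not negligible relative to the true positives, which is why PPV is the more discriminating measure when comparing definitions.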