Journal of Information Processing
Online ISSN : 1882-6652
ISSN-L : 1882-6652
Privacy-Preserving Multiple Linear Regression of Vertically Partitioned Real Medical Datasets
Hiroaki Kikuchi, Chika Hamanaga, Hideo Yasunaga, Hiroki Matsui, Hideki Hashimoto, Chun-I Fan

2018 Volume 26 Pages 638-647

Abstract

This paper studies the feasibility of privacy-preserving data mining (PPDM) in epidemiological studies. As the data-mining algorithm, we focus on multiple linear regression, which can be used to identify the most significant factors among many candidate variables, such as the history of many diseases. We aim to identify a linear model that quantifies the most significant cause of death from a distributed dataset containing patient and disease information. We conducted an experiment using a real medical dataset on stroke and applied multiple regression with six predictors, including age, sex, and medical scales such as the Japan Coma Scale and the modified Rankin Scale. The contributions of this paper are (1) a practical privacy-preserving protocol for multiple linear regression over vertically partitioned datasets; (2) a demonstration of the feasibility of the proposed system using the real medical dataset distributed between two parties: the hospital, which knows the technical details of diseases while patients are hospitalized, and the local government, which knows about residents even after they have left the hospital; (3) an evaluation of the accuracy and performance of the PPDM system, which allows us to estimate the expected processing time when an arbitrary number of predictors is used; and (4) a study of the complexity of extended models of vertical partitioning.
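
To illustrate what "vertically partitioned" means for multiple linear regression, the following is a minimal plaintext sketch, not the protocol proposed in the paper. It assumes two parties, A and B, hold different columns for the same records, and that party B also holds the outcome. The normal equations then split into blocks: some can be computed locally by one party, while the cross-party blocks are the ones a privacy-preserving protocol would have to compute securely. All function names, the data values, and the assignment of columns to parties are hypothetical, and no cryptography is performed here.

```python
from fractions import Fraction


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def solve(a, b):
    """Solve a @ w = b by Gauss-Jordan elimination over exact fractions.

    Raises StopIteration if the matrix is singular (no non-zero pivot).
    """
    n = len(a)
    aug = [[Fraction(v) for v in row] + [Fraction(b[i])] for i, row in enumerate(a)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n] for row in aug]


def fit_joint(x, y):
    """Least squares on the pooled table: solve (X^T X) w = X^T y."""
    xt = transpose(x)
    rhs = [r[0] for r in matmul(xt, [[v] for v in y])]
    return solve(matmul(xt, x), rhs)


def fit_partitioned(xa, xb, y):
    """Same fit, assembled block by block from the two column sets.

    Assumption: party A holds xa; party B holds xb and y.
    """
    xat, xbt = transpose(xa), transpose(xb)
    col_y = [[v] for v in y]
    aa = matmul(xat, xa)   # local to party A
    bb = matmul(xbt, xb)   # local to party B
    ab = matmul(xat, xb)   # cross-party block: needs a secure protocol
    ba = transpose(ab)
    gram = [ra + rc for ra, rc in zip(aa, ab)] + [rc + rb for rc, rb in zip(ba, bb)]
    a_y = [r[0] for r in matmul(xat, col_y)]   # cross-party: A's columns, B's outcome
    b_y = [r[0] for r in matmul(xbt, col_y)]   # local to party B
    return solve(gram, a_y + b_y)


# Synthetic records (not from the paper's dataset).
# Party A columns: intercept, age, a coma-scale-like score.
XA = [[1, 62, 0], [1, 71, 1], [1, 55, 0], [1, 80, 3], [1, 67, 2], [1, 74, 1]]
# Party B column: a disability-scale-like score; B also holds the outcome Y.
XB = [[1], [3], [0], [5], [2], [4]]
Y = [12, 30, 7, 58, 25, 41]

X_JOINED = [ra + rb for ra, rb in zip(XA, XB)]
print(fit_partitioned(XA, XB, Y) == fit_joint(X_JOINED, Y))
```

In this sketch the block-wise fit reproduces the pooled fit exactly, because the blocks are just a partition of the same Gram matrix. The design point is that only `ab` and `a_y` involve data from both parties, so these are the quantities where the cost of a secure computation would be concentrated.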

© 2018 by the Information Processing Society of Japan