2022, Vol. 13, No. 1, pp. 77-84
Publishing data in a way that keeps it available while preserving privacy is important. Doing so requires balancing the tradeoff between privacy (security) and availability, because the two are not necessarily related even if they can be quantified. To handle this tradeoff, the parameters for data anonymization must be set appropriately. This step usually relies on manual trial and error, but automating the anonymization is convenient when data must be published frequently. Here, we propose an automatic parameter tuning method that maintains the availability expected of the data while preserving privacy. The method uses multi-objective optimization in which the objective functions are indices of security and availability and the design parameters are the parameters needed for anonymization. We use the Non-dominated Sorting Genetic Algorithm II (NSGA-II) to account for the tradeoff between privacy and availability. Experiments on diabetes data show that our method maintains availability while preserving privacy and obtains non-dominated solutions that satisfy the requirement. Our method incorporates various anonymization methods, such as top-bottom coding, k-anonymization, outlier removal, random replacement, and differential privacy, within the framework of multi-objective optimization. In addition, to select the non-dominated solution that is most robust against attack, we apply a record-linking attack to the non-dominated solutions, and the results show the effectiveness of this approach.
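The notion of a non-dominated solution used above can be illustrated with a minimal sketch. It is not the NSGA-II procedure itself, only the dominance test on which non-dominated sorting is based. The two objectives (a privacy-risk index and an availability-loss index, both minimized) and all numeric values are hypothetical and chosen for illustration, not taken from the experiments.

```python
from typing import List, Tuple

# Hypothetical objective pair: (privacy_risk, availability_loss), both minimized.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates (the Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


# Illustrative scores for five candidate anonymization parameter settings.
candidates = [(0.10, 0.60), (0.20, 0.30), (0.25, 0.35), (0.50, 0.10), (0.55, 0.40)]
front = non_dominated(candidates)
print(front)
```

Each point that remains in `front` represents a parameter setting for which no other candidate is at least as good in both objectives and better in one, so choosing among them is a matter of preference or of an additional criterion such as robustness against attack.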