2024, Vol. 32, pp. 757-766
Synthetic data generation techniques are promising for anonymizing high-dimensional tabular datasets, and their privacy protection can be evaluated by membership inference attacks. However, the existing evaluation framework has two limitations: (1) it cannot evaluate the worst case because the target sample is chosen randomly; and (2) the decision criterion of the adversary's inference is a black box, since the adversary conducts membership inference using machine learning models. In this paper, we propose a framework that overcomes both limitations in a simple and transparent fashion. To address limitation (1), we introduce a statistical distance for choosing a vulnerable target sample. To address limitation (2), we propose two interpretable inference methods: one based on typical statistics scores, and the other based on the Euclidean distance from the target sample. We conduct extensive experiments on two datasets and five synthesis algorithms to confirm the effectiveness of our framework. The experiments show that our framework enables a tighter evaluation of the privacy of synthetic data generation techniques.
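As a minimal illustration of the distance-based ideas sketched in the abstract, the following hypothetical Python example selects a vulnerable target record as the one farthest, on average, from the rest of the original data, and then infers membership by thresholding the minimum Euclidean distance between the target and the synthetic records. This is not the authors' implementation: the use of mean pairwise distance as the "statistical distance", the nearest-neighbor decision rule, and the threshold value are all assumptions made for illustration.

```python
# Hypothetical sketch of distance-based target selection and membership inference.
# Assumptions (not from the paper): the "statistical distance" is approximated by a
# record's mean Euclidean distance to all other records, and the adversary's decision
# rule is a simple threshold on the minimum distance to the synthetic data.
import numpy as np

def choose_vulnerable_target(real_data: np.ndarray) -> int:
    """Pick the record that is farthest, on average, from the rest of the dataset."""
    dists = np.linalg.norm(real_data[:, None, :] - real_data[None, :, :], axis=-1)
    mean_dist = dists.sum(axis=1) / (len(real_data) - 1)  # exclude the zero self-distance
    return int(np.argmax(mean_dist))

def infer_membership(target: np.ndarray, synthetic_data: np.ndarray,
                     threshold: float) -> bool:
    """Claim 'member' if some synthetic record lies within `threshold` of the target."""
    min_dist = np.min(np.linalg.norm(synthetic_data - target, axis=1))
    return min_dist <= threshold

# Toy usage: 100 real records with 5 features, a synthetic release of 200 records.
rng = np.random.default_rng(0)
real = rng.normal(size=(100, 5))
synthetic = rng.normal(size=(200, 5))
target_idx = choose_vulnerable_target(real)
print(infer_membership(real[target_idx], synthetic, threshold=1.0))
```

Because both steps reduce to explicit distance computations, the adversary's decision criterion remains fully interpretable, in contrast to inference based on trained machine learning models.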