All Times EDT

Abstract Details

Activity Number: 3 - Data Privacy: Statisticians’ Perspective
Type: Invited
Date/Time: Sunday, August 8, 2021, 1:30 PM to 3:20 PM
Sponsor: SSC (Statistical Society of Canada)
Abstract #316602
Title: Understanding Risk Measures for Synthetic Data Sets
Author(s): Anne-Sophie Charest*
Companies: Université Laval
Keywords: privacy; synthetic data; confidentiality; risk measures; differential privacy

Sharing synthetic data instead of the original data is now a relatively common way of protecting the confidentiality of the individuals appearing in a dataset. Research on how best to generate these synthetic datasets is growing fast, and proposals now include various joint modeling approaches, fully conditional modeling strategies, and complex deep learning methods. Measuring the confidentiality guarantee of such synthetic datasets, however, remains difficult. Some measures, such as differential privacy, can be quantified a priori and relate to the process by which the synthetic dataset will be created. Other measures are computed post hoc on the datasets to be released. Some of those take into account the process by which the synthetic data were generated, such as the Bayesian risk measure proposed by Reiter and collaborators; others, such as the CAP statistic, do not. In this presentation, I will give an overview of the proposed risk measures for synthetic data and discuss how they relate to each other and what they really measure. This will lead us to ponder the fine line between inference and inferential disclosure.

Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program