All Times EDT

Abstract Details

Activity Number: 3 - Data Privacy: Statisticians’ Perspective
Type: Invited
Date/Time: Sunday, August 8, 2021 : 1:30 PM to 3:20 PM
Sponsor: SSC (Statistical Society of Canada)
Abstract #316652
Title: Disclosure Control for Microdata: A Mixture Modeling Approach
Author(s): Bei Jiang* and Adrian Raftery and Russell J. Steele and Naisyin Wang
Companies: University of Alberta and University of Washington and McGill University and University of Michigan
Keywords: statistical disclosure control; mixture modeling; k-anonymity; microdata; data augmentation; risk subgroups

Statistical disclosure control (SDC) methods are a class of privacy- and utility-preserving techniques that deliberately perturb the original data before public release. The goal of SDC methods is to reduce disclosure risk to an acceptable level while releasing public-use data sets (known as synthetic data sets) that still preserve the information in the original data set. In this work, we investigate a mixture-based multiple-imputation synthesis method that applies different degrees of perturbation to records (individuals) with different levels of disclosure risk. The first step of the method uses the concept of k-anonymity proposed by Sweeney (2002) to divide individuals into subgroups of different disclosure risk levels according to given risk thresholds. Then, through a data augmentation step, we introduce a tuning mechanism into the construction of the imputation models to further control information loss and hence provide different levels of protection to individuals in different risk subgroups. We illustrate the proposed method with a simulation study and a real data application.
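
The first step described above, grouping records by disclosure risk using k-anonymity, can be illustrated with a minimal sketch. This is not the authors' implementation; the function name `risk_subgroups`, the quasi-identifier columns, the toy records, and the threshold values are all assumptions made for illustration. The sketch counts how many records share each combination of quasi-identifier values (the equivalence-class size that k-anonymity is based on) and maps that size to a risk level using ascending cutoffs.

```python
from collections import Counter


def risk_subgroups(records, quasi_identifiers, thresholds):
    """Assign each record a risk level from its equivalence-class size.

    `thresholds` holds ascending class-size cutoffs (hypothetical values).
    Level 0 is the highest-risk group: records whose quasi-identifier
    combination is shared by fewer records than the first cutoff.
    """
    # Key each record by its combination of quasi-identifier values.
    keys = [tuple(rec[q] for q in quasi_identifiers) for rec in records]
    # Equivalence-class size: number of records sharing the same key.
    class_sizes = Counter(keys)
    # A record's level is the number of cutoffs its class size reaches.
    return [sum(class_sizes[key] >= t for t in thresholds) for key in keys]


# Toy microdata (fabricated purely for illustration).
records = [
    {"age": "30-39", "region": "A", "sex": "F"},
    {"age": "30-39", "region": "A", "sex": "F"},
    {"age": "30-39", "region": "A", "sex": "F"},
    {"age": "40-49", "region": "A", "sex": "M"},
    {"age": "40-49", "region": "A", "sex": "M"},
    {"age": "70-79", "region": "B", "sex": "F"},
]

levels = risk_subgroups(records, ["age", "region", "sex"], thresholds=(2, 3))
print(levels)  # [2, 2, 2, 1, 1, 0]
```

In this toy example the last record is unique on its quasi-identifiers and falls into the highest-risk group (level 0), so under the approach described in the abstract it would be a candidate for the strongest perturbation. How the imputation models and the tuning mechanism then use these subgroups is not specified in the abstract and is not sketched here.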

Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program