
All Times EDT

Abstract Details

Activity Number: 167 - Data Mining and Econometrics
Type: Contributed
Date/Time: Tuesday, August 10, 2021 : 10:00 AM to 11:50 AM
Sponsor: Social Statistics Section
Abstract #318745
Title: Generating Differentially Private Synthetic Heavy-Tailed Data
Author(s): Tran Tran* and Matthew Reimherr and Aleksandra Slavkovic
Companies: Pennsylvania State University and Pennsylvania State University and Pennsylvania State University
Keywords: differential privacy; synthetic data; heavy-tailed data
Abstract:

Despite the need for broad dissemination of sensitive microdata, generating differentially private synthetic heavy-tailed data with high levels of utility remains a challenge for researchers. Differential privacy (DP) provides a framework for strong, provable privacy protection against arbitrary adversaries while allowing the release of summary statistics and synthetic data. We propose using the K-Norm Gradient Mechanism (KNG) in the setting of quantile regression for DP synthetic data generation. The proposed methodology offers the flexibility of the well-known exponential mechanism while adding less noise. We also demonstrate how to improve privacy-loss budget usage by combining KNG with the so-called stepwise and sandwich schemes, which estimate each quantile using previously estimated DP quantiles. Through a simulation study and an application to the Synthetic Longitudinal Business Database (SynLBD), we demonstrate that the proposed methods can achieve good data utility with a low privacy-loss budget.
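
To make the idea concrete, the following is a minimal sketch of a KNG-style sampler for a single unconditional quantile, not the authors' implementation. It assumes the data are clipped to a known public interval [lower, upper], that neighboring datasets differ by replacing one record (so the gradient of the check loss changes by at most 1), and that the released value is drawn from a density proportional to exp(-eps * |#{x_i <= theta} - n * tau| / (2 * sensitivity)). The function name kng_quantile and all of its parameters are hypothetical.

```python
import numpy as np


def kng_quantile(x, tau, eps, lower, upper, rng=None, sensitivity=1.0):
    """Draw one DP estimate of the tau-th quantile (illustrative sketch).

    Assumption: the gradient of the check loss at theta is
    #{x_i <= theta} - n * tau, which is piecewise constant between
    consecutive order statistics, so the target density is too.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Clip to the assumed public bounds and sort.
    x = np.clip(np.sort(np.asarray(x, dtype=float)), lower, upper)
    n = x.size
    # n + 1 intervals: [lower, x_(1)], [x_(1), x_(2)], ..., [x_(n), upper].
    edges = np.concatenate(([lower], x, [upper]))
    widths = np.diff(edges)
    # Interval k has exactly k data points at or below it.
    counts = np.arange(n + 1)
    log_density = -eps * np.abs(counts - n * tau) / (2.0 * sensitivity)
    # Interval mass = density * width; work in logs for stability.
    with np.errstate(divide="ignore"):
        log_mass = log_density + np.log(widths)
    log_mass -= log_mass.max()
    probs = np.exp(log_mass)
    probs /= probs.sum()
    # Pick an interval, then a uniform point inside it.
    k = rng.choice(n + 1, p=probs)
    return rng.uniform(edges[k], edges[k + 1])


# Example: a private median of heavy-tailed (Student t, 3 df) draws.
rng = np.random.default_rng(0)
data = rng.standard_t(df=3, size=2000)
private_median = kng_quantile(data, tau=0.5, eps=1.0,
                              lower=-50.0, upper=50.0, rng=rng)
```

Because the density is piecewise constant, sampling reduces to choosing an interval between order statistics and then drawing uniformly within it. The choice of bounds matters for heavy-tailed data: wide bounds put more mass on sparse tail intervals, which is one reason tail quantiles are harder to release accurately.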


Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program