
All Times EDT

Abstract Details

Activity Number: 472 - Winners: Business and Economic Statistics Student Paper Awards
Type: Topic Contributed
Date/Time: Wednesday, August 10, 2022 : 2:00 PM to 3:50 PM
Sponsor: Business and Economic Statistics Section
Abstract #322188
Title: Differentially Private Heavy-Tailed Synthetic Data
Author(s): Tran Tran* and Matthew Reimherr and Aleksandra B. Slavkovic
Companies: The Pennsylvania State University and Penn State University and Penn State University
Keywords: differential privacy; synthetic data; heavy-tailed data; longitudinal business database
Abstract:

The Longitudinal Business Database from the U.S. Census Bureau is an invaluable resource for economic research, but it contains a large amount of sensitive information about all U.S. firms. This warrants releasing a synthetic version of the data that protects firms' privacy while remaining usable for research. Differential privacy (DP) provides a framework for strong, provable privacy protection against arbitrary adversaries while allowing the release of summary statistics and synthetic data. However, generating synthetic heavy-tailed data with a formal privacy guarantee while preserving high levels of utility remains a challenge for data curators and researchers. We propose the K-Norm Gradient Mechanism (KNG) in the setting of quantile regression for DP synthetic data generation. The proposed methodology offers the flexibility of the well-known exponential mechanism while adding less noise. We also propose implementing KNG in a stepwise and sandwich order, such that each new quantile estimate relies on previously sampled quantiles, to use the privacy-loss budget more efficiently. We show that the proposed methods can achieve better data utility relative to the original KNG at
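
As a rough illustration of the idea, and not the authors' implementation, the sketch below samples a single private quantile with a KNG-style density: candidate values are weighted by exp(-epsilon * |gradient| / (2 * sensitivity)), where the gradient is that of the quantile (check) loss. Everything specific here is an assumption made for the example: the function name `kng_quantile`, the discretization of a bounded interval into a grid, the replace-one sensitivity bound of 1, and the equal split of the privacy budget across quantiles in the stepwise loop.

```python
import numpy as np

def kng_quantile(x, tau, epsilon, lower, upper, grid_size=1001, rng=None):
    """Sample one private tau-quantile from a KNG-style distribution on a grid.

    Hypothetical helper for illustration only; the grid, the bounds, and the
    sensitivity value are assumptions of this sketch.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Clip to the public bounds so every record lies in [lower, upper].
    x = np.clip(np.asarray(x, dtype=float), lower, upper)
    grid = np.linspace(lower, upper, grid_size)
    n = x.size
    # Number of observations <= each candidate value.
    n_below = np.searchsorted(np.sort(x), grid, side="right")
    # (Sub)gradient of the summed check loss at each candidate:
    # (1 - tau) per point at or below, -tau per point above.
    grad = (1.0 - tau) * n_below - tau * (n - n_below)
    # Each record contributes a term in [-tau, 1 - tau], so replacing one
    # record changes the gradient by at most 1 (assumed adjacency notion).
    sensitivity = 1.0
    log_w = -epsilon / (2.0 * sensitivity) * np.abs(grad)
    log_w -= log_w.max()          # stabilize before exponentiating
    p = np.exp(log_w)
    p /= p.sum()
    return float(rng.choice(grid, p=p))

# Stepwise use: each quantile is searched only above the previous sample,
# with the total budget split equally (sequential composition).
rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=1.5, size=5000)   # heavy-tailed toy data
taus = [0.25, 0.5, 0.75]
eps_each = 3.0 / len(taus)
lo = 0.0
estimates = []
for tau in taus:
    q = kng_quantile(x, tau, eps_each, lo, 1000.0, rng=rng)
    estimates.append(q)
    lo = q   # the next quantile cannot fall below this one
```

Restricting each search to the interval above the previously sampled quantile keeps the estimates monotone and concentrates probability mass where the next quantile can plausibly lie, which is one loose reading of the stepwise ordering described above.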


Authors who are presenting talks have a * after their name.

Back to the full JSM 2022 program