Online Program Home
My Program

Abstract Details

Activity Number: 619
Type: Contributed
Date/Time: Wednesday, August 3, 2016 : 2:00 PM to 3:50 PM
Sponsor: Section on Bayesian Statistical Science
Abstract #318986 View Presentation
Title: Privately Preserving Algorithms to Release Sparse High-Dimensional Histograms
Author(s): Bai Li* and Rebecca Steorts
Companies: and Duke University
Keywords: differential privacy ; statistical disclosure control ; data release

Data privacy is a difficult and poorly understood problem that is further exacerbated by the large amount of data being collected, stored, analyzed and shared across multiple domains in the age of "big data". Differential privacy (DP) has emerged as a key rigorous definition of privacy or more specifically as a framework that can inform the design of privacy mechanisms with specified disclosure risk, in presence of arbitrary external information, but it is often criticized for lack of utility. We present two algorithms for releasing sparse high dimensional histograms in a differentially private manner. These algorithms are designed as one step of synthetic data generation, which is a methodology for reducing the risk of disclosing sensitive information when data are released to the public. To ensure privacy, all the entries in a histogram must be perturbed with noise, which is inefficient when the dimension is high. Our algorithms improve the efficiency for sparse high dimensional histograms by working only with the non-zero entries and a small proportion of the zero entries. We also explore the tradeoff between the utility and privacy of the synthetic data.

Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program

Copyright © American Statistical Association