Activity Number:
|
591 - Synthetic Data and Data Disclosure
|
Type:
|
Contributed
|
Date/Time:
|
Wednesday, August 1, 2018 : 2:00 PM to 3:50 PM
|
Sponsor:
|
Government Statistics Section
|
Abstract #329778
|
|
Title:
|
A Top-Down Algorithm for Releasing Differentially Private Hierarchical Multi-Dimensional Contingency Tables with Exact Constraints
|
Author(s):
|
Robert Ashmead* and John M Abowd and Simson Garfinkel and Michael Hay and Dan Kifer and Philip Leclerc and Ashwin Machanavajjhala and Ryan McKenna and Gerome Miklau and Brett Moran and William Sexton
|
Companies:
|
U.S. Census Bureau and U.S. Census Bureau and U.S. Census Bureau and Colgate University and Penn State University and U.S. Census Bureau and Duke University and University of Massachusetts, Amherst and University of Massachusetts, Amherst and U.S. Census Bureau and U.S. Census Bureau
|
Keywords:
|
Differential Privacy;
Disclosure Avoidance;
Decennial Census
|
Abstract:
|
Mechanisms that satisfy differential privacy provide quantifiable guarantees of privacy for the release of statistical data and are increasingly being utilized in both private industry and official statistics. One of the most prevalent data release formats is the multi-dimensional contingency table in which data from a population or sample are cross-classified according to categorical variables. In a case like the Decennial Census, where many tables are released, it is possible to apply a noise perturbation method to each of the released tables separately. However, such a method would produce tables that were not consistent with one another, which is an undesirable property. Motivated by the 2020 Decennial Census, this research considers a top-down algorithm for the release of differentially private hierarchical multi-dimensional contingency tables in order to maintain consistency between tables and accuracy across the geographical hierarchy. Further, we incorporate sets of exact constraints into the released tables to satisfy legal mandates or publicly released information. We discuss the steps of the top-down algorithm and illustrate the accuracy using a real data example.
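The consistency problem described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it assumes the Laplace mechanism with sensitivity 1, a two-level hierarchy (one state, three counties), made-up counts, and a simple equal-shift least-squares adjustment in place of whatever optimization the actual top-down algorithm uses. The names `laplace`, `noisy`, and `project_to_total` are hypothetical helpers introduced only for this example.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy(counts, epsilon, rng):
    # Laplace mechanism for disjoint counts (sensitivity 1 is an assumption).
    return [c + laplace(1.0 / epsilon, rng) for c in counts]


def project_to_total(children, total):
    # Least-squares adjustment: shift every child by the same amount
    # so that the children sum exactly to `total`.
    shift = (total - sum(children)) / len(children)
    return [c + shift for c in children]


rng = random.Random(0)
county_counts = [120, 45, 310]      # hypothetical child-level counts
state_count = sum(county_counts)    # parent-level count

# Independent release: parent and children are perturbed separately,
# so the noisy children generally do not add up to the noisy parent.
noisy_state = noisy([state_count], 0.5, rng)[0]
noisy_counties = noisy(county_counts, 0.5, rng)
gap_independent = abs(sum(noisy_counties) - noisy_state)

# Top-down style release: fix the parent estimate first, then adjust
# the children so that they agree with it.
consistent_counties = project_to_total(noisy_counties, noisy_state)
gap_top_down = abs(sum(consistent_counties) - noisy_state)

print("independent gap:", gap_independent)
print("top-down gap:", gap_top_down)
```

Passing a publicly known exact total to `project_to_total` instead of `noisy_state` would mimic an exact constraint at the parent level. The sketch ignores non-negativity and integer-valued counts, which a real release would need to handle.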
|
Authors who are presenting talks have a * after their name.