Abstract Details

Activity Number: 153 - Developing Multi-Purpose Imputed or Synthetic Data for Official Statistics
Type: Invited
Date/Time: Monday, July 29, 2019 : 10:30 AM to 12:20 PM
Sponsor: Government Statistics Section
Abstract #300148 Presentation
Title: Developing Synthetic Data from the Economic Census Under Edit and Calibration Restrictions
Author(s): Katherine J Thompson* and Hang Joon Kim
Companies: U.S. Census Bureau and University of Cincinnati
Keywords: synthetic data; mixture models

Many institutions and researchers require access to detailed microdata. Historically, agencies have selectively allowed access to the “real data” under strict security restrictions that respect disclosure avoidance regulations. Some programs produce public use microdata samples (PUMS), i.e., highly sanitized versions of the real data. Many agencies are investigating methods of developing synthetic microdata as an alternative to PUMS. However, when the collected microdata are subject to regulatory privacy laws, it is often challenging to develop synthetic microdata that simultaneously preserve complex inter-item relationships and protect the privacy of individual respondents. It is especially difficult to maintain this balance when developing synthetic data for highly skewed economic populations: the information contained in the right tails is indispensable for accurate tabulations and is equally sensitive to disclosure. We present a novel synthetic data generation method designed specifically for skewed multivariate data that preserves key statistical properties of both the unit-level microdata and the tabulated estimates while respecting potential disclosure risk as applicable.

Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program