All Times EDT

Abstract Details

Activity Number: 82 - Contributed Poster Presentations: Government Statistics Section
Type: Contributed
Date/Time: Monday, August 3, 2020 : 10:00 AM to 2:00 PM
Sponsor: Government Statistics Section
Abstract #312767
Title: Generating Fully-Synthetic Discrete Data
Author(s): Sixia Chen and Allshine Chen* and Daniel Zhao
Companies: University of Oklahoma Health Sciences Center and University of Oklahoma Health Sciences Center and University of Oklahoma Health Sciences Center
Keywords: fully-synthetic; discrete; multiple-imputation; survey sampling; statistical disclosure
Abstract:

Fully-synthetic data is becoming increasingly prevalent as demand grows for sharing data in private or public domains. The two key measures that must be addressed when creating synthetic data are its utility and its disclosure risk. For discrete data, there is no well-accepted method for synthesis, nor for quantifying utility and risk. In our study, we use generalized additive modeling with multiple imputation to create fully-synthetic data. We compare our results with those from random forest and classification and regression tree (CART) methods, using both simulated and real data as template data. We also describe how we calculate risk and utility for fully-synthetic discrete data.
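
To make the general workflow concrete, the following is a minimal sketch of sequential synthesis for discrete data, with several synthetic copies drawn in the spirit of multiple imputation. It is not the authors' method: empirical frequency tables conditioned on the preceding variable stand in for the generalized additive models, and the utility measure (total variation distance between marginals) and risk proxy (share of synthetic records matching a unique original record) are assumptions chosen only for illustration. All function names and the toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict

def fit_chain(rows):
    """Tabulate P(x_1) and P(x_j | x_{j-1}) from the template data.
    Assumption: a first-order chain stands in for a fitted model such as a GAM."""
    n_cols = len(rows[0])
    tables = [defaultdict(Counter) for _ in range(n_cols)]
    for row in rows:
        for j in range(n_cols):
            parent = row[j - 1] if j > 0 else None
            tables[j][parent][row[j]] += 1
    return tables

def synthesize(tables, n, rng):
    """Draw n fully synthetic records, one variable at a time."""
    records = []
    for _ in range(n):
        rec = []
        for j, table in enumerate(tables):
            parent = rec[j - 1] if j > 0 else None
            counts = table[parent]
            values = sorted(counts)
            weights = [counts[v] for v in values]
            rec.append(rng.choices(values, weights=weights)[0])
        records.append(tuple(rec))
    return records

def marginal_tvd(orig, synth, j):
    """Illustrative utility: total variation distance for column j (0 = identical)."""
    p = Counter(r[j] for r in orig)
    q = Counter(r[j] for r in synth)
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p[k] / len(orig) - q[k] / len(synth)) for k in keys)

def unique_match_rate(orig, synth):
    """Illustrative risk proxy: share of synthetic records that exactly
    match an original record occurring only once in the template data."""
    counts = Counter(orig)
    uniques = {r for r, c in counts.items() if c == 1}
    return sum(r in uniques for r in synth) / len(synth)

# Hypothetical toy template data: (sex, area, response).
original = [
    ("F", "urban", "yes"), ("F", "urban", "no"), ("M", "rural", "no"),
    ("M", "urban", "yes"), ("F", "rural", "no"), ("M", "rural", "no"),
]

rng = random.Random(0)          # fixed seed so the sketch is reproducible
tables = fit_chain(original)
m = 3                           # number of synthetic copies
synthetic_sets = [synthesize(tables, len(original), rng) for _ in range(m)]

for i, synth in enumerate(synthetic_sets, start=1):
    tvds = [round(marginal_tvd(original, synth, j), 3) for j in range(3)]
    print(f"copy {i}: marginal TVD = {tvds}, "
          f"unique-match rate = {unique_match_rate(original, synth):.3f}")
```

In a comparison like the one described above, the frequency tables would be replaced by the model under study (for example a GAM, random forest, or CART fit for each variable), while the utility and risk summaries are computed on each synthetic copy and then combined.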


Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program