Abstract Details

Activity Number: 423 - Contributed Poster Presentations: Survey Research Methods Section
Type: Contributed
Date/Time: Tuesday, July 30, 2019 : 2:00 PM to 3:50 PM
Sponsor: Survey Research Methods Section
Abstract #306660
Title: Statistical Disclosure Control with Machine Learning
Author(s): Allshine Chen* and Sixia Chen and Yan Daniel Zhao
Companies: and University of Oklahoma Health Sciences Center and University of Oklahoma Health Sciences Center
Keywords: Statistical Disclosure Control; Imputation; Machine Learning; Modeling; Simulation

Statistical disclosure control (SDC) refers to data-masking techniques used to anonymize survey or record data, which often removes barriers to the public release of such data. Rubin (1993) proposed a multiple imputation framework that creates multiple synthetic samples of the population in which subjects are de-identified, yet valid statistical inference remains possible. Within this framework, we propose using modeling and machine learning tools to create and/or impute the synthetic datasets. We will explore the usefulness of spline models, kernel regression, and regression trees, comparing synthetic data with the actual data in terms of correlation, concordance, and inference validity. We will use both simulated and real data to accomplish these goals.
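As a rough illustration of the general idea only, and not the authors' procedure, the sketch below synthesizes a sensitive variable from a single covariate using kernel regression, one of the model classes named in the abstract. Everything in it is an assumption made for illustration: the toy data, the Gaussian kernel, the bandwidth, the residual-resampling step, and the function names `nw_predict`, `synthesize`, and `pearson`. It also skips the parameter-uncertainty draws and the combining rules that a full multiple imputation analysis would need.

```python
import math
import random


def nw_predict(x0, xs, ys, bandwidth):
    """Nadaraya-Watson estimate of E[y | x = x0] with a Gaussian kernel."""
    weights = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def synthesize(xs, ys, m=5, bandwidth=0.5, seed=0):
    """Return m synthetic copies of ys (assumed scheme, for illustration).

    Each synthetic value is the kernel-regression fitted value plus a
    residual drawn at random, with replacement, from the observed residuals.
    """
    rng = random.Random(seed)
    fitted = [nw_predict(x, xs, ys, bandwidth) for x in xs]
    residuals = [y - f for y, f in zip(ys, fitted)]
    return [[f + rng.choice(residuals) for f in fitted] for _ in range(m)]


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


# Toy "actual" data (hypothetical): a noisy nonlinear relationship.
data_rng = random.Random(1)
xs = [data_rng.uniform(0, 10) for _ in range(200)]
ys = [math.sin(x) + data_rng.gauss(0, 0.3) for x in xs]

# Create m = 5 synthetic versions of the sensitive variable.
synthetic = synthesize(xs, ys, m=5)

# One simple utility check: correlation of each synthetic copy with the actual values.
correlations = [pearson(ys, syn) for syn in synthetic]
```

A spline model or a regression tree could be swapped in for `nw_predict` without changing the rest of the sketch. In practice, the synthetic copies would also be assessed for disclosure risk, which this example does not attempt.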

Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program