All Times EDT

Abstract Details

Activity Number: 6 - JSSAM Special Issue: Privacy, Confidentiality, and Disclosure Protection
Type: Invited
Date/Time: Sunday, August 7, 2022 : 2:00 PM to 3:50 PM
Sponsor: Journal of Survey Statistics and Methodology
Abstract #320523
Title: A Semiparametric Multiple Imputation Approach to Fully Synthetic Data for Complex Surveys
Author(s): Mandi Yu and Yulei He and Trivellore Eachambadi Raghunathan*
Companies: National Cancer Institute and National Center for Health Statistics and University of Michigan
Keywords: Statistical confidentiality limitation; Combining rule; Complex survey; Ridge-penalized logistic regression; Alternating conditional expectation
Abstract:

Generating fully synthetic data is an effective statistical approach for reducing data disclosure risk. This article extended the two-stage imputation approach to simultaneously impute item missing values and generate fully synthetic data. A new combining rule was developed for making inferences from data generated in this manner. Two semiparametric missing data imputation models were adapted to generate fully synthetic data for skewed continuous variables and sparse binary variables, respectively. The proposed approach was evaluated using simulated data and real longitudinal data from the Health and Retirement Study. Using real data, it was also compared with two existing synthesis approaches: (1) parametric regression models as implemented in IVEware; and (2) non-parametric classification and regression trees as implemented in the synthpop package for R. The results show that the proposed strategy maintains high data utility for a wide variety of descriptive and model-based statistics. It also performs better than the existing methods for sophisticated analyses such as factor analysis.


Authors who are presenting talks have a * after their name.

Back to the full JSM 2022 program