Abstract Details

Activity Number: 119 - SPEED: Government and Health Policy
Type: Contributed
Date/Time: Monday, July 30, 2018 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistical Learning and Data Science
Abstract #329596
Title: DataSifter: Statistical Obfuscation of Electronic Health Record and Other Sensitive Data Sets
Author(s): Nina Zhou* and Simeone Marino and Lu Wang and Yiwang Zhou and Ivo Dinov
Companies: University of Michigan and Statistics Online Computational Resource, University of Michigan and University of Michigan and University of Michigan and Statistics Online Computational Resource, University of Michigan
Keywords: Open science; Electronic Health Record; Non-parametric imputation; Differential privacy; Artificial missingness
Abstract:

Currently, there are no reliable and effective mechanisms for sharing clinical data that contain no clearly identifiable protected health information (PHI) without altering the data structure. We developed a novel statistical protocol, DataSifter, for de-identification of structured clinical data. It promotes Open Science by enabling health system administrators to share clinical data requested by researchers. The method performs iterative data manipulation that stochastically selects, nullifies, imputes, and exchanges feature values among the subjects. This process relies heavily on non-parametric imputation for mixed-type data to preserve the joint distribution. At each step, the DataSifter generates a complete dataset that closely resembles the original cohort. On an individual level, however, the feature values are substantially altered. This procedure drastically reduces the risk of subject re-identification by stratification, because the meta-data for all subjects are repeatedly transformed while the overall population characteristics and data structure are preserved. Validation of the DataSifter on simulated and EHR case studies generated promising results in terms of privacy protection and inference reliability.
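
The select, nullify, impute, and exchange cycle described in the abstract can be illustrated with a minimal Python sketch. This is not the DataSifter implementation: the function names (`sift`, `sift_once`), the parameters (`rounds`, `null_frac`, `swap_frac`), and the choice of nearest-donor (hot-deck) imputation as a stand-in for the non-parametric imputation step are all assumptions made for illustration, and the sketch handles numeric features only rather than mixed-type data.

```python
import random


def _distance(a, b, features):
    """Mean squared difference over features observed in both records."""
    shared = [f for f in features if a[f] is not None and b[f] is not None]
    if not shared:
        return float("inf")
    return sum((a[f] - b[f]) ** 2 for f in shared) / len(shared)


def sift_once(records, rng, null_frac, swap_frac):
    """One hypothetical pass: nullify, impute, then exchange feature values.

    `records` is a non-empty list of dicts with identical numeric keys.
    """
    features = list(records[0].keys())
    n = len(records)
    out = [dict(r) for r in records]

    # 1. Stochastically select cells and nullify them (artificial missingness).
    for row in out:
        for f in features:
            if rng.random() < null_frac:
                row[f] = None

    # 2. Impute each missing cell from the most similar subject that still
    #    has that feature observed (hot-deck; an assumed stand-in, not the
    #    authors' imputation method).
    holes = [dict(r) for r in out]  # frozen view, so donors are observed values
    for i in range(n):
        for f in features:
            if holes[i][f] is None:
                donors = [j for j in range(n)
                          if j != i and holes[j][f] is not None]
                if donors:
                    best = min(donors,
                               key=lambda j: _distance(holes[i], holes[j], features))
                    out[i][f] = holes[best][f]
                else:
                    # No donor available: keep the pre-nullification value
                    # so the output dataset stays complete.
                    out[i][f] = records[i][f]

    # 3. Exchange values of the same feature between randomly paired subjects.
    for f in features:
        for i in range(n):
            if rng.random() < swap_frac:
                j = rng.randrange(n)
                out[i][f], out[j][f] = out[j][f], out[i][f]
    return out


def sift(records, rounds=3, null_frac=0.3, swap_frac=0.2, seed=0):
    """Apply several passes; the input list is left unmodified."""
    rng = random.Random(seed)
    current = [dict(r) for r in records]
    for _ in range(rounds):
        current = sift_once(current, rng, null_frac, swap_frac)
    return current
```

Under these assumptions, calling `sift(cohort)` on a list of per-subject dictionaries returns a complete dataset of the same shape in which every value is drawn from the original values of the same feature, while individual rows no longer match the original subjects. Larger `null_frac`, `swap_frac`, or `rounds` would trade inference reliability for stronger obfuscation.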


Authors who are presenting talks have a * after their name.

Back to the full JSM 2018 program