Online Program

Saturday, May 19
Public Health Applications
Sat, May 19, 1:15 PM - 2:45 PM
Lake Fairfax A

A New Framework for Re-identification Risk Estimation in Complex Healthcare Data (304563)

*Lei Li, George Mason University 
Anand N. Vidyashankar, George Mason University 

Keywords: Re-identification Risk, Risk Metric, Divergence Based Estimation Methods, Zero-inflated Models

Estimating the risk of re-identifying individuals from de-identified healthcare data is critical because of regulatory and contractual restrictions. In recent years, statistical models and metrics have been developed to understand and evaluate this risk. However, when the statistical model is misspecified, the behavior of the risk estimates and their implications for policy are largely unknown. In this presentation, we provide new composite metrics, statistical models, and methods for estimating the risk of re-identification from de-identified data, and we study the theoretical properties of the metrics. We provide numerical algorithms for estimating the risk and evaluate the effects of model misspecification. We illustrate our findings, together with a detailed description of existing policy, on several publicly available healthcare datasets.