Abstract:
|
Motivated by the lack of formal privacy protections on CDC WONDER, recent work has proposed creating a Synthetic CDC WONDER based on a differentially private Poisson-gamma modeling framework. This framework samples values from the posterior predictive distribution that arises from modeling event count data with a Poisson likelihood and assuming a gamma prior on the underlying event rate. The Poisson-gamma framework incorporates (and relies on) publicly available information, such as estimates of the underlying population sizes and event rates, to improve its utility, and it protects the sensitive data by increasing the informativeness of the prior distribution. The goal of this work is to compare the Poisson-gamma framework and the Laplace mechanism for the purpose of generating a synthetic dataset comprising the 26,000 cancer-related deaths in Pennsylvania counties from 1980. We show that while the Poisson-gamma framework preserves inference on quantities such as urban/rural and black/white disparities in death rates, the Laplace mechanism – when forced to produce non-negative values – can fail to preserve both the magnitude and direction of such disparities.
|
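As a rough illustration of the two mechanisms compared above, the following minimal Python sketch draws one synthetic count from a Poisson-gamma posterior predictive distribution and one from a Laplace mechanism. Everything named here is an assumption made for illustration: the function names, the prior parameters `a` and `b` (gamma shape and rate), the population size `n`, and the choice to enforce non-negativity by rounding and clamping at zero. In particular, the sketch does not calibrate the prior to meet a specific privacy budget, which the differentially private framework would require.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson variate with Knuth's multiplication method.

    Adequate for the small means used in this illustration; exp(-mean)
    underflows for very large means.
    """
    threshold = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def poisson_gamma_synthetic(y, n, a, b, rng):
    """Sample y* from the posterior predictive of a Poisson-gamma model.

    Assumed model: y ~ Poisson(n * lam), lam ~ Gamma(shape=a, rate=b),
    so the posterior is lam | y ~ Gamma(shape=a + y, rate=b + n).
    """
    # random.gammavariate takes (shape, scale), hence scale = 1 / rate.
    lam = rng.gammavariate(a + y, 1.0 / (b + n))
    return sample_poisson(n * lam, rng)


def laplace_synthetic(y, epsilon, rng):
    """Add Laplace(0, 1/epsilon) noise to a count (sensitivity 1).

    Non-negativity is forced here by rounding and clamping at zero,
    which is one simple choice and an assumption of this sketch.
    """
    scale = 1.0 / epsilon
    # The difference of two i.i.d. exponentials is Laplace distributed.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return max(0, round(y + noise))


rng = random.Random(0)
# Hypothetical county-level cell: 3 deaths in a population of 5000,
# with a prior centered on a rate of 1 per 1000 (a / b = 0.001).
print(poisson_gamma_synthetic(y=3, n=5000, a=1.0, b=1000.0, rng=rng))
print(laplace_synthetic(y=3, epsilon=0.5, rng=rng))
```

For small counts, clamping Laplace noise at zero shifts the expected synthetic value upward, which gives some intuition for how rate comparisons between small subgroups could be distorted.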