Abstract:
|
Differential Privacy (DP) methods require the introduction of additional randomness, beyond sampling, to protect privacy. With DP, the privacy mechanism itself is not secret, allowing it to be incorporated into statistical analysis. However, the resulting likelihood function requires integrating over the large space of unobserved databases. We develop general-purpose methods for practitioners to obtain valid statistical inference in this setting. In particular, we propose customized MCMC procedures that efficiently sample from the posterior distribution conditional on the privatized output. By augmenting the MCMC procedure with the latent database, we ensure that Gibbs updates have acceptance probability bounded below in terms of the privacy constraint, guaranteeing convergence comparable to the nonprivate case. When the privacy parameter epsilon is small, our method is highly efficient, but when epsilon is large, the acceptance probability of the sampler drops. There are also models in which a Gibbs sampler mixes poorly even without the complication of privacy. We propose more sophisticated data augmentation and tempering schemes to improve convergence in these cases.
|