82 – Monte Carlo Methods: Models and Tests
Alternative Variance Estimators for Data Perturbed for Confidentiality Protection
Jianzhu Li
Westat
Michael D. Larsen
The George Washington University
Tom Krenzke
Westat
Laura Zayatz
U.S. Census Bureau
One method of protecting the confidentiality of tabular data is to apply random perturbation to selected variables in the underlying microdata. When estimates are derived from a data file altered in this way, variance estimation must account for the variability introduced by the perturbation. In previous work, we studied methods for estimating variances from a single perturbed data set and developed a variance estimator that incorporates a variance component associated with data perturbation. In this paper, we explore three alternatives to that initial estimator, with the goal of increasing the stability of variance estimation, especially when estimates are extreme. The first alternative modifies the initial estimator by using multiple perturbed data sets. The second alternative is a limited bootstrap approach: the bootstrap samples are perturbed multiple times, replicate estimates are produced, and the variance is computed among the replicate estimates. The third alternative adjusts the initial estimator using ideas from small area estimation. Computational aspects of the estimators are discussed. A simulation study was conducted to evaluate and compare the performance of the initial and alternative variance estimators, using selected variables in two test sites from the American Community Survey 2005-2009 sample data. The results are summarized in terms of the coverage rates and margins of error of the estimators.
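To make the second alternative concrete, the following is a minimal sketch of a bootstrap-with-perturbation variance calculation. It is not the authors' implementation. It assumes a simple random sample of a 0/1 variable, a hypothetical perturbation that flips each value with a fixed probability, and a pooled variance over all replicates; the function names, the flip probability, and the replicate counts are illustrative only, and the complex ACS sample design is ignored.

```python
import random
import statistics


def perturb(values, flip_prob, rng):
    """Hypothetical perturbation: flip each 0/1 value with probability flip_prob."""
    return [1 - v if rng.random() < flip_prob else v for v in values]


def bootstrap_perturbation_variance(values, flip_prob=0.05, n_boot=200,
                                    n_perturb=5, seed=0):
    """Variance of a proportion among perturbed bootstrap replicates.

    Assumes simple random sampling with replacement (an illustrative
    simplification, not the survey's actual design).
    """
    rng = random.Random(seed)
    n = len(values)
    replicate_estimates = []
    for _ in range(n_boot):
        # Draw one bootstrap sample with replacement.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        # Perturb that bootstrap sample several times.
        for _ in range(n_perturb):
            perturbed = perturb(resample, flip_prob, rng)
            # Replicate estimate: proportion of ones in the perturbed sample.
            replicate_estimates.append(sum(perturbed) / n)
    # Variance among all replicate estimates.
    return statistics.variance(replicate_estimates)


if __name__ == "__main__":
    data = [1] * 30 + [0] * 70  # toy sample with a 30% proportion
    print(bootstrap_perturbation_variance(data, seed=1))
```

Because each bootstrap sample is perturbed afresh, the spread of the replicate estimates reflects both sampling and perturbation variability, which is the property the limited bootstrap approach relies on.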