423 – Disclosure Avoidance, Data Privacy, and Perturbed Data: Protecting Sensitive Data
Measuring the Degree of Difference in Perturbed Data
Marlow Lemons
U.S. Census Bureau
Aref Dajani
U.S. Census Bureau
Jiashen You
U.S. Census Bureau
John Jordan
U.S. Census Bureau
Statistical agencies have an official responsibility to limit disclosure in order to protect respondent identity. Data swapping is a common technique for meeting that responsibility. Because swapping perturbs the data, it is important to evaluate the quality of the perturbed data. We investigate several metrics that quantify the degree of discrepancy between two tabulated data sets. The metrics range from established statistics, such as the Gini index and Shannon entropy, to more heuristic measures, such as the effective swap rate. A simulation study compares the distributions of these statistics under different settings of swap rate and skewness. Applications to the one-year American Community Survey are presented.
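
As a rough illustration of how two of the named statistics might be compared across an original and a perturbed tabulation, the following Python sketch computes Shannon entropy and a Gini index on cell counts. It is not the authors' method: the abstract does not say which form of the Gini index is used, so the Gini-Simpson form (one minus the sum of squared cell proportions) is assumed here, and the function names and the example counts are hypothetical. The effective swap rate is omitted because the abstract does not define it.

```python
import math


def proportions(counts):
    """Convert non-negative cell counts into cell proportions."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return [c / total for c in counts]


def shannon_entropy(counts):
    """Shannon entropy (natural log) of a tabulation; empty cells contribute 0."""
    return -sum(p * math.log(p) for p in proportions(counts) if p > 0)


def gini_simpson(counts):
    """Gini index in its Gini-Simpson form (an assumption): 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in proportions(counts))


# Hypothetical cell counts for one table before and after perturbation.
original = [50, 30, 20]
swapped = [40, 35, 25]

# One simple discrepancy measure: the change in each statistic.
entropy_shift = shannon_entropy(swapped) - shannon_entropy(original)
gini_shift = gini_simpson(swapped) - gini_simpson(original)

print(f"entropy shift: {entropy_shift:.4f}")
print(f"Gini shift:    {gini_shift:.4f}")
```

In this made-up example the perturbed table is closer to uniform than the original, so both shifts are positive. A simulation study of the kind described would repeat such a comparison over many swapped tables to obtain a distribution for each statistic.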