536 – Design and Analysis Options for Discrete Data
Balancing Use of Weights, Predictions, and Locality Effects in a Model-Assisted Constrained Hot Deck Approach for Perturbation
Tom Krenzke
Westat
Jianzhu Li
Westat
Laura Zayatz
U.S. Census Bureau
This paper applies a random perturbation approach to protect microdata that are released to the public. The classical challenge is to balance reducing disclosure risk against retaining data utility. An approach has been developed that gives the data producer the flexibility to achieve this balance. Hot deck cells are formed from sampling weights, model predictions and/or covariates, the locality of the target records, and categorized bins of the target variable. Expanding or contracting the bins lets the data producer control the distance between original and perturbed values. An evaluation was conducted to study the impact of the bin categories, sampling weights, model predictions, and locality effects using 2005-2009 American Community Survey sample data.
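The cell construction described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how donors are selected, so the sketch assumes a simple random draw with replacement within each cell, and it assumes the sampling weights and model predictions have already been categorized into classes. The function name `constrained_hot_deck` and the field names `income`, `weight_class`, `pred_class`, and `locality` are hypothetical.

```python
import random
from collections import defaultdict


def constrained_hot_deck(records, target, cell_keys, bin_width, seed=0):
    """Replace each record's target value with a donor value from its cell.

    A cell is defined by the values of `cell_keys` (assumed here to be
    pre-categorized weight, prediction, and locality classes) plus the
    bin of the target variable. Donor selection is an assumption of this
    sketch: a uniform draw with replacement from the cell's members.
    """
    rng = random.Random(seed)

    # Group record indices by hot deck cell.
    cells = defaultdict(list)
    for i, rec in enumerate(records):
        target_bin = int(rec[target] // bin_width)
        key = tuple(rec[k] for k in cell_keys) + (target_bin,)
        cells[key].append(i)

    # Copy the records, then overwrite the target with a donor's value.
    perturbed = [dict(rec) for rec in records]
    for members in cells.values():
        for i in members:
            donor = rng.choice(members)  # the donor may be the record itself
            perturbed[i][target] = records[donor][target]
    return perturbed


# Hypothetical toy data, for illustration only.
records = [
    {"income": 21000, "weight_class": 1, "pred_class": 0, "locality": "A"},
    {"income": 27500, "weight_class": 1, "pred_class": 0, "locality": "A"},
    {"income": 24000, "weight_class": 1, "pred_class": 0, "locality": "A"},
    {"income": 52000, "weight_class": 2, "pred_class": 1, "locality": "B"},
    {"income": 58000, "weight_class": 2, "pred_class": 1, "locality": "B"},
]
out = constrained_hot_deck(
    records, "income", ["weight_class", "pred_class", "locality"], bin_width=10000
)
```

Because a donor always comes from the same target bin as the recipient, the original and perturbed values in this sketch differ by less than `bin_width`. Widening the bins allows larger changes (more protection, less utility), and narrowing them does the opposite.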