Abstract:
|
The U.S. Census Bureau has historically used nearest neighbor or hot deck imputation to handle missing data for many types of establishment data. Measures such as payroll and revenue tend to be highly skewed, and these methods remove the need to specify a parametric model for the values being imputed. Recently these methods have been used for multiple imputation (MI), enabling variance estimation via Rubin’s combining rules. The Approximate Bayesian Bootstrap (ABB) is a simple-to-implement algorithm that can be used to make hot deck methods “proper” for MI. With ABB, responding units are bootstrapped before donor selection so that the set of possible donors for a nonrespondent varies across imputed datasets. In principle, ABB should work for nearest neighbor MI: bootstrapping respondents means that each nonrespondent’s single “nearest” donor will not be available for every imputation. However, we show that ABB with nearest neighbor imputation does not yield a proper MI method and leads to variance underestimation. We illustrate the problem via simulation and with Economic Census data, and we provide guidance on alternative nearest neighbor MI methods that may be used to overcome this problem.
|
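The mechanism described in the abstract can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' implementation: the function names `abb_nearest_neighbor_impute` and `rubin_combine` are hypothetical, and the sketch assumes a single auxiliary variable with absolute difference as the distance. It shows the ABB step (resampling respondents with replacement), donor selection from the resampled pool, and the standard combining rule in which the total variance is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance. As the abstract notes, this particular combination is the one shown to underestimate variance, so the sketch illustrates the procedure rather than recommending it.

```python
import random
import statistics


def abb_nearest_neighbor_impute(x_resp, y_resp, x_miss, m=5, seed=0):
    """Return m lists of imputed y values for the nonrespondents.

    Assumption for illustration: one auxiliary variable x, observed for
    everyone, with absolute difference as the distance.
    """
    rng = random.Random(seed)
    n = len(x_resp)
    imputations = []
    for _ in range(m):
        # ABB step: bootstrap the respondents to form this imputation's donor pool.
        pool = [rng.randrange(n) for _ in range(n)]
        imputed = []
        for xm in x_miss:
            # Donor selection: nearest respondent within the bootstrapped pool.
            donor = min(pool, key=lambda i: abs(x_resp[i] - xm))
            imputed.append(y_resp[donor])
        imputations.append(imputed)
    return imputations


def rubin_combine(estimates, variances):
    """Combine m point estimates and their within-imputation variances."""
    m = len(estimates)
    q_bar = statistics.fmean(estimates)   # combined point estimate
    u_bar = statistics.fmean(variances)   # average within-imputation variance
    b = statistics.variance(estimates)    # between-imputation variance (divisor m - 1)
    total_variance = u_bar + (1 + 1 / m) * b
    return q_bar, total_variance
```

Because the nearest donor is drawn from a bootstrap sample that retains most respondents, the donor chosen for a given nonrespondent often changes little from one imputation to the next, which is consistent with the small between-imputation variance the abstract warns about.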