All Times EDT

Abstract Details

Activity Number: 25 - Modern Techniques in Handling Missing Data with Challenging Data Structures Including Big and Small Data Files
Type: Topic Contributed
Date/Time: Monday, August 3, 2020 : 10:00 AM to 11:50 AM
Sponsor: Survey Research Methods Section
Abstract #312233
Title: Nearest Neighbor Multiple Imputation: Problems and Potential Solutions
Author(s): Rebecca Andridge* and Katherine Jenny Thompson
Companies: Ohio State University and US Census Bureau
Keywords: nearest neighbor; multiple imputation; hot deck; approximate Bayesian bootstrap
Abstract:

The U.S. Census Bureau has historically used nearest-neighbor or hot deck imputation to handle missing data for many types of establishment data. Measures such as payroll and revenue tend to be highly skewed, and these methods avoid the need to specify a parametric model for the imputed values. Recently these methods have been used for multiple imputation (MI), enabling variance estimation via Rubin’s combining rules. The Approximate Bayesian Bootstrap (ABB) is a simple-to-implement algorithm that can be used to make hot deck methods “proper” for MI. With ABB, responding units are bootstrapped before donor selection so that the set of possible donors for a nonrespondent varies across imputed datasets. In concept, ABB should work for nearest neighbor MI: because respondents are bootstrapped, each nonrespondent’s single “nearest” donor will not be available for every imputation. However, we show that ABB with nearest neighbor does not create a proper MI method, leading to variance underestimation. We illustrate the problem via simulation and with Economic Census data and provide guidance on alternative nearest neighbor MI methods that may be used to overcome this problem.
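
The mechanics described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: it assumes a single auxiliary variable x observed for every unit, absolute difference as the distance, and ties broken by taking the first match. The function names abb_nn_impute and rubin_combine are hypothetical.

```python
import numpy as np


def abb_nn_impute(x, y, n_imputations=5, seed=0):
    """Return a list of completed copies of y (NaN marks a nonrespondent).

    Assumption: one auxiliary variable x, absolute distance for matching.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    missing = np.isnan(y)
    respondents = np.flatnonzero(~missing)

    completed = []
    for _ in range(n_imputations):
        # ABB step: bootstrap the respondents to form this imputation's donor pool.
        pool = rng.choice(respondents, size=respondents.size, replace=True)
        y_imp = y.copy()
        for i in np.flatnonzero(missing):
            # Nearest-neighbor step: closest donor in the bootstrapped pool
            # (np.argmin returns the first minimum, so ties go to the first match).
            donor = pool[np.argmin(np.abs(x[pool] - x[i]))]
            y_imp[i] = y[donor]
        completed.append(y_imp)
    return completed


def rubin_combine(estimates, variances):
    """Combine M >= 2 point estimates and their variances with Rubin's rules."""
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = q.size
    q_bar = q.mean()                    # pooled point estimate
    within = u.mean()                   # average within-imputation variance
    between = q.var(ddof=1)             # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, total


if __name__ == "__main__":
    # Made-up toy data, for illustration only.
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    y = np.array([10.0, np.nan, 31.0, 39.0, np.nan, 62.0])
    datasets = abb_nn_impute(x, y, n_imputations=5, seed=1)
    means = [d.mean() for d in datasets]
    # Simple-random-sample variance of the mean, used here as a placeholder.
    variances = [d.var(ddof=1) / d.size for d in datasets]
    print(rubin_combine(means, variances))
```

The sketch only shows how the bootstrapped donor pool and the combining rules fit together; it does not by itself demonstrate the variance underestimation reported in the abstract.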


Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program