Online Program

All Times ET

Program is Subject to Change

Tuesday, June 15
Tue, Jun 15, 9:30 AM - 11:00 AM
Much Ado About Nothing: The Problem of Missing Data (And Some Ways to Handle It)

Variance estimation under nearest neighbor hot deck imputation for multinomial data: Introducing the Case Study (309744)

*Katherine Jenny Thompson, U.S. Census Bureau 

Keywords: nearest neighbor imputation, balance complex, variance estimation

Sample surveys are often designed to estimate totals (e.g., revenue). However, many surveys also request sets of compositional variables (details) that sum to a total. The detail proportions can vary by sample unit, and their multinomial distributions may be related to different predictors than the total is. The primary missing data challenge is the treatment of the details, which are characterized by high missingness rates, legitimate zeros, and few alternative data sources outside the survey available for imputation. Nearest neighbor ratio imputation (NNRI) is an appealing missing data treatment if the set of details is correlated with unit size. NNRI uses auxiliary variable(s) available for both donors and recipients to identify the “nearest” donor. Variance estimation from NNRI data is not straightforward, in part because the donor selection procedure is deterministic. This session applies a single imputation approach and a multiple imputation approach to the same NNRI variance estimation problem, motivated by an application to the Service Annual Survey (SAS). This presentation introduces the survey, presents background on its design, and then describes the specific imputation problem and procedure.
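
The donor-matching idea described in the abstract can be illustrated with a minimal sketch. It is not the procedure used for the SAS. It assumes that the reported total is the only auxiliary variable, that distance is the absolute difference in totals, that all totals are positive, and that the recipient's details are imputed as the donor's detail shares times the recipient's own total. The function name `nnri_impute` and the toy data are hypothetical.

```python
import numpy as np

def nnri_impute(totals, details, missing):
    """Illustrative nearest neighbor ratio imputation (assumed form).

    totals  : (n,) reported totals, used here as the sole auxiliary variable
    details : (n, k) detail items; rows flagged in `missing` are ignored
    missing : (n,) boolean, True where the details must be imputed
    Assumes every total is positive and at least one donor exists.
    """
    totals = np.asarray(totals, dtype=float)
    details = np.asarray(details, dtype=float)
    missing = np.asarray(missing, dtype=bool)

    donors = np.flatnonzero(~missing)
    imputed = details.copy()
    donor_of = {}
    for r in np.flatnonzero(missing):
        # Deterministic choice: the donor whose total is closest to the
        # recipient's total (ties go to the first such donor).
        d = donors[np.argmin(np.abs(totals[donors] - totals[r]))]
        shares = details[d] / totals[d]   # donor's detail proportions
        imputed[r] = shares * totals[r]   # rescale to the recipient's total
        donor_of[int(r)] = int(d)
    return imputed, donor_of

# Toy data (hypothetical): unit 2 reports a total but no details.
totals = [100.0, 200.0, 110.0, 400.0]
details = [[60.0, 40.0, 0.0],
           [50.0, 100.0, 50.0],
           [np.nan, np.nan, np.nan],
           [100.0, 100.0, 200.0]]
missing = [False, False, True, False]

imputed, donor_of = nnri_impute(totals, details, missing)
print(donor_of)
print(imputed[2])
```

In this toy case the nearest donor to unit 2 is unit 0, so the imputed details inherit that donor's shares, including its legitimate zero, and still sum to the recipient's total. Because the same donor is selected every time the procedure is run, the imputation carries no random component of its own, which is one reason the abstract describes variance estimation as not straightforward.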