Online Program

Gaussian-based routines for imputing categorical variables in complex designs

*Recai M Yucel, State University of New York at Albany 

Keywords: nominal data imputation, missing data, multiple imputation, missing data software, rounding

Two common problems in health surveys and administrative databases used to inform health policy are the complexity of the design or data structure and missing data. This work modifies widely used inferential tools so that inferences via multiple imputation (MI) can be applied in such settings. The underlying methodology is widely accepted and is based on computational techniques that sample imputations from a proposed imputation model with Gaussian errors. These models are flexible enough to account for complexities due to clustering. This work proposes rounding rules to be used with these existing multivariate normal (MVN) imputation methods, allowing practitioners to obtain usable imputations with small biases. These rules are calibrated in the sense that values re-imputed for observed data have distributions similar to those of the observed data. The methodology is demonstrated using sample data from the New York Cancer Registry database.
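The calibration idea can be illustrated with a minimal sketch for a single binary variable. This is not the author's software: the function name `calibrated_round`, the single-covariate linear model, and the simulated data are all assumptions made for illustration, and the parameter draws and clustering that a full MI procedure would need are omitted. The cutoff is chosen so that the observed cases, once re-imputed and rounded, reproduce the observed proportion of ones.

```python
import numpy as np

def calibrated_round(y, x, rng):
    """Impute a 0/1 variable under a Gaussian model, then round with a
    calibrated cutoff. Hypothetical helper, for illustration only.

    y : float array of 0/1 values, np.nan where missing
    x : fully observed covariate (assumed; a real model may use many)
    """
    obs = ~np.isnan(y)
    X = np.column_stack([np.ones_like(x), x])

    # Fit a linear model with Gaussian errors to the observed cases.
    beta, *_ = np.linalg.lstsq(X[obs], y[obs], rcond=None)
    resid = y[obs] - X[obs] @ beta
    sigma = resid.std(ddof=2)

    # Draw continuous values for every case, observed ones included.
    # Simplification: parameters are held at their estimates rather
    # than drawn from a posterior.
    draws = X @ beta + rng.normal(0.0, sigma, size=len(y))

    # Calibrate: pick the cutoff at which the re-imputed observed cases
    # have about the same proportion of ones as the observed data.
    p_obs = y[obs].mean()
    cutoff = np.quantile(draws[obs], 1.0 - p_obs)

    completed = y.copy()
    completed[~obs] = (draws[~obs] > cutoff).astype(float)
    return completed, cutoff

# Simulated data (not from any registry): about 30% of y set to missing.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = (x + rng.normal(size=n) > 0).astype(float)
y[rng.random(n) < 0.3] = np.nan

completed, cutoff = calibrated_round(y, x, rng)
```

Rounding at a fixed 0.5 is the naive alternative; the calibrated cutoff instead adapts to the distribution of the imputed values, which is the property the abstract describes.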