A Semi-Parametric Approach to Account for Complex Designs in Multiple Imputation
*Hanzhi Zhou, University of Michigan 

Keywords: complex sampling design, multiple imputation, nonparametric approach, synthetic data, combining rule

As multiple imputation (MI) becomes one of the leading approaches for dealing with missing data in public health survey research, existing software packages and procedures remain limited in their ability to handle complex sampling designs adequately. Researchers have demonstrated that implementing MI under a simple random sampling (SRS) assumption can cause severe bias in estimation and hence invalid inferences, especially when the design features are highly related to the survey variables of interest. The current literature centers on a purely model-based method that directly models the complex design features in the formulation of the imputation model. In this paper, I propose a semi-parametric procedure as an alternative approach to incorporating complex sampling designs in MI. Specifically, the imputation process is divided into two stages. In the first stage, the complex features of the survey design are fully accounted for by applying a nonparametric method to generate a series of synthetic datasets. In the second stage, conventional parametric MI is performed for the missing data using readily available imputation software designed for SRS samples. A new combining rule for the point and variance estimates is derived accordingly to make valid inferences based on the two-stage procedure. Using real health survey data, I evaluate the proposed method in a simulation study and compare it with the model-based method with respect to complete-data analysis. Results show that the proposed method has better confidence interval coverage and is more efficient than the model-based method.
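To make the control flow of the two-stage procedure concrete, the following is a minimal sketch and not the method developed in the paper. It rests on several assumptions that do not come from the abstract: a weighted bootstrap stands in for the unspecified nonparametric synthetic-data step, a simple stochastic regression imputation stands in for the parametric MI software, and the names `synthesize`, `impute_once`, `two_stage`, `L`, and `M` are hypothetical. The new combining rule is not reproduced, so the sketch stops at the nested set of estimates that such a rule would take as input.

```python
import numpy as np


def synthesize(data, weights, n_out, rng):
    """Stage 1 stand-in (assumption): weighted bootstrap.

    Rows are drawn with probability proportional to their survey weights,
    so the resulting dataset can be treated roughly like an SRS sample.
    """
    p = weights / weights.sum()
    idx = rng.choice(len(data), size=n_out, replace=True, p=p)
    return data[idx]


def impute_once(data, rng):
    """Stage 2 stand-in (assumption): stochastic regression imputation.

    Column 0 is a fully observed covariate x; column 1 is y with NaN for
    missing values. This simplified version does not draw the regression
    parameters from a posterior, unlike proper MI software.
    """
    x, y = data[:, 0], data[:, 1]
    obs = ~np.isnan(y)
    X = np.column_stack([np.ones(obs.sum()), x[obs]])
    beta, *_ = np.linalg.lstsq(X, y[obs], rcond=None)
    resid = y[obs] - X @ beta
    sigma = np.sqrt(resid @ resid / (obs.sum() - 2))
    filled = y.copy()
    miss = ~obs
    filled[miss] = beta[0] + beta[1] * x[miss] + rng.normal(0.0, sigma, miss.sum())
    return np.column_stack([x, filled])


def two_stage(data, weights, L=5, M=3, seed=0):
    """Return an L-by-M array of estimated means of y.

    L synthetic datasets are generated in stage 1, and each one is imputed
    M times in stage 2. Combining these estimates is left to the caller.
    """
    rng = np.random.default_rng(seed)
    means = np.empty((L, M))
    for l in range(L):
        synth = synthesize(data, weights, len(data), rng)
        for m in range(M):
            means[l, m] = impute_once(synth, rng)[:, 1].mean()
    return means
```

Calling `two_stage(data, weights)` on a two-column array with `NaN` marking missing outcomes yields one point estimate per (synthetic dataset, imputation) pair. Keeping the design adjustment entirely inside `synthesize` is what allows the second stage to use tools that assume SRS.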