Keywords: propensity scores, missing data, continuous exposures
Propensity score models are frequently used to estimate causal effects in observational studies. One unresolved issue in fitting these models is how to handle missing values in covariates. Because these models usually contain a large set of covariates, restricting the analysis to complete cases substantially reduces the sample size and statistical power. Several imputation approaches have been proposed for missing data, including multiple imputation (MI), MI with missingness pattern (MIMP), treatment mean imputation, and single mean imputation. Generalized Boosted Modeling (GBM), a nonparametric approach to deriving propensity scores, can incorporate missing values directly in the model, but its performance in the presence of missing data is unknown. Although the performance of MI, MIMP, and treatment mean imputation has previously been compared, these approaches have not been compared with single imputation or GBM. We conducted a simulation study to compare these five approaches, and the results indicate that single imputation is sufficient to obtain unbiased causal effect estimates. We discuss the implications of these results for applying propensity score models when covariate data are missing.
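As a rough illustration of how two of the simpler strategies named above differ, the following minimal sketch contrasts single mean imputation with treatment mean imputation on a toy covariate. It is not the simulation code used in the study: the function names, the toy data, and the binary treatment indicator are all hypothetical, and treatment mean imputation is assumed here to mean filling each missing value with the observed mean in the same treatment group.

```python
import math
from statistics import mean


def single_mean_impute(values):
    """Replace each missing entry (NaN) with the mean of all observed values."""
    observed = [v for v in values if not math.isnan(v)]
    fill = mean(observed)
    return [fill if math.isnan(v) else v for v in values]


def treatment_mean_impute(values, treatment):
    """Replace each missing entry with the observed mean of its own treatment group.

    Assumption: a discrete treatment indicator, used only for illustration.
    """
    group_means = {}
    for t in set(treatment):
        observed = [
            v for v, g in zip(values, treatment)
            if g == t and not math.isnan(v)
        ]
        group_means[t] = mean(observed)
    return [
        group_means[g] if math.isnan(v) else v
        for v, g in zip(values, treatment)
    ]


# Hypothetical toy covariate with two missing values, and a binary treatment.
age = [50.0, math.nan, 60.0, 30.0, math.nan, 40.0]
treated = [1, 1, 1, 0, 0, 0]

single = single_mean_impute(age)
by_group = treatment_mean_impute(age, treated)
```

In this toy example the single mean fills both gaps with the overall observed mean, while the treatment mean fills each gap with the mean of its own group, so the two completed covariates differ only in the imputed entries. Either completed covariate could then be passed to whatever propensity score model is being fitted.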