Randomized clinical trials (RCTs) are the gold standard for estimating treatment effects but often have limited external validity. For example, although the Frequent Hemodialysis Network (FHN) Daily Trial demonstrated a positive finding, clinicians are skeptical about the generalizability of its findings to the typical end-stage renal disease population. Propensity score (PS)-based methods, such as inverse probability of selection weighting (IPSW), can be used to mitigate this issue. However, missing data in the covariates used for PS estimation can threaten the validity of such methods. Multiple imputation (MI) is a well-established and accessible method for handling missing data, but there is no consensus on best statistical practice for using MI in this context. We conducted an extensive simulation study to evaluate the properties of estimators under a variety of MI strategies that fall under two umbrellas (passive and active), coupled with two general strategies for applying IPSW (within and across). Using a real-world example of generalizing FHN findings to United States Renal Data System data, we illustrate considerable heterogeneity across methods and provide practical guidelines.