Keywords: generalizability, sample selection model, small sample, target population, simulation, weighting, machine learning
Generalizing a sample average treatment effect (SATE) to a target population requires fitting a model of sample selection, similar in spirit to the treatment assignment models used for propensity scores. When the experimental sample is small and the target population is large, the estimated sample selection model, and thus the estimated target average treatment effect (TATE), may be susceptible to bias from class imbalance and the small sample size. Various parametric and nonparametric methods, including logistic regression, generalized boosting, and Bayesian Additive Regression Trees (BART), are available to address these issues. We compare their performance in generalizing the SATE to the TATE in complex simulated data in terms of bias, mean squared error, and coverage of 95% confidence intervals. We vary the proportion of the population sampled, the total population size, and the complexity of the selection model.
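As a rough illustration of the weighting idea described above, the following Python sketch fits a logistic sample selection model on a simulated population and reweights the small experimental sample by the inverse of the estimated selection probability. The data-generating process, the function names (`fit_logistic`, `selection_prob`, `weighted_effect`), and the use of inverse-probability weights are assumptions made for illustration only; they are not taken from the study's simulation design.

```python
import numpy as np

def fit_logistic(X, s, n_iter=25, ridge=1e-6):
    """Fit a logistic regression by Newton-Raphson; intercept is the first coefficient."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        grad = Xd.T @ (s - p) - ridge * beta
        hess = (Xd * (p * (1.0 - p))[:, None]).T @ Xd + ridge * np.eye(Xd.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def selection_prob(X, beta):
    """Estimated probability of being selected into the experimental sample."""
    Xd = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-Xd @ beta))

def weighted_effect(y, t, w):
    """Weighted difference in mean outcomes between treated and control units."""
    treated, control = t == 1, t == 0
    return (np.average(y[treated], weights=w[treated])
            - np.average(y[control], weights=w[control]))

# Hypothetical population: one covariate drives both selection and effect size.
rng = np.random.default_rng(0)
N = 50_000
x = rng.normal(size=N)
s = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-4.0 + x))))  # rare selection (class imbalance)
t = rng.binomial(1, 0.5, size=N)                         # randomized treatment
tau = 1.0 + x                                            # heterogeneous treatment effect
y = x + t * tau + rng.normal(scale=0.5, size=N)
true_tate = tau.mean()

# Fit the selection model on the full population, then weight the sample only.
beta = fit_logistic(x.reshape(-1, 1), s)
in_sample = s == 1
p_hat = selection_prob(x[in_sample].reshape(-1, 1), beta)
weights = 1.0 / p_hat  # inverse probability of selection (an assumed weighting scheme)

sate_hat = weighted_effect(y[in_sample], t[in_sample], np.ones(in_sample.sum()))
tate_hat = weighted_effect(y[in_sample], t[in_sample], weights)
print(f"sample size: {in_sample.sum()}, true TATE: {true_tate:.3f}")
print(f"unweighted SATE estimate: {sate_hat:.3f}, weighted TATE estimate: {tate_hat:.3f}")
```

Because units with larger covariate values are more likely to be selected and also have larger effects in this toy setup, the unweighted estimate targets the sample rather than the population; the weights shift it toward the population average. A boosted or tree-based selection model could be substituted for `fit_logistic` without changing the rest of the sketch.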