Abstract:
|
To draw unbiased inferences from observational data, matching methods are often applied to produce treatment groups that are balanced on relevant background variables. Although many matching algorithms exist in the literature, most require a large control reservoir and cannot handle missing covariates. Random forest, which averages outcomes from many decision trees, is nonparametric in nature, can deal with missing data in the tree-building process, and can produce more accurate and less model-dependent estimates of propensity scores as well as a proximity matrix. In this study, iterative matching algorithms are developed to form balanced samples when sample sizes are limited in both groups. In addition, the issue of how to evaluate sample balance in the presence of missing data is investigated. The proposed methods are applied to two data sets, arising from studies of autism spectrum disorder (ASD) and student success.
|