Online Program

All Times ET

Friday, June 4
Computational Statistics
New Models and Methods
Fri, Jun 4, 1:20 PM - 2:55 PM
TBD
 

Machine Learning Assisted Complex Survey Weights (309826)

*Stanislav Kolenikov, Abt Associates 

Keywords: complex survey

We propose a workflow for creating complex survey weights, adjusted for nonresponse and calibrated, that uses machine learning (ML) methods (1) to support estimation of response propensities and (2) to augment the population control totals in the weight calibration step with outcome propensities. In a “traditional” workflow, the sampling statistician fits a logistic regression of the unit response indicator on the main effects of the frame/baseline variables, then calibrates the weights to the population characteristics and/or the frame/baseline demographic variables. We replace the nonresponse adjustment step with a more flexible estimation of response propensities using ML methods. Furthermore, we extend model-calibration ideas to nonlinear models: predictions from the ML models for the primary survey outcomes serve as synthetic variables whose control totals can be used for weight calibration. We find improvements both in the accuracy of the nonresponse adjustments and in calibration to the augmented variables. The process is demonstrated with a list sample of participants in a training program, where rich baseline data provide the input features for training the ML models.
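
The two weighting steps described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes simulated data, a simple random sample from a hypothetical frame, and linear (GREG-type) calibration as one possible calibration method. The arrays `p_hat` and `y_hat` stand in for the outputs of fitted ML models (response propensities and outcome predictions); no model is actually trained here, and all names are hypothetical. Because the design is a list sample, the sketch also assumes that `y_hat` can be computed for every frame unit, so its frame total is available as a control total.

```python
import numpy as np

def linear_calibrate(d, X, totals):
    """Linear calibration: w_i = d_i * (1 + x_i' lam), with lam chosen so
    that the weighted totals of the columns of X equal `totals`."""
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float)
    gap = np.asarray(totals, dtype=float) - X.T @ d   # shortfall in each total
    M = (X * d[:, None]).T @ X                        # sum_i d_i x_i x_i'
    lam = np.linalg.solve(M, gap)
    return d * (1.0 + X @ lam)                        # may be negative in general

rng = np.random.default_rng(0)

# Hypothetical frame with one baseline variable (simulated, not real data).
N, n = 2000, 400
age = rng.uniform(18, 65, N)

# Stand-ins for ML model outputs, assumed available for the whole frame:
# y_hat = predicted outcome propensity, p_hat = predicted response propensity.
y_hat = 1.0 / (1.0 + np.exp(-(0.002 * (age - 40) ** 2 - 0.5)))
p_resp = 1.0 / (1.0 + np.exp(-(age - 40) / 15))
p_hat = np.clip(p_resp, 0.05, 1.0)                    # guard against huge weights

# Simple random sample, then simulated unit nonresponse.
sample = rng.choice(N, n, replace=False)
responded = rng.random(n) < p_resp[sample]
resp = sample[responded]

# Step 1: nonresponse adjustment, base weight divided by estimated propensity.
d = (N / n) / p_hat[resp]

# Step 2: calibrate to frame totals, augmented with the synthetic variable y_hat.
X = np.column_stack([np.ones(resp.size), age[resp], y_hat[resp]])
totals = np.array([N, age.sum(), y_hat.sum()])
w = linear_calibrate(d, X, totals)
```

After calibration, the weighted respondent totals `X.T @ w` reproduce the frame size, the frame total of `age`, and the frame total of the synthetic variable `y_hat`. Raking or another distance function could replace the linear calibration step; the choice here is only for brevity.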