Abstract:
|
Responsive survey designs implement surveys in phases, where each phase is a separate protocol with its own cost and error structure. The goal is to design a series of phases such that nonresponse errors cancel each other across phases while the survey stays within a fixed budget. Some work has been done to identify when phases are complementary with respect to errors, but none has evaluated costs across phases. Without accurate cost estimates, resources may be allocated inefficiently across phases. In this presentation, we compare statistical and machine learning methods of predicting costs for alternative designs. The first modeling strategy uses multi-level models to predict the number of hours of interviewer time. The second uses a machine learning method, Bayesian Additive Regression Trees (BART). We evaluate the predictive accuracy of the models using data from a real survey. We find that BART is a useful approach for maximizing predictive accuracy, while the multi-level regression models offer an alternative whose results are relatively easy to interpret.
|