Abstract:
|
With the widespread availability of Big Data, concerns have been raised over finite population inference from large-scale non-probability samples. Existing adjustment methods rely heavily on correct specification of the underlying models. When a relevant benchmark survey is available, one might consider a doubly robust estimator of the desired population quantity to protect against model misspecification. This method combines propensity weighting with prediction modeling so that estimates are consistent if either model is correctly specified. To further weaken the modeling assumptions, we propose a modified augmented inverse propensity weighting method that allows for more flexible non-parametric prediction methods. In particular, we employ Bayesian additive regression trees, which not only capture non-linear associations automatically but also permit direct variance estimation through the posterior predictive draws. Using the 2017 National Household Travel Survey as a benchmark, we apply our method to the sensor-based naturalistic driving data from the second Strategic Highway Research Program.
|
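As an illustration of how a doubly robust estimator combines the two models, the following is a minimal sketch of a generic augmented inverse propensity weighting (AIPW) estimate of a population mean. It is not the modified estimator proposed in the paper. The function name `aipw_mean` and all argument names are hypothetical, and the sketch assumes that the propensities and the outcome predictions have already been estimated by some model (for example, Bayesian additive regression trees) and that the benchmark survey supplies design weights.

```python
import numpy as np


def aipw_mean(y_b, m_b, pi_b, m_r, d_r):
    """Generic AIPW estimate of a population mean (illustrative sketch).

    y_b  : observed outcomes in the non-probability sample
    m_b  : model-predicted outcomes for the non-probability sample
    pi_b : estimated propensities of inclusion in the non-probability sample
    m_r  : model-predicted outcomes for the benchmark survey units
    d_r  : design weights of the benchmark survey units
    """
    y_b, m_b, pi_b = (np.asarray(a, dtype=float) for a in (y_b, m_b, pi_b))
    m_r, d_r = np.asarray(m_r, dtype=float), np.asarray(d_r, dtype=float)

    # Inverse propensity weights for the non-probability sample.
    w_b = 1.0 / pi_b

    # Weighted mean of prediction residuals; it corrects the prediction
    # model's bias when the propensity model is right.
    residual_term = np.sum(w_b * (y_b - m_b)) / np.sum(w_b)

    # Design-weighted mean of predictions on the benchmark survey; it is
    # already consistent when the prediction model is right.
    prediction_term = np.sum(d_r * m_r) / np.sum(d_r)

    return residual_term + prediction_term
```

If the predictions reproduce the outcomes exactly, the residual term vanishes and the estimate depends only on the benchmark survey. If the predictions are all zero, the estimate reduces to an inverse-propensity-weighted mean of the non-probability sample. These two limiting cases show why the estimate stays consistent when either model is correctly specified.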