Abstract:
|
The steady decline of response rates in probability surveys, in parallel with the rapid emergence of unstructured data, has led to a growing interest in inference methods for non-probability samples. Existing adjustment methods rely heavily on models and ignore the complexity of the design of the benchmark sample. We apply two classes of doubly robust methods, augmented inverse propensity weighting and penalized spline propensity prediction, and compare them through a simulation study. To account for the complex design of the benchmark sample, we propose generating a synthetic population by “undoing” the sample design through a finite population Bayesian bootstrap method. To further protect against model misspecification, we employ Bayesian Additive Regression Trees, which not only capture non-linear associations and multi-way interactions automatically but also allow us to easily estimate the variance from their posterior predictive draws. Using the Crash Investigation Sampling System 2017 as the benchmark, we apply our proposed method to the non-probability sample from the Crash Injury Research and Engineering Network to make inferences about some crash-related injury outcomes.
|