Abstract:
|
We develop a unified framework for data integration and inference based on multilevel regression and poststratification (MRP). We demonstrate that MRP can handle the methodological and computational issues that arise with big data when combining probability and nonprobability surveys. The emergence of big data provides unprecedented resources for population-based studies that address policy-related questions. However, such data are often convenience or volunteer samples, a form of nonprobability-based selection, and may not be representative of the target population. Data integration and record linkage have become research priorities for most statistical agencies. The lack of theoretical foundations for new data collection methods presents challenges to traditional design-based approaches. As a promising solution with influential applications, MRP stabilizes small area estimation and incorporates the sample selection and response mechanisms into the model. MRP can predict outcome values for nonsampled units and propagate all sources of uncertainty. We use simulation studies to evaluate the frequentist properties and statistical validity of MRP in comparison with alternative methods.
|