Abstract:
|
Survey samplers often dislike model-based approaches to survey inference because of concerns about model misspecification. We suggest that the model-based paradigm can work very successfully in survey settings, provided the models chosen avoid strong parametric assumptions. The Horvitz-Thompson (HT) estimator is a simple design-unbiased estimator of the population total in probability sampling designs. From a modeling perspective, the HT estimator performs well when the ratios of the outcome values to the inclusion probabilities are exchangeable. When this assumption is not met, the HT estimator can be very inefficient. In previous work (Zheng and Little, 2001) we showed that nonparametric model-based estimators are, in general, more efficient than the HT estimator in probability proportional to size (PPS) samples. In this paper we show similar gains for two-stage sampling. We use a penalized spline (p-spline) based additive mixed model that fits a nonparametric relationship between the primary sampling unit (PSU) means and measures of PSU size, and uses random effects to model clustering effects. Simulation studies based on simulated and real data show that the model-based method is, in general, more efficient than the HT estimator.
|
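For readers unfamiliar with the HT estimator referred to in the abstract, the following is a minimal illustrative sketch rather than code from the paper. It computes the standard HT estimate of a population total, the sum over sampled units of y_i / pi_i. The function name `horvitz_thompson_total` and the toy numbers are assumptions introduced here for illustration only.

```python
from typing import Sequence


def horvitz_thompson_total(y: Sequence[float], pi: Sequence[float]) -> float:
    """Return the HT estimate of a population total.

    y  : outcome values for the sampled units
    pi : first-order inclusion probabilities of the same units
    (Hypothetical helper name; not taken from the paper.)
    """
    if len(y) != len(pi):
        raise ValueError("y and pi must have the same length")
    if any(not (0.0 < p <= 1.0) for p in pi):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    # Each sampled unit is weighted by the inverse of its inclusion probability.
    return sum(yi / p for yi, p in zip(y, pi))


if __name__ == "__main__":
    # Toy sample (made-up numbers, not data from the paper).
    y_sample = [1.0, 2.0, 3.0]
    pi_sample = [0.5, 0.25, 0.125]
    print(horvitz_thompson_total(y_sample, pi_sample))  # 2 + 8 + 24 = 34.0
```

The estimate is a sum of the ratios y_i / pi_i, which is why its efficiency depends on how those ratios behave across units: when they vary systematically with unit size, the weighted sum can be highly variable.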