Abstract:
|
The rising cost and nonresponse of traditional random-sample surveys, together with the rapid proliferation of web surveys and administrative data, call for a standard framework for inference from nonrandom samples. Approaches that rely on either a propensity score model or a predictive model of the outcome variable alone are overly sensitive to model assumptions. This paper proposes to: (a) supplement an initial nonrandom sample with a reference random sample in which the detailed target variables are missing but which contains core covariates shared with the nonrandom sample; (b) define imputation classes using both propensity and prediction scores, and impute the target variables from the nonrandom sample to the random sample; and (c) use a delete-a-group version of the adjusted jackknife variance estimator proposed by Rao and Shao (1992) for imputed data. Because the imputation classes are defined by both the propensity and the predictive models, the proposed framework is doubly robust against misspecification of either model. Reference samples, complete with imputed data and jackknife replication weights, can be released to end users as public use files that support any kind of inference. The proposed paradigm for inference from nonrandom samples may legitimize their use in official statistics.
|
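Step (b) of the abstract can be illustrated with a minimal sketch. It assumes that propensity and prediction scores have already been estimated for every unit, forms imputation classes by crossing quantile bins of the two scores, and fills each reference-sample unit with the class mean of the nonrandom-sample outcomes. The function names, the pooled quantile binning, the class-mean rule, and the overall-mean fallback for empty classes are illustrative assumptions, not the paper's procedure; the variance estimation of step (c) is not shown.

```python
from bisect import bisect_right
from collections import defaultdict
from statistics import mean


def quantile_cuts(values, n_bins):
    """Interior cut points that split the values into n_bins roughly equal groups."""
    s = sorted(values)
    return [s[(len(s) * k) // n_bins] for k in range(1, n_bins)]


def class_of(score, cuts):
    """Index of the bin that a score falls into, given sorted cut points."""
    return bisect_right(cuts, score)


def impute_by_class(donors, recipients, n_bins=2):
    """Impute an outcome for each recipient from donors in the same class.

    donors:     (propensity_score, prediction_score, y) from the nonrandom sample
    recipients: (propensity_score, prediction_score) from the reference sample
    Both scores are assumed to be estimated beforehand (hypothetical inputs).
    """
    # Cut points are computed on the pooled scores of both samples (an assumption).
    p_cuts = quantile_cuts([d[0] for d in donors] + [r[0] for r in recipients], n_bins)
    m_cuts = quantile_cuts([d[1] for d in donors] + [r[1] for r in recipients], n_bins)

    # Group donor outcomes by the cross-classification of the two score bins.
    by_class = defaultdict(list)
    for p, m, y in donors:
        by_class[(class_of(p, p_cuts), class_of(m, m_cuts))].append(y)

    # Fallback for classes with no donors: the overall donor mean (an assumption).
    overall = mean(d[2] for d in donors)

    imputed = []
    for p, m in recipients:
        ys = by_class.get((class_of(p, p_cuts), class_of(m, m_cuts)))
        imputed.append(mean(ys) if ys else overall)
    return imputed


donors = [(0.1, 1.0, 10.0), (0.2, 1.2, 12.0), (0.8, 5.0, 50.0), (0.9, 5.5, 54.0)]
recipients = [(0.15, 1.1), (0.85, 5.2)]
print(impute_by_class(donors, recipients))
```

Crossing the two sets of bins is what lets either score separate dissimilar units when the other model is poor; a random donor draw within each class could replace the class mean without changing the structure of the sketch.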