Abstract:
|
Two-phase, response-dependent sampling is often used in applications where some covariates are expensive to measure. In phase 1, the response and inexpensive covariates are measured on a sample; in phase 2, the expensive covariate is measured on a response-dependent sub-sample. In a missing data framework, the unobserved covariate values are missing at random (MAR). Conditional maximum likelihood (CML) is an attractive approach for MAR covariates because it avoids modeling the covariate distribution. Scott and Wild (2011) gave a semiparametric efficient estimator, referred to as the SW estimator, for regression with a binary response and categorical covariates. We consider a general regression model and show that an estimator of the same form as SW has identical efficiency to two empirical likelihood estimators, and that these estimators dominate the CML estimator. Thus the SW estimator is appealing in more general settings and avoids the sometimes difficult computation of empirical likelihood estimators.
|