Abstract:
|
Efficient sampling designs such as the nested case-control design reduce study costs and participant burden when the event of interest is rare and the covariate of interest is difficult or expensive to measure. By collecting full covariate information on all subjects who experience an event and on only a subsample of those who do not, the nested case-control sampling scheme yields an estimator that, when the model is correctly specified, can be orders of magnitude more efficient than one based on a simple random sample from the full cohort. In this presentation, we show that under model misspecification the estimand for the nested case-control design depends on the censoring distribution and on the number of controls selected, potentially leading to irreproducible results that hinge on logistical design decisions. We propose a new estimator that applies frequency weights to recover the full cohort estimator and reweights each subject's contribution to the partial likelihood using a covariate-dependent inverse probability of censoring estimate. Theoretical and empirical results are presented to illustrate the utility of the proposed estimator.
|