Abstract:
|
Randomized controlled trials (RCTs) are the gold standard for evaluating the causal effect of a treatment; however, they often have limited sample sizes and poor generalizability. On the other hand, observational data derived from large administrative databases have massive sample sizes and better generalizability, but they are prone to unmeasured confounding bias. It is thus of considerable interest to reconcile effect estimates obtained from randomized controlled trials and observational studies investigating the same intervention, potentially harvesting the best from both realms. In this paper, we theoretically characterize the potential efficiency gain of integrating observational data into the RCT-based analysis from a minimax viewpoint. For estimation, we derive the minimax rate of convergence for the mean squared error, and propose a fully adaptive anchored thresholding estimator that attains the optimal rate up to poly-log factors. For inference, we characterize the minimax rate for the length of confidence intervals and show that adaptation to unknown confounding bias is in general impossible. We corroborate our theoretical findings using simulations and a real example.
|