Abstract:
|
A/B tests are the standard tool for estimating the average treatment effect (ATE) in online controlled experiments (OCEs). Most OCE theory relies on the Stable Unit Treatment Value Assumption (SUTVA), which presumes that each user's response depends only on that user's assigned treatment, not on the treatments of others. This assumption is violated when users are subject to network interference. Standard methods for estimating the ATE typically ignore such interference and produce heavily biased results. In addition, unobserved user covariates that influence both user response and network structure also bias current ATE estimators, a fact that has so far been almost completely overlooked in the network A/B testing literature. In this paper, we demonstrate that such network-influential lurking variables can heavily bias popular network clustering-based methods, thereby making them unreliable. To address this problem, we propose a two-stage design and estimation technique called HODOR: Hold-Out Design for Online Randomized experiments. The proposed method not only outperforms existing techniques but also provides reliable estimation even when the underlying network is unknown or uncertain.
|