Abstract:
|
Online experimentation (also known as A/B testing) has become an integral part of software development. To incorporate user feedback in a timely manner and continuously improve products, many software companies have adopted a culture of agile deployment, which requires A/B tests to be conducted on limited sets of users over short time periods. While conceptually efficient and reasonable, this practice can jeopardize external validity, which is critical for accurate data-driven impact evaluation and decision making. To address this concern, we study external biases under various A/B testing scenarios, aiming to measure and correct them via jackknife resampling. In particular, we first show that the external bias can be attributed mainly to the limited duration of A/B tests, and then prove that our proposed jackknife estimator can efficiently correct the first-order external bias. We demonstrate the advantages of the proposed methodology on simulated and real-life examples.
|
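The abstract names jackknife resampling as the bias-correction device. As a rough illustration only, the sketch below shows the standard leave-one-out jackknife, which removes the first-order, O(1/n), bias term of a plug-in estimator; the paper's own estimator targets external bias from short test duration and is defined in the body, so the function name `jackknife_bias_corrected` and the variance example here are illustrative assumptions, not the authors' method.

```python
import numpy as np

def jackknife_bias_corrected(x, estimator):
    """Leave-one-out jackknife bias correction for a generic estimator.

    If E[theta_hat] = theta + b/n + O(1/n^2), the corrected estimate
    n * theta_hat - (n - 1) * mean(leave-one-out replicates)
    has bias of order O(1/n^2), i.e. the first-order term is removed.
    """
    x = np.asarray(x)
    n = len(x)
    theta_full = estimator(x)
    # Recompute the estimator on each subsample with one observation removed.
    loo = np.array([estimator(np.delete(x, i)) for i in range(n)])
    return n * theta_full - (n - 1) * loo.mean()

# Classic sanity check: the plug-in variance (np.var with ddof=0) is biased
# by -sigma^2/n; the jackknife correction recovers the unbiased estimate.
rng = np.random.default_rng(0)
sample = rng.normal(size=50)
print(jackknife_bias_corrected(sample, np.var))  # ~ np.var(sample, ddof=1)
```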