Abstract:
|
As a means to remain competitive, many tech companies use online controlled experiments (A/B tests) to improve their products, services, and customer experience. Companies such as Google, Netflix, Microsoft, and Facebook run tens of thousands of experiments each year, engaging millions of users. Such experiments are typically used to decide whether a particular product variant outperforms one or more alternatives. Relative to traditional design-of-experiments (DOE) applications, the cost of experimental units is much lower and data collection is far easier. This translates into enormous sample sizes that often produce statistically significant p-values, regardless of whether the difference between variants is practically meaningful. In this talk we propose the use of comparative probability metrics as an estimation-based alternative to traditional two-sample hypothesis testing. The proposed Bayesian methodology provides a flexible and intuitive means of quantifying the similarity or superiority of one variant relative to another, while accounting for the magnitude of a practically inconsequential difference. A methodology for sample size determination will also be discussed.
|