Poster
Strategic A/B testing via Maximum Probability-driven Two-armed Bandit
Yu Zhang · Shanshan Zhao · Bokui Wan · Jinjuan Wang · Xiaodong Yan
East Exhibition Hall A-B #E-1907
Detecting a minor average treatment effect is a major challenge in large-scale applications, where even minimal improvements can have a significant economic impact. Traditional methods, which rely on normal distribution-based or expanded statistics, often fail to identify such minor effects because they are not sensitive enough to small discrepancies. This work leverages a counterfactual outcome framework and proposes a maximum probability-driven two-armed bandit (TAB) process that weights the mean volatility statistic and controls the Type I error. Permutation methods further enhance robustness and efficacy. The established strategic central limit theorem (SCLT) demonstrates that our approach yields a more concentrated distribution under the null hypothesis and a less concentrated one under the alternative hypothesis, greatly improving statistical power. The experimental results indicate a significant improvement in A/B testing, highlighting the potential to reduce experimental costs while maintaining high statistical power.
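To make the permutation step concrete, the following is a minimal sketch of a generic two-sided permutation test on the difference in means between two arms. It is not the weighted mean volatility statistic or the TAB process described above. The function name `permutation_test`, its parameters, and the simulated data are all illustrative assumptions.

```python
import random
from statistics import fmean


def permutation_test(control, treatment, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the difference in means.

    Hypothetical helper for illustration only; the statistic used here
    (a plain difference in means) is an assumption, not the paper's statistic.
    """
    rng = random.Random(seed)
    observed = fmean(treatment) - fmean(control)
    pooled = list(control) + list(treatment)
    n_control = len(control)
    extreme = 0
    for _ in range(n_perm):
        # Under the null hypothesis the arm labels are exchangeable,
        # so reshuffling the pooled outcomes simulates the null distribution.
        rng.shuffle(pooled)
        diff = fmean(pooled[n_control:]) - fmean(pooled[:n_control])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the p-value strictly positive.
    return (extreme + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Simulated outcomes with a small true effect (assumed values).
    data_rng = random.Random(1)
    control = [data_rng.gauss(0.0, 1.0) for _ in range(500)]
    treatment = [data_rng.gauss(0.05, 1.0) for _ in range(500)]
    print(permutation_test(control, treatment))
```

Because the null distribution is built from the data itself, a test of this kind does not depend on a normal approximation, which is one reason permutation methods are often paired with non-standard statistics.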
In industry, comparing the effectiveness of two experimental strategies (for example, evaluating which of two advertising policies yields higher revenue) is both common and challenging. We propose a novel test statistic based on the two-armed bandit framework, rather than relying on the Central Limit Theorem. This statistic not only offers improved statistical power but also effectively controls the Type I error rate. Our work has important implications for enhancing the sensitivity of A/B tests, reducing their duration, and lowering experimental costs.
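As a rough illustration of why small effects drive up duration and cost, the sketch below computes the per-arm sample size that a conventional two-sided, two-sample z-test needs to detect a given mean difference. This is the textbook baseline calculation, not the proposed statistic. The function name `samples_per_arm` and the assumptions of equal arm sizes and a known common standard deviation are illustrative.

```python
import math
from statistics import NormalDist


def samples_per_arm(effect, sigma, alpha=0.05, power=0.8):
    """Per-arm sample size for a two-sided two-sample z-test.

    Hypothetical helper: assumes equal arm sizes and a known,
    common standard deviation `sigma` in both arms.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    return math.ceil(2 * ((z_alpha + z_power) * sigma / effect) ** 2)


if __name__ == "__main__":
    # Required sample size grows with the inverse square of the effect.
    for effect in (0.5, 0.1, 0.01):
        print(effect, samples_per_arm(effect, sigma=1.0))
```

Shrinking the effect by a factor of ten raises the required sample size by about a factor of one hundred, so any gain in sensitivity translates directly into shorter or cheaper experiments.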