Poster
Geometric Resampling in Nearly Linear Time for Follow-the-Perturbed-Leader with Best-of-Both-Worlds Guarantee in Bandit Problems
Botao Chen · Jongyeong Lee · Junya Honda
West Exhibition Hall B2-B3 #W-915
Imagine you're playing a game with several slot machines, each giving different rewards. Your goal is to win as much as possible, but you don't know which machine is the best at first, and you have limited chances to find out. You need a smart strategy to balance learning about the machines and earning rewards. This classic dilemma shows up in many real-world settings like online recommendations and clinical trials.In this paper, we propose a new method that helps computers make better decisions in such uncertain situations. It speeds up the learning process by reducing the need for repeated trial and error, while still making accurate choices. Our approach works well even when the environment changes over time and is much faster than existing methods, making it more practical for real-world use.