Poster
Leveraging Offline Data in Linear Latent Contextual Bandits
Chinmaya Kausik · Kevin Tan · Ambuj Tewari
West Exhibition Hall B2-B3 #W-1018
Many real-world systems, such as recommendation engines or clinical decision aids, learn better when they can combine past data with new interactions. But when the past data comes from a mix of different user types or conditions, standard learning methods can be misled. Our work addresses this by designing algorithms that handle these hidden differences. Specifically, we focus on settings where each user behaves differently, but the differences lie in a shared low-dimensional structure. First, we show how to use pre-collected offline data to uncover this shared structure, even when there are infinitely many user types. Then, we introduce two new learning algorithms that exploit this knowledge to improve decision-making with new users: one is provably optimal, and the other runs faster and is more practical. We test these methods on synthetic data and real movie recommendation data and show strong improvements. Finally, we prove that this framework of hidden differences, or latent structure, captures a broad class of reasonable memoryless models of decision-making, highlighting its generality for future applications.
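The two-phase pipeline described above, offline recovery of a shared low-dimensional subspace followed by online learning for a new user in that subspace, can be sketched as follows. This is a minimal illustration under assumed specifics, not the paper's actual algorithms: the per-user ridge regression, the top-k SVD for subspace recovery, the dimensions, and the LinUCB-style online rule are all choices made here for concreteness.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 10, 2            # ambient and latent dimensions (illustrative choices)
n_users, n_obs = 50, 200

# Ground truth: every user's reward vector lies in a shared k-dim subspace.
B_true, _ = np.linalg.qr(rng.standard_normal((d, k)))
thetas = B_true @ rng.standard_normal((k, n_users))   # one column per user

# Offline phase: ridge-regress each user's observed rewards, stack the
# per-user estimates, and take a top-k SVD to recover the shared subspace.
est = np.zeros((d, n_users))
for u in range(n_users):
    X = rng.standard_normal((n_obs, d))               # offline contexts
    y = X @ thetas[:, u] + 0.1 * rng.standard_normal(n_obs)
    est[:, u] = np.linalg.solve(X.T @ X + np.eye(d), X.T @ y)
U, _, _ = np.linalg.svd(est, full_matrices=False)
B_hat = U[:, :k]                                      # estimated subspace

# Subspace recovery error via projection distance (0 = perfect recovery).
err = np.linalg.norm(B_hat @ B_hat.T - B_true @ B_true.T, 2)

# Online phase: a LinUCB-style rule for a fresh user, run in the k-dim
# projected space instead of the full d-dim context space.
theta_new = B_true @ rng.standard_normal(k)
A, b = np.eye(k), np.zeros(k)
for t in range(300):
    arms = rng.standard_normal((5, d))        # 5 candidate contexts per round
    z = arms @ B_hat                          # project contexts to latent space
    w = np.linalg.solve(A, b)
    bonus = np.sqrt(np.einsum('ij,jk,ik->i', z, np.linalg.inv(A), z))
    a = int(np.argmax(z @ w + 0.5 * bonus))   # optimism in the latent space
    r = arms[a] @ theta_new + 0.1 * rng.standard_normal()
    A += np.outer(z[a], z[a])
    b += r * z[a]
```

The point of the sketch is the dimension reduction: the online learner estimates only k parameters rather than d, which is where the statistical savings from the offline data come from.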