Poster in Workshop: Scaling Up Intervention Models
Learning to Adapt: Self-Supervised Representations for Robust Contextual Bandits
Janos Horvath
Abstract:
We propose a self-supervised domain adaptation framework for contextual bandits that addresses both abrupt and gradual environment shifts. Our method pretrains a compact representation on unlabeled data, then integrates it into both classical (e.g., LinUCB, Thompson Sampling) and neural bandit algorithms. Empirically, our approach reduces regret and speeds adaptation across eight distinct domains, outperforming standard non-adaptive baselines and simpler autoencoder methods in final performance.
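The pretrain-then-plug-in pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify the self-supervised objective, so a linear PCA encoder fitted on unlabeled contexts stands in for it here, and the sketch does not model environment shifts. The names `pretrain_encoder`, `LinUCB`, `alpha`, `ridge`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def pretrain_encoder(unlabeled_contexts, k):
    """Fit a k-dimensional linear encoder on unlabeled contexts.

    Assumption: PCA is used as a simple stand-in for the (unspecified)
    self-supervised pretraining step.
    """
    mean = unlabeled_contexts.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(unlabeled_contexts - mean, full_matrices=False)
    components = vt[:k]
    return lambda x: components @ (x - mean)


class LinUCB:
    """Disjoint LinUCB: one ridge-regression model per arm."""

    def __init__(self, n_arms, dim, alpha=1.0, ridge=1.0):
        self.alpha = alpha  # width of the exploration bonus
        self.A = np.stack([ridge * np.eye(dim) for _ in range(n_arms)])
        self.b = np.zeros((n_arms, dim))

    def select(self, z):
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b  # ridge estimate of the arm's reward weights
            bonus = self.alpha * np.sqrt(z @ A_inv @ z)
            scores.append(theta @ z + bonus)
        return int(np.argmax(scores))

    def update(self, arm, z, reward):
        self.A[arm] += np.outer(z, z)
        self.b[arm] += reward * z


# Synthetic usage: 20-dimensional raw contexts, 5-dimensional representation.
rng = np.random.default_rng(0)
encode = pretrain_encoder(rng.normal(size=(500, 20)), k=5)
bandit = LinUCB(n_arms=3, dim=5)
true_theta = rng.normal(size=(3, 5))  # hypothetical per-arm reward weights

for _ in range(200):
    z = encode(rng.normal(size=20))  # bandit sees only the encoded context
    arm = bandit.select(z)
    reward = true_theta[arm] @ z + 0.1 * rng.normal()
    bandit.update(arm, z, reward)
```

The design point the sketch tries to capture is that the bandit operates on the compact representation rather than the raw context, so the per-arm matrices are 5×5 instead of 20×20. Any encoder with the same call signature, such as a trained neural network, could be swapped in for the PCA stand-in.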