Poster in Workshop: Scaling Up Intervention Models

Learning to Adapt: Self-Supervised Representations for Robust Contextual Bandits

Janos Horvath


Abstract:

We propose a self-supervised domain adaptation framework for contextual bandits that addresses both abrupt and gradual environment shifts. Our method pretrains a compact representation on unlabeled data, then integrates it into both classical bandit algorithms (e.g., LinUCB and Thompson Sampling) and neural bandit algorithms. Empirically, across eight distinct domains, our approach substantially reduces regret and speeds up adaptation, and it outperforms both standard non-adaptive baselines and simpler autoencoder-based methods in final performance.
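
The two-stage idea in the abstract (pretrain a compact representation on unlabeled contexts, then run a classical bandit on the encoded contexts) can be illustrated with a minimal sketch. This is not the authors' implementation: PCA is swapped in as a simple stand-in for the self-supervised encoder, the bandit is the standard disjoint LinUCB, and every name here (`fit_linear_encoder`, `LinUCB`, `run_demo`) as well as the synthetic environment is a hypothetical choice made for illustration.

```python
import numpy as np


def fit_linear_encoder(unlabeled, dim):
    """Stand-in for self-supervised pretraining: PCA fitted on unlabeled contexts."""
    mean = unlabeled.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(unlabeled - mean, full_matrices=False)
    components = vt[:dim]
    return lambda x: (x - mean) @ components.T


class LinUCB:
    """Disjoint LinUCB: one ridge-regression model per arm."""

    def __init__(self, n_arms, dim, alpha=1.0):
        self.alpha = alpha
        self.A = [np.eye(dim) for _ in range(n_arms)]    # regularized Gram matrices
        self.b = [np.zeros(dim) for _ in range(n_arms)]  # reward-weighted feature sums

    def select(self, z):
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b
            # Estimated reward plus an exploration bonus.
            scores.append(theta @ z + self.alpha * np.sqrt(z @ A_inv @ z))
        return int(np.argmax(scores))

    def update(self, arm, z, reward):
        self.A[arm] += np.outer(z, z)
        self.b[arm] += reward * z


def run_demo(seed=0, n_rounds=500, raw_dim=20, latent_dim=3, n_arms=4):
    """Synthetic environment (an assumption): raw contexts are a linear
    mixing of a low-dimensional latent that determines the rewards."""
    rng = np.random.default_rng(seed)
    mixing = rng.normal(size=(latent_dim, raw_dim))
    arm_weights = rng.normal(size=(n_arms, latent_dim))

    # Stage 1: fit the encoder on unlabeled raw contexts (no rewards used).
    unlabeled = rng.normal(size=(1000, latent_dim)) @ mixing
    encode = fit_linear_encoder(unlabeled, latent_dim)

    # Stage 2: run the bandit on encoded contexts.
    bandit = LinUCB(n_arms, latent_dim)
    total_regret = 0.0
    for _ in range(n_rounds):
        latent = rng.normal(size=latent_dim)
        z = encode(latent @ mixing)
        expected = arm_weights @ latent
        arm = bandit.select(z)
        reward = expected[arm] + 0.1 * rng.normal()
        bandit.update(arm, z, reward)
        total_regret += expected.max() - expected[arm]
    return total_regret
```

Calling `run_demo()` returns the cumulative regret of one synthetic run. The sketch covers only a stationary environment; handling the abrupt or gradual shifts mentioned in the abstract would require additional machinery that is not specified here.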