

Poster in Workshop: 1st Workshop on Foundation Models for Structured Data (FMSD)

Filter, Augment, Forecast: Online Data Selection for Robust Time Series Forecasting

Ege Onur Taga · Halil Alperen Gozeten · Kutay Tire · Rahul Dalvi · Reinhard Heckel · Samet Oymak


Abstract:

While significant effort has been devoted to developing deep learning architectures for time series forecasting, the role of data in the training pipeline remains relatively overlooked. In this work, we propose Filter, Augment, Forecast (FAF): an online data curation strategy based on (1) data selection to filter out low-quality (e.g., noisy) examples and (2) augmentation of the remaining high-quality data. We use reference model-based filtering inspired by reducible holdout loss selection (RHO-LOSS) from the language modeling literature. We identify limitations of RHO-LOSS under the domain shifts common in time series and introduce the adaptive RHO method (AdaRho), which improves performance by updating the reference model during training. We provide a theoretical analysis using random matrix theory, highlighting the impact of reference models and noise on data selection. FAF improves forecasting accuracy across diverse architectures without altering them, achieving a median MSE reduction of 5.6% and a median MAE reduction of 3.2% across nine datasets.
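
To make the reference model-based filtering mentioned in the abstract more concrete, here is a minimal sketch of the generic RHO-LOSS scoring rule: each candidate example is scored by its reducible loss (the loss under the model being trained minus the loss under a reference model), and only the highest-scoring examples are kept. This is not the authors' implementation. The function name `reducible_loss_select`, the `keep_fraction` parameter, and the toy loss values are assumptions made for illustration, and the adaptive reference-model update of AdaRho is not shown because the abstract does not specify it.

```python
import numpy as np


def reducible_loss_select(train_loss, ref_loss, keep_fraction=0.5):
    """Return sorted indices of the examples with the largest reducible loss.

    Hypothetical helper, not from the paper. Reducible loss is the per-example
    loss of the model being trained minus the per-example loss of a reference
    model, as in the generic RHO-LOSS rule.
    """
    train_loss = np.asarray(train_loss, dtype=float)
    ref_loss = np.asarray(ref_loss, dtype=float)
    if train_loss.ndim != 1 or train_loss.shape != ref_loss.shape:
        raise ValueError("train_loss and ref_loss must be 1-D and equal in length")
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must lie in (0, 1]")

    # An example that is hard for both models (e.g. noisy) gets a low score,
    # because its reference loss is also high.
    reducible = train_loss - ref_loss

    # Keep at least one example, then take the top-k scores.
    k = max(1, int(round(keep_fraction * reducible.size)))
    top_k = np.argsort(-reducible, kind="stable")[:k]
    return np.sort(top_k)


# Toy per-example losses for a candidate batch of four windows (made-up values).
current_model_loss = [0.9, 2.0, 0.3, 1.5]
reference_model_loss = [0.2, 1.9, 0.25, 0.4]
selected = reducible_loss_select(current_model_loss, reference_model_loss, 0.5)
```

In this toy batch, indices 0 and 3 are kept. Index 1 has the highest raw training loss but is dropped because the reference model also fits it poorly, which is the behavior one would want for a noisy example.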
