ICML Filter, Augment, Forecast: Online Data Selection for Robust Time Series Forecasting

Poster in Workshop: DataWorld: Unifying data curation frameworks across domains

Ege Onur Taga · Halil Alperen Gozeten · Kutay Tire · Rahul Dalvi · Reinhard Heckel · Samet Oymak

Keywords: [ regression analysis ] [ data augmentation ] [ time series forecasting ] [ data selection ]

Abstract:

While significant effort has been devoted to developing deep learning architectures for time series forecasting, the role of data in the training pipeline remains relatively overlooked. In this work, we propose Filter, Augment, Forecast (FAF): an online data curation strategy based on (1) data selection to filter out low-quality (e.g., noisy) examples and (2) augmentation of the remaining high-quality data. We use reference-model-based filtering inspired by reducible holdout loss selection (RHO-LOSS) from the language modeling literature. We identify limitations of RHO-LOSS under the domain shifts common in time series and introduce the adaptive RHO method (AdaRho), which improves performance by updating the reference model during training. We provide a theoretical analysis using random matrix theory, highlighting the impact of reference models and noise on data selection. FAF improves forecasting accuracy across diverse architectures without altering them, achieving median reductions of 5.6% in MSE and 3.2% in MAE across nine datasets.
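The filtering step can be illustrated with a minimal sketch. It assumes the usual RHO-LOSS scoring rule, in which each example in a batch is scored by its loss under the current model minus its loss under a reference model, and only the highest-scoring examples are kept for the gradient step. The function name `select_by_reducible_loss` and the `keep_fraction` parameter are hypothetical and chosen for illustration; the abstract does not specify how AdaRho updates the reference model, so that part is not shown.

```python
def select_by_reducible_loss(current_losses, reference_losses, keep_fraction=0.5):
    """Rank a batch by reducible loss and return the indices to keep.

    Hypothetical helper for illustration only, not the authors' implementation.
    """
    if len(current_losses) != len(reference_losses):
        raise ValueError("loss lists must have the same length")
    if not current_losses:
        return [], []

    # Reducible loss: how much worse the current model is than the reference.
    # Examples that are hard for BOTH models (likely noisy) score low.
    reducible = [c - r for c, r in zip(current_losses, reference_losses)]

    # Keep at least one example per batch.
    k = max(1, round(keep_fraction * len(reducible)))

    # Indices sorted from highest to lowest reducible loss.
    order = sorted(range(len(reducible)), key=lambda i: reducible[i], reverse=True)
    return order[:k], reducible


# Toy batch of per-example losses (made-up numbers).
current = [0.9, 2.0, 0.3, 1.5]     # loss under the model being trained
reference = [0.2, 1.9, 0.25, 0.4]  # loss under the reference model

selected, scores = select_by_reducible_loss(current, reference, keep_fraction=0.5)
print(selected)  # [3, 0]
```

In this toy batch, example 1 has the largest raw loss but is almost equally hard for the reference model, so its reducible loss is small and it is filtered out. That is the behavior the abstract attributes to reference-model-based selection on noisy examples.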
