Poster in Workshop: 1st Workshop on Foundation Models for Structured Data (FMSD)
Simulation-Pretrained Foundation Models for Domain-General Astronomical Time Series Tasks with Minimal Labeled Data
Rithwik Gupta · Daniel Muthukrishna
Astronomical time series analysis faces a critical limitation: the scarcity of labeled real data. We present a foundation model approach that leverages physics-informed simulations as pretraining data, significantly reducing the need for labeled examples. Our models, trained on simulated data from multiple telescopes, learn generalizable representations that transfer effectively to downstream tasks. Using classifier-based architectures enhanced with novel contrastive and adversarial objectives, we create domain-agnostic models that recognize similar astronomical phenomena across different instrumental contexts. These models demonstrate substantial performance improvements over previous methods on classification, redshift estimation, and anomaly detection tasks when fine-tuned with minimal real data. Remarkably, our models exhibit effective zero-shot transfer capabilities, achieving comparable performance on future telescope (LSST) simulations when trained solely on existing telescope (ZTF) data. Furthermore, they generalize to entirely different astronomical phenomena (Kepler periodic variables) despite being trained on transient events, demonstrating cross-domain capabilities similar to large language models. Our approach provides a practical solution for building robust time series foundation models when labeled data is scarce but domain knowledge can be encoded in simulations.
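The abstract describes the training setup only at a high level. Below is a minimal, illustrative sketch (not the authors' implementation) of how a classifier backbone could be combined with a supervised-contrastive objective and a domain-adversarial objective via a gradient-reversal layer, which is one common way to obtain domain-agnostic representations across instruments. All module names, dimensions, domain labels, and loss weights here are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negates (and scales) gradients on the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None


class DomainAgnosticClassifier(nn.Module):
    """Hypothetical encoder with a class head and an adversarial domain head."""
    def __init__(self, in_dim=128, emb_dim=64, n_classes=14, n_domains=2):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                     nn.Linear(256, emb_dim))
        self.class_head = nn.Linear(emb_dim, n_classes)    # transient classes
        self.domain_head = nn.Linear(emb_dim, n_domains)   # telescope / survey

    def forward(self, x, lam=1.0):
        z = self.encoder(x)
        class_logits = self.class_head(z)
        # Reversed gradients push the encoder toward domain-invariant features.
        domain_logits = self.domain_head(GradReverse.apply(z, lam))
        return z, class_logits, domain_logits


def supervised_contrastive_loss(z, labels, temperature=0.1):
    """Pull embeddings of the same astrophysical class together, across domains."""
    z = F.normalize(z, dim=1)
    sim = z @ z.T / temperature
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)).float()
    pos_mask.fill_diagonal_(0)
    logits_mask = torch.ones_like(pos_mask).fill_diagonal_(0)
    # Exclude self-similarity from the denominator (standard SupCon form).
    log_prob = sim - torch.log((torch.exp(sim) * logits_mask).sum(1, keepdim=True))
    pos_counts = pos_mask.sum(1).clamp(min=1)
    return -(pos_mask * log_prob).sum(1).div(pos_counts).mean()


# One hypothetical training step on a batch of simulated light-curve features.
model = DomainAgnosticClassifier()
x = torch.randn(32, 128)                 # encoded light curves (assumed representation)
y_class = torch.randint(0, 14, (32,))    # transient class labels from the simulations
y_domain = torch.randint(0, 2, (32,))    # e.g. 0 = ZTF-like sim, 1 = LSST-like sim

z, class_logits, domain_logits = model(x)
loss = (F.cross_entropy(class_logits, y_class)
        + 0.5 * supervised_contrastive_loss(z, y_class)     # weight is an assumption
        + 0.1 * F.cross_entropy(domain_logits, y_domain))   # adversarial term
loss.backward()
```

In a sketch like this, the contrastive term encourages the same phenomenon observed by different instruments to map to nearby embeddings, while the gradient-reversed domain head penalizes features that reveal which telescope produced the data; the specific objectives and weighting used in the paper may differ.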