Poster in Workshop: DataWorld: Unifying data curation frameworks across domains

Aquilon: Towards Building Multimodal Weather LLMs

Sumanth Varambally · Veeramakali Vignesh Manivannan · Yasaman Jafari · Luyu Han · Zachary Novack · Zhirui Xia · Salva Ruhling Cachay · Srikar Eranky · Brooks (Ruijia) Niu · Taylor Berg-Kirkpatrick · Duncan Watson-Parris · Yian Ma · Rose Yu

Keywords: [ Multimodal Weather ] [ Weather Foundation Models ] [ Scientific Question Answering ] [ Scientific Reasoning ]


Abstract:

Recent advancements in weather foundation models—pre-trained on vast amounts of structured numerical data—have set new standards in weather forecasting accuracy. However, their lack of language-based reasoning capabilities leaves a critical opportunity untapped for human-in-the-loop analysis systems. In contrast, large language models (LLMs) excel at understanding and generating text, but they struggle with high-dimensional weather inputs like meteorological datasets. In this work, we take a significant step towards bridging this gap by enabling multimodal LLMs to reason over complex weather data. We address two fundamental challenges: (1) the absence of large-scale, multitask, multimodal datasets for weather reasoning, and (2) the lack of methods for embedding multi-channel weather data into LLM-compatible representations. To tackle these, we introduce a scalable data generation pipeline that constructs diverse question-answer pairs across a wide spectrum of weather-related tasks, from basic lookups to advanced forecasting and extreme event detection. We also leverage pretrained weather foundation models to extract low-dimensional embeddings of weather fields, enabling their integration with LLMs. Our experiments reveal that multimodal weather reasoning is a challenging problem that current models only partially address—highlighting the need for more effective weather representations and richer training data to fully unlock the potential of LLMs in meteorological applications.
