Poster in Workshop: DataWorld: Unifying data curation frameworks across domains
Aquilon: Towards Building Multimodal Weather LLMs
Sumanth Varambally · Veeramakali Vignesh Manivannan · Yasaman Jafari · Luyu Han · Zachary Novack · Zhirui Xia · Salva Ruhling Cachay · Srikar Eranky · Brooks (Ruijia) Niu · Taylor Berg-Kirkpatrick · Duncan Watson-Parris · Yian Ma · Rose Yu
Keywords: [ Multimodal Weather ] [ Weather Foundation Models ] [ Scientific Question Answering ] [ Scientific Reasoning ]
Recent advances in weather foundation models, which are pre-trained on vast amounts of structured numerical data, have set new standards in weather forecasting accuracy. However, these models lack language-based reasoning capabilities, which leaves a critical opportunity untapped for human-in-the-loop analysis systems. In contrast, large language models (LLMs) excel at understanding and generating text but struggle with high-dimensional weather inputs such as meteorological datasets. In this work, we take a significant step towards bridging this gap by enabling multimodal LLMs to reason over complex weather data. We address two fundamental challenges: (1) the absence of large-scale, multitask, multimodal datasets for weather reasoning, and (2) the lack of methods for embedding multi-channel weather data into LLM-compatible representations. To tackle the first, we introduce a scalable data generation pipeline that constructs diverse question-answer pairs across a wide spectrum of weather-related tasks, from basic lookups to advanced forecasting and extreme event detection. To tackle the second, we leverage pretrained weather foundation models to extract low-dimensional embeddings of weather fields, enabling their integration with LLMs. Our experiments reveal that multimodal weather reasoning is a challenging problem that current models only partially address, highlighting the need for more effective weather representations and richer training data to fully unlock the potential of LLMs in meteorological applications.
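To make the idea of constructing question-answer pairs from weather fields more concrete, here is a minimal Python sketch of template-based generation for two simple tasks: a point lookup and a maximum-location query. It is not the authors' pipeline. The function names (`make_lookup_qa`, `make_max_qa`), the question templates, and the tiny hand-written temperature grid are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical toy field: 2 m temperature in kelvin on a 3 x 4 lat/lon grid.
# The values are made up for illustration, not taken from any real dataset.
LATS = [-30.0, 0.0, 30.0]
LONS = [0.0, 90.0, 180.0, 270.0]
FIELD = [
    [290.0, 291.5, 289.0, 288.2],
    [300.1, 301.7, 299.4, 298.0],
    [285.3, 284.0, 286.6, 283.9],
]


def make_lookup_qa(field, lats, lons, variable, units, rng):
    """Build one question-answer pair by reading a randomly chosen grid cell."""
    i = rng.randrange(len(lats))
    j = rng.randrange(len(lons))
    question = (
        f"What is the {variable} at latitude {lats[i]} "
        f"and longitude {lons[j]}?"
    )
    answer = f"{field[i][j]:.1f} {units}"
    return {"task": "lookup", "question": question, "answer": answer}


def make_max_qa(field, lats, lons, variable, units):
    """Build one question-answer pair about where the field is largest."""
    # Scan every cell and keep the (value, row, column) triple with the top value.
    value, i, j = max(
        (field[i][j], i, j)
        for i in range(len(lats))
        for j in range(len(lons))
    )
    question = f"Where is the {variable} highest, and what is its value?"
    answer = f"{value:.1f} {units} at latitude {lats[i]}, longitude {lons[j]}"
    return {"task": "max_location", "question": question, "answer": answer}


def generate_pairs(n_lookups, seed=0):
    """Generate a small mixed set of pairs from the toy field."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    pairs = [
        make_lookup_qa(FIELD, LATS, LONS, "2 m temperature", "K", rng)
        for _ in range(n_lookups)
    ]
    pairs.append(make_max_qa(FIELD, LATS, LONS, "2 m temperature", "K"))
    return pairs
```

Calling `generate_pairs(3)` returns four dictionaries, each with `task`, `question`, and `answer` keys. Because the answers are computed directly from the field, they can be checked automatically. Whether the actual pipeline works this way is an assumption of this sketch, not something the abstract states.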