Spotlight in Workshop: TerraBytes: Towards global datasets and models for Earth Observation
Galileo: Learning Global & Local Features of Many Remote Sensing Modalities
Gabriel Tseng · Anthony Fuller · Marlena Reil · Henry Herzog · Patrick Beukema · Favyen Bastani · James Green · Evan Shelhamer · Hannah Kerner · David Rolnick
Sat 19 Jul 9 a.m. PDT — 5:30 p.m. PDT
We introduce a highly multimodal transformer that analyzes many remote sensing modalities (multispectral optical, synthetic aperture radar, elevation maps, weather, pseudo-labels, and more) across space and time. These inputs are useful for diverse remote sensing tasks, such as crop mapping and flood detection. However, learning representations of remote sensing data is challenging; for example, objects of interest vary massively in scale, from small vessels (1-2 pixels and transient) to glaciers (thousands of pixels and persistent). We present a novel self-supervised learning algorithm that extracts multi-scale features through masked modeling. Our two-task approach consists of global and local training objectives that differ with respect to prediction targets (deep vs. shallow) and masking strategies (structured vs. not). With a single pretrained encoder, our Galileo model outperforms state-of-the-art models for satellite images and pixel time series, as extensively evaluated over eleven benchmarks spanning multiple task types.
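To make the "structured vs. not" distinction concrete, here is a minimal sketch of two masking strategies over a grid of space-time tokens. It is an illustration only, not the Galileo implementation: the function names (`random_mask`, `block_mask`, `count_masked`), the token grid layout (time, height, width), and the choice of a single square block held fixed across time are all assumptions made for this example. The abstract does not specify how the structured masks are actually built.

```python
import random


def random_mask(timesteps, height, width, ratio, rng):
    """Unstructured masking (assumed form): hide tokens uniformly at random."""
    total = timesteps * height * width
    num_masked = int(ratio * total)
    masked = set(rng.sample(range(total), num_masked))
    return [
        [[((t * height + r) * width + c) in masked for c in range(width)]
         for r in range(height)]
        for t in range(timesteps)
    ]


def block_mask(timesteps, height, width, block, rng):
    """Structured masking (assumed form): hide one contiguous square
    spatial block, repeated at every timestep."""
    top = rng.randrange(height - block + 1)
    left = rng.randrange(width - block + 1)
    return [
        [[top <= r < top + block and left <= c < left + block
          for c in range(width)]
         for r in range(height)]
        for _ in range(timesteps)
    ]


def count_masked(mask):
    """Number of hidden tokens in a (time, height, width) boolean mask."""
    return sum(cell for frame in mask for row in frame for cell in row)


if __name__ == "__main__":
    rng = random.Random(0)
    scattered = random_mask(4, 8, 8, ratio=0.5, rng=rng)
    contiguous = block_mask(4, 8, 8, block=4, rng=rng)
    print(count_masked(scattered), count_masked(contiguous))
```

In the random mask, hidden tokens are scattered, so a model can often fill them in from immediate neighbors. In the block mask, a whole region is hidden at every timestep, so the model has to rely on more distant context.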