Prithvi-EO-2.0-300M

Property	Value
Parameters	300 Million
Architecture	Modified ViT with 3D embeddings
Developer	IBM, NASA, and Jülich Supercomputing Centre
Paper	arXiv:2412.02732
Training Data	NASA's Harmonized Landsat and Sentinel-2 (HLS) V2 product (4.2M samples)

What is Prithvi-EO-2.0-300M?

Prithvi-EO-2.0-300M is a 300-million-parameter Earth observation foundation model for multi-temporal satellite imagery. Developed jointly by IBM, NASA, and Jülich Supercomputing Centre, it adds 3D spatiotemporal patch embeddings to a Vision Transformer and encodes geolocation and acquisition-date metadata alongside the imagery.

Implementation Details

The model is based on a modified Vision Transformer (ViT) architecture with two main changes: 3D patch embeddings for spatiotemporal processing and integrated geolocation awareness. It processes six spectral bands (Blue, Green, Red, Narrow NIR, SWIR 1, and SWIR 2) at 30 m spatial resolution.
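As a rough illustration of the six-band input, the sketch below maps the band names to the band codes used by the two HLS products (L30 from Landsat, S30 from Sentinel-2). The dictionary and the helper `band_codes_for` are hypothetical names introduced here, and the band order is assumed to follow the order in which the bands are listed above.

```python
# Band codes follow the HLS product naming (L30 = Landsat, S30 = Sentinel-2).
# The dictionary and helper names are illustrative, not part of any model API.
HLS_BAND_CODES = {
    "L30": {"Blue": "B02", "Green": "B03", "Red": "B04",
            "Narrow NIR": "B05", "SWIR 1": "B06", "SWIR 2": "B07"},
    "S30": {"Blue": "B02", "Green": "B03", "Red": "B04",
            "Narrow NIR": "B8A", "SWIR 1": "B11", "SWIR 2": "B12"},
}

# Assumed channel order: the order in which the bands are listed in the text.
MODEL_BAND_ORDER = ["Blue", "Green", "Red", "Narrow NIR", "SWIR 1", "SWIR 2"]


def band_codes_for(product):
    """Return the HLS band codes for one product, in the assumed model order."""
    if product not in HLS_BAND_CODES:
        raise ValueError(f"unknown HLS product: {product!r}")
    codes = HLS_BAND_CODES[product]
    return [codes[name] for name in MODEL_BAND_ORDER]


s30_codes = band_codes_for("S30")
l30_codes = band_codes_for("L30")
```

Selecting bands by name rather than by position helps avoid mixing up the Narrow NIR band, which has a different code in each product.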

3D convolutional layer for non-overlapping cube processing
Combined temporal and spatial positional encodings
Geolocation and acquisition date integration
Adaptive metadata handling with drop mechanism
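
To make the non-overlapping cube processing concrete, the following sketch counts how many tokens a 3D patch embedding produces for one input. The input size (4 time steps of 224 x 224 pixels) and the cube size (1, 16, 16) are assumptions chosen for illustration, and both function names are hypothetical.

```python
def count_tokens(num_frames, height, width, patch_size=(1, 16, 16)):
    """Number of non-overlapping (time, height, width) cubes in one input."""
    pt, ph, pw = patch_size
    if num_frames % pt or height % ph or width % pw:
        raise ValueError("input dimensions must be divisible by the patch size")
    return (num_frames // pt) * (height // ph) * (width // pw)


def values_per_cube(num_bands, patch_size=(1, 16, 16)):
    """Number of raw pixel values that one cube projects into a single token."""
    pt, ph, pw = patch_size
    return num_bands * pt * ph * pw


# Assumed example input: 6 bands, 4 time steps, 224 x 224 pixels.
tokens = count_tokens(num_frames=4, height=224, width=224)  # 4 * 14 * 14 = 784
values = values_per_cube(num_bands=6)                       # 6 * 1 * 16 * 16 = 1536
```

Under these assumptions the sequence length grows linearly with the number of time steps, which is why a temporal positional encoding is needed in addition to the spatial one.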

Core Capabilities

Processing multi-temporal satellite imagery
Handling missing temporal and location data
Transferring to downstream tasks at spatial resolutions other than the 30 m training data (0.1 m to 15 m)
Enabling downstream tasks like crop segmentation and landslide detection

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is that it processes the spatial and temporal dimensions of satellite imagery together while also incorporating geolocation data. Because metadata is randomly dropped during training, the model continues to work when temporal or location metadata is missing at inference time.
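
The drop mechanism can be sketched as follows. This is a minimal illustration under stated assumptions, not the model's actual implementation: the sine/cosine encoding, the zero vector used for dropped or missing metadata, and the names `sincos_encode` and `encode_location` are all hypothetical.

```python
import math
import random


def sincos_encode(value, dim, max_period=10000.0):
    """Encode one scalar as a dim-length sine/cosine vector (dim must be even)."""
    if dim % 2:
        raise ValueError("dim must be even")
    half = dim // 2
    freqs = [1.0 / (max_period ** (i / half)) for i in range(half)]
    return [math.sin(value * f) for f in freqs] + [math.cos(value * f) for f in freqs]


def encode_location(lat, lon, dim, drop_prob, rng, training=True):
    """Return a location encoding, or zeros if metadata is missing or dropped.

    Assumption: dropped metadata is replaced by a zero vector, so the model
    sees the same "no metadata" signal in training and at inference time.
    """
    if dim % 4:
        raise ValueError("dim must be divisible by 4")
    missing = lat is None or lon is None
    dropped = training and rng.random() < drop_prob
    if missing or dropped:
        return [0.0] * dim
    # Half of the vector encodes latitude, the other half longitude.
    return sincos_encode(lat, dim // 2) + sincos_encode(lon, dim // 2)


rng = random.Random(0)
with_metadata = encode_location(50.9, 6.4, dim=8, drop_prob=0.0, rng=rng)
without_metadata = encode_location(None, None, dim=8, drop_prob=0.0, rng=rng,
                                   training=False)
```

Because the model sometimes trains without metadata, passing `None` at inference time yields an input it has already learned to handle.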

Q: What are the recommended use cases?

The model is suited to Earth observation tasks such as crop segmentation, landslide detection, and carbon flux prediction. It is most useful for applications that require temporal analysis of satellite imagery.