Prithvi-EO-1.0-100M

Property	Value
Authors	IBM-NASA Geospatial Team
Architecture	Temporal Vision Transformer (ViT) with MAE
Input Format	Multi-temporal satellite imagery (B,C,T,H,W)
Paper	arXiv:2310.18660

What is Prithvi-EO-1.0-100M?

Prithvi-EO-1.0-100M is a temporal Vision Transformer model jointly developed by IBM and NASA for Earth observation tasks. It is pre-trained on Harmonized Landsat Sentinel-2 (HLS) data covering the contiguous United States, using a Masked AutoEncoder (MAE) learning strategy in which part of the input is hidden and the model learns to reconstruct it. Unlike models that operate on a single image, it processes both the spatial and temporal dimensions of satellite imagery.
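To make the MAE idea concrete, here is a minimal NumPy sketch of the random masking step: token indices are shuffled, a fraction is kept visible for the encoder, and the rest become reconstruction targets. The function name `random_mask`, the 0.75 mask ratio, and the token grid (3 time steps of 14 x 14 patches) are assumptions chosen for illustration, not values stated in this document or taken from the released code.

```python
import numpy as np

def random_mask(num_tokens: int, mask_ratio: float, rng: np.random.Generator):
    """Split token indices into visible and masked sets, MAE-style."""
    num_keep = int(num_tokens * (1.0 - mask_ratio))
    perm = rng.permutation(num_tokens)     # random order of all token indices
    keep_idx = np.sort(perm[:num_keep])    # tokens the encoder would see
    mask_idx = np.sort(perm[num_keep:])    # tokens the decoder would reconstruct
    return keep_idx, mask_idx

# Hypothetical token grid: 3 time steps x 14 x 14 spatial patches.
num_tokens = 3 * 14 * 14
keep_idx, mask_idx = random_mask(
    num_tokens, mask_ratio=0.75, rng=np.random.default_rng(0)
)
print(len(keep_idx), len(mask_idx))  # 147 441
```

During pre-training, a reconstruction loss would typically be computed only on the masked tokens, so the model cannot succeed by copying what it already sees.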

Implementation Details

The model uses a modified ViT architecture with 3D adaptations: spatial attention operates across patches, and temporal attention operates on each patch across time steps. It takes six spectral bands as input: Blue, Green, Red, Narrow NIR, SWIR 1, and SWIR 2, which together cover visible, near-infrared, and shortwave-infrared wavelengths.
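As an illustration of the (B,C,T,H,W) input layout, the sketch below stacks several six-band acquisitions of the same tile into one batch. The helper `stack_time_series`, the 224 x 224 tile size, and the random placeholder data are assumptions for the example; real inputs would be HLS reflectance values, and any normalization the model expects is not shown here.

```python
import numpy as np

# Band order as listed in the text above.
BANDS = ["Blue", "Green", "Red", "Narrow NIR", "SWIR 1", "SWIR 2"]

def stack_time_series(frames):
    """Stack T frames of shape (C, H, W) into one (1, C, T, H, W) batch."""
    for frame in frames:
        if frame.shape[0] != len(BANDS):
            raise ValueError(f"expected {len(BANDS)} bands, got {frame.shape[0]}")
    x = np.stack(frames, axis=1)   # (C, T, H, W): time axis placed after channels
    return x[np.newaxis, ...]      # (1, C, T, H, W): leading batch axis added

# Placeholder data: three acquisitions of a hypothetical 224 x 224 tile.
rng = np.random.default_rng(0)
frames = [rng.random((6, 224, 224), dtype=np.float32) for _ in range(3)]

batch = stack_time_series(frames)        # multi-temporal input, T = 3
single = stack_time_series(frames[:1])   # single-image input, T = 1
print(batch.shape, single.shape)
```

Keeping time as its own axis, rather than folding it into the channel axis, is what lets the model treat the same location at different dates as related tokens.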

3D patch embedding in place of the standard 2D patch embedding
3D positional embedding to encode both spatial and temporal position
Modified patchify and unpatchify operations for 3D data
Support for both multi-temporal and single-image inference
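The 3D patchify and unpatchify operations listed above can be sketched with plain reshapes and transposes. This is a minimal NumPy illustration, not the released implementation: the function names, the patch size `p = 16`, the temporal patch depth `t = 1`, and the flattening order inside each patch are assumptions.

```python
import numpy as np

def patchify_3d(x, t, p):
    """(B, C, T, H, W) -> (B, N, t*p*p*C), with N = (T/t) * (H/p) * (W/p)."""
    B, C, T, H, W = x.shape
    if T % t or H % p or W % p:
        raise ValueError("T, H, W must be divisible by the patch sizes")
    nt, nh, nw = T // t, H // p, W // p
    x = x.reshape(B, C, nt, t, nh, p, nw, p)
    x = x.transpose(0, 2, 4, 6, 3, 5, 7, 1)   # (B, nt, nh, nw, t, p, p, C)
    return x.reshape(B, nt * nh * nw, t * p * p * C)

def unpatchify_3d(patches, shape, t, p):
    """Inverse of patchify_3d: (B, N, t*p*p*C) -> (B, C, T, H, W)."""
    B, C, T, H, W = shape
    nt, nh, nw = T // t, H // p, W // p
    x = patches.reshape(B, nt, nh, nw, t, p, p, C)
    x = x.transpose(0, 7, 1, 4, 2, 5, 3, 6)   # (B, C, nt, t, nh, p, nw, p)
    return x.reshape(B, C, T, H, W)

# Small hypothetical input: batch of 2, six bands, 3 time steps, 32 x 32 pixels.
x = np.arange(2 * 6 * 3 * 32 * 32, dtype=np.float32).reshape(2, 6, 3, 32, 32)
patches = patchify_3d(x, t=1, p=16)
restored = unpatchify_3d(patches, x.shape, t=1, p=16)
print(patches.shape)                  # (2, 12, 1536)
print(np.array_equal(restored, x))    # True
```

With `T = 1` the same functions handle the single-image case, since a lone frame is just a sequence of length one.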

Core Capabilities

Time-series analysis of satellite imagery
Burn scar segmentation
Flood mapping and detection
Land cover classification
Multi-temporal crop classification

Frequently Asked Questions

Q: What makes this model unique?

The model processes temporal sequences of satellite imagery, whereas many remote sensing models operate on one image at a time. Because it sees several acquisitions of the same location, it can capture change over time, which makes it useful for monitoring environmental change and disasters.

Q: What are the recommended use cases?

The model is best suited to applications that require temporal analysis of Earth observation data, such as disaster monitoring, agricultural assessment, and land use change detection. It can be fine-tuned for specific tasks, and examples are provided for flood mapping and burn scar detection.