Prithvi-EO-1.0-100M
Property | Value |
---|---|
Authors | IBM-NASA Geospatial Team |
Architecture | Temporal Vision Transformer (ViT) with MAE |
Input Format | Multi-temporal satellite imagery (B,C,T,H,W) |
Paper | arXiv:2310.18660 |
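To make the (B,C,T,H,W) input layout concrete, the sketch below names each axis of an input shape and checks the channel count. It is illustrative only: the helper `describe_input`, the 224 x 224 tile size, the 3-frame sequence, and the idea that a single image is passed as T=1 are assumptions, not details confirmed by this page. The channel order simply follows the order in which the six bands are listed on this page.

```python
from typing import Dict, Tuple

# Assumed channel order: the order in which the six bands are listed on this page.
BANDS = ("Blue", "Green", "Red", "Narrow NIR", "SWIR 1", "SWIR 2")
AXES = ("B", "C", "T", "H", "W")  # batch, channels, time steps, height, width


def describe_input(shape: Tuple[int, ...]) -> Dict[str, int]:
    """Name the axes of a (B, C, T, H, W) shape and check the band count.

    Hypothetical helper for illustration; not part of any released API.
    """
    if len(shape) != len(AXES):
        raise ValueError(f"expected 5 axes {AXES}, got shape {shape}")
    named = dict(zip(AXES, shape))
    if named["C"] != len(BANDS):
        raise ValueError(f"expected {len(BANDS)} bands, got {named['C']}")
    return named


# Multi-temporal batch: 2 samples, 6 bands, 3 time steps, 224 x 224 pixels (assumed sizes).
multi = describe_input((2, 6, 3, 224, 224))
# Single-image case, assumed here to be a sequence of length T=1.
single = describe_input((1, 6, 1, 224, 224))
print(multi)
print(single)
```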
What is Prithvi-EO-1.0-100M?
Prithvi-EO-1.0-100M is a temporal Vision Transformer developed jointly by IBM and NASA for Earth observation tasks. It is pre-trained on Harmonised Landsat Sentinel-2 (HLS) data covering the contiguous United States, using a masked autoencoder (MAE) learning strategy. Its distinguishing capability is that it processes both the spatial and the temporal dimensions of satellite imagery.
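The MAE strategy hides a random subset of input tokens and trains the model to reconstruct them from the visible ones. The sketch below shows only the random-masking step, in plain Python. The function name `random_mask`, the 0.75 mask ratio, and the token count of 588 are assumptions chosen for illustration. They are not taken from this page.

```python
import random
from typing import List, Tuple


def random_mask(num_tokens: int, mask_ratio: float = 0.75,
                seed: int = 0) -> Tuple[List[int], List[int]]:
    """Split token indices into (visible, masked) sets, MAE-style.

    mask_ratio=0.75 is an assumed default, not a value stated by the source.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)          # seeded for reproducibility
    order = list(range(num_tokens))
    rng.shuffle(order)                 # random permutation of token indices
    num_masked = int(num_tokens * mask_ratio)
    masked = sorted(order[:num_masked])
    visible = sorted(order[num_masked:])
    return visible, masked


# Hypothetical sequence of 588 tokens.
visible, masked = random_mask(num_tokens=588)
print(len(visible), len(masked))
```

In an MAE setup, only the visible tokens are passed through the encoder. A lightweight decoder then predicts the pixel values of the masked tokens.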
Implementation Details
The model implements a modified ViT architecture with 3D adaptations: spatial attention across multiple patches and temporal attention for each patch. It takes six spectral bands as input: Blue, Green, Red, Narrow NIR, SWIR 1, and SWIR 2. Key modifications include:
- 3D patch embedding replacing traditional 2D approach
- 3D positional embedding for spatio-temporal understanding
- Modified patchify and unpatchify operations for 3D data
- Support for both multi-temporal and single-image inference
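As a rough illustration of what 3D patching implies for sequence length, the sketch below computes how many tokens a (C, T, H, W) sample yields and which token a given pixel falls into. The patch size of (1, 16, 16), the 3 x 224 x 224 input, the row-major token order, and the helper names `token_grid` and `token_index` are all assumptions for this example rather than details stated above.

```python
from typing import NamedTuple, Tuple


class TokenGrid(NamedTuple):
    grid: Tuple[int, int, int]  # number of patches along (T, H, W)
    num_tokens: int             # total tokens per sample
    patch_dim: int              # values flattened into one token before projection


def token_grid(channels: int, frames: int, height: int, width: int,
               patch: Tuple[int, int, int] = (1, 16, 16)) -> TokenGrid:
    """Token layout for a 3D patch embedding (assumed patch size)."""
    pt, ph, pw = patch
    if frames % pt or height % ph or width % pw:
        raise ValueError("input size must be divisible by the patch size")
    grid = (frames // pt, height // ph, width // pw)
    return TokenGrid(grid, grid[0] * grid[1] * grid[2], channels * pt * ph * pw)


def token_index(t: int, h: int, w: int, grid: Tuple[int, int, int],
                patch: Tuple[int, int, int] = (1, 16, 16)) -> int:
    """Flat index of the token containing pixel (t, h, w).

    Assumes row-major ordering over the (T, H, W) patch grid.
    """
    pt, ph, pw = patch
    _, gh, gw = grid
    return ((t // pt) * gh + (h // ph)) * gw + (w // pw)


# Hypothetical sample: 6 bands, 3 time steps, 224 x 224 pixels.
g = token_grid(channels=6, frames=3, height=224, width=224)
print(g)
print(token_index(1, 0, 0, g.grid))  # first token of the second time step
```

Under these assumptions, a single image (T=1) uses the same spatial grid with one third of the tokens, which is one way the same patching scheme can serve both multi-temporal and single-image inputs.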
Core Capabilities
- Time-series analysis of satellite imagery
- Burn scars segmentation
- Flood mapping and detection
- Land cover classification
- Multi-temporal crop classification
Frequently Asked Questions
Q: What makes this model unique?
The model processes temporal sequences of satellite imagery, whereas many remote sensing models treat each image independently. Because it can analyze change over time, it is well suited to monitoring environmental change and disasters.
Q: What are the recommended use cases?
The model is best suited to applications that require temporal analysis of Earth observation data, such as disaster monitoring, agricultural assessment, and land use change detection. It can be fine-tuned for specific tasks, using the provided flood mapping and burn scar detection examples as starting points.