sam2-hiera-large
Property | Value |
---|---|
Author | Facebook |
License | Apache 2.0 |
Paper | SAM 2: Segment Anything in Images and Videos |
Downloads | 684,530 |
What is sam2-hiera-large?
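sam2-hiera-large is the large checkpoint of Segment Anything Model 2 (SAM 2), Facebook's (Meta AI) foundation model for promptable visual segmentation in both images and videos. It extends the original Segment Anything Model (SAM) from still images to video, and uses a Hiera image encoder, which is where the "hiera" in the name comes from. Given a prompt such as a point or a box, it returns masks for the indicated object.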
Implementation Details
The model is used through two predictors: an image predictor for single-image mask generation and a video predictor that propagates masks through a frame sequence. Typical usage runs on a CUDA device with bfloat16 autocast, inside torch inference mode.
- Supports point-based and box-based prompting
- Propagates a prompted mask from one frame through the rest of the video
- Runs under torch inference mode, which disables gradient tracking during prediction
- Keeps an inference state per video that stores prompts and tracking results across frames
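The image workflow described above can be sketched roughly as follows. This is a minimal sketch, not the authors' reference code: it assumes the `sam2` package is installed, a CUDA device is available, and the checkpoint is published under the Hugging Face id `facebook/sam2-hiera-large`. The helper names `build_prompts` and `segment_image` are hypothetical and exist only for this illustration.

```python
def build_prompts(points=None, labels=None, box=None):
    """Validate prompts and return keyword arguments for the predictor.

    Hypothetical helper: points are (x, y) pixel pairs, labels are
    1 (foreground) or 0 (background), box is (x_min, y_min, x_max, y_max).
    """
    kwargs = {}
    if points is not None:
        if labels is None or len(points) != len(labels):
            raise ValueError("each point needs exactly one label")
        if any(len(p) != 2 for p in points):
            raise ValueError("points must be (x, y) pairs")
        kwargs["point_coords"] = [[float(x), float(y)] for x, y in points]
        kwargs["point_labels"] = [int(label) for label in labels]
    if box is not None:
        if len(box) != 4:
            raise ValueError("box must have four values")
        kwargs["box"] = [float(v) for v in box]
    if not kwargs:
        raise ValueError("provide at least one point or a box")
    return kwargs


def segment_image(image, points=None, labels=None, box=None):
    """Run one prompted prediction on an RGB image (NumPy array or PIL image)."""
    # Heavy imports are deferred so build_prompts works without torch or sam2.
    import numpy as np
    import torch
    from sam2.sam2_image_predictor import SAM2ImagePredictor

    prompts = {k: np.asarray(v) for k, v in build_prompts(points, labels, box).items()}
    # Assumed checkpoint id; adjust if the weights live elsewhere.
    predictor = SAM2ImagePredictor.from_pretrained("facebook/sam2-hiera-large")
    with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
        predictor.set_image(image)
        masks, scores, _ = predictor.predict(multimask_output=True, **prompts)
    return masks, scores
```

Calling `segment_image(image, points=[(500, 375)], labels=[1])` would request candidate masks for the object under that pixel, with one score per candidate. The coordinates here are placeholders.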
Core Capabilities
- Image segmentation with prompt-based mask generation
- Video object tracking and segmentation
- Mask propagation through video sequences, at real-time speed on suitable GPU hardware
- Support for multiple prompt types (points, boxes)
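Video tracking and propagation might look like the sketch below. It makes the same assumptions as any SAM 2 usage on GPU (the `sam2` package, a CUDA device, the Hugging Face id `facebook/sam2-hiera-large`) and additionally assumes a `sam2` version that provides `add_new_points_or_box` and a video given as a directory of JPEG frames. The function names `track_object` and `collect_masks` are hypothetical.

```python
def collect_masks(propagation):
    """Turn (frame_idx, obj_ids, mask_logits) tuples into nested dicts.

    Logits above 0.0 are treated as foreground, so each stored value is
    a boolean mask for one object on one frame.
    """
    results = {}
    for frame_idx, obj_ids, mask_logits in propagation:
        results[frame_idx] = {
            obj_id: mask_logits[i] > 0.0 for i, obj_id in enumerate(obj_ids)
        }
    return results


def track_object(video_path, frame_idx, obj_id, points, labels):
    """Prompt one object on one frame, then propagate it through the video."""
    # Heavy imports are deferred so collect_masks works without torch or sam2.
    import numpy as np
    import torch
    from sam2.sam2_video_predictor import SAM2VideoPredictor

    # Assumed checkpoint id; adjust if the weights live elsewhere.
    predictor = SAM2VideoPredictor.from_pretrained("facebook/sam2-hiera-large")
    with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
        # The state holds the frames, the prompts, and the tracking memory.
        state = predictor.init_state(video_path)
        predictor.add_new_points_or_box(
            state,
            frame_idx=frame_idx,
            obj_id=obj_id,
            points=np.asarray(points, dtype=np.float32),
            labels=np.asarray(labels, dtype=np.int32),
        )
        return collect_masks(predictor.propagate_in_video(state))
```

Keeping `collect_masks` separate from the predictor makes the bookkeeping easy to check with stand-in data. For long videos, holding every mask in memory may be too expensive, in which case each frame could be written out inside the loop instead.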
Frequently Asked Questions
Q: What makes this model unique?
The original SAM handles only still images. SAM 2 segments both images and videos with a single model, and its paper reports better accuracy and faster inference than SAM on image segmentation. For video, it keeps a state that carries prompts and earlier results forward, so a mask prompted on one frame can be propagated through the rest of the sequence.
Q: What are the recommended use cases?
The model suits applications that need object masks in images or videos, such as video editing, content creation, autonomous systems, and computer vision research. It fits best where objects are selected interactively with a point or box, or where a selected object must then be tracked automatically across frames.