sam2-hiera-large

Property	Value
Author	Facebook
License	Apache 2.0
Paper	SAM 2: Segment Anything in Images and Videos
Downloads	684,530

What is sam2-hiera-large?

sam2-hiera-large is the large checkpoint of SAM 2, Facebook's foundation model for promptable visual segmentation in images and videos, built on a Hiera image encoder. It is the successor to the original Segment Anything Model (SAM) and extends prompt-based mask generation from still images to video.

Implementation Details

The model supports both image and video prediction workflows, with a dedicated predictor for each. It can run with CUDA acceleration and bfloat16 precision. The image predictor generates masks for a single image, and the video predictor propagates masks through a video sequence.

Supports both point and box-based prompting
Implements efficient video propagation algorithms
Utilizes torch inference mode for optimal performance
Includes state management for video processing
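
The image workflow described above can be sketched as follows. This is a minimal sketch, not a definitive implementation: it assumes the `sam2` Python package is installed, that the checkpoint is published under the repo id `facebook/sam2-hiera-large`, and that a CUDA device is available. The helper names (`make_point_prompt`, `best_mask_index`, `segment_image`) are hypothetical and exist only for illustration.

```python
MODEL_ID = "facebook/sam2-hiera-large"  # assumed Hugging Face repo id


def make_point_prompt(points, labels):
    """Validate (x, y) clicks and labels (1 = foreground, 0 = background)."""
    if len(points) != len(labels):
        raise ValueError("need exactly one label per point")
    if any(label not in (0, 1) for label in labels):
        raise ValueError("labels must be 1 (foreground) or 0 (background)")
    coords = [[float(x), float(y)] for x, y in points]
    return coords, list(labels)


def best_mask_index(scores):
    """Return the index of the highest-scoring candidate mask."""
    return max(range(len(scores)), key=lambda i: scores[i])


def segment_image(image, points, labels):
    """Run a point-prompted prediction on one image (hypothetical helper)."""
    # Heavy imports are deferred so the helpers above work without a GPU.
    import numpy as np
    import torch
    from sam2.sam2_image_predictor import SAM2ImagePredictor

    coords, labs = make_point_prompt(points, labels)
    predictor = SAM2ImagePredictor.from_pretrained(MODEL_ID)

    # Inference mode plus bfloat16 autocast, as described in the text above.
    with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
        predictor.set_image(image)
        masks, scores, _ = predictor.predict(
            point_coords=np.array(coords, dtype=np.float32),
            point_labels=np.array(labs, dtype=np.int32),
            multimask_output=True,
        )
    # Keep only the candidate mask the model scores highest.
    return masks[best_mask_index(scores)]
```

A call such as `segment_image(image, [(320, 240)], [1])` would request a mask for the object under a single foreground click. A box prompt can be passed to `predict` through its `box` argument instead of points.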

Core Capabilities

Image segmentation with prompt-based mask generation
Video object tracking and segmentation
Real-time mask propagation in video sequences
Support for multiple prompt types (points, boxes)
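
The video capabilities listed above can be sketched in the same way. This is a hedged sketch under the same assumptions as any use of the `sam2` package (installed package, CUDA device, repo id `facebook/sam2-hiera-large`). The helpers `collect_masklets` and `track_object` are hypothetical names, and `video_path` is assumed to be an input the package can read, such as a directory of JPEG frames.

```python
def collect_masklets(propagation):
    """Group propagated outputs as {object_id: {frame_idx: mask}}.

    `propagation` is any iterable of (frame_idx, object_ids, masks) tuples.
    Masks are stored exactly as yielded; any thresholding is left to the caller.
    """
    masklets = {}
    for frame_idx, object_ids, masks in propagation:
        for obj_id, mask in zip(object_ids, masks):
            masklets.setdefault(obj_id, {})[frame_idx] = mask
    return masklets


def track_object(video_path, frame_idx, obj_id, points, labels):
    """Prompt one object on one frame, then propagate it through the video."""
    # Heavy imports are deferred so collect_masklets works without a GPU.
    import numpy as np
    import torch
    from sam2.sam2_video_predictor import SAM2VideoPredictor

    predictor = SAM2VideoPredictor.from_pretrained("facebook/sam2-hiera-large")

    with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
        # The inference state holds the per-video memory used for propagation.
        state = predictor.init_state(video_path)

        # Add point prompts for one object on the chosen frame.
        predictor.add_new_points_or_box(
            state,
            frame_idx=frame_idx,
            obj_id=obj_id,
            points=np.array(points, dtype=np.float32),
            labels=np.array(labels, dtype=np.int32),
        )

        # Propagate the prompt through the video and gather the results.
        return collect_masklets(predictor.propagate_in_video(state))
```

Keeping the grouping logic in a separate function makes it possible to check the bookkeeping without loading the model, since it only depends on the shape of the yielded tuples.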

Frequently Asked Questions

Q: What makes this model unique?

Unlike the original SAM, which handles only still images, this model supports both image and video segmentation in a single model, and the SAM 2 paper reports improved accuracy and speed over SAM on image segmentation. For video, it keeps an inference state and propagates masks from prompted frames through the rest of the sequence.

Q: What are the recommended use cases?

The model suits applications that need precise object segmentation in images or videos, including video editing, content creation, autonomous systems, and computer vision research. It is especially useful where objects must be selected interactively or tracked automatically across frames.
