zeroscope_v2_XL

cerspense

Video generation model that renders at 1024x576 without watermarks and is specialized for upscaling outputs from zeroscope_v2_576w.

Property	Value
License	CC-BY-NC-4.0
Pipeline Type	Video-to-Video
VRAM Usage	15.3GB (30 frames at 1024x576)
Training Data	9,923 clips, 29,769 tagged frames

What is zeroscope_v2_XL?

zeroscope_v2_XL is a video generation model designed specifically for upscaling content created with zeroscope_v2_576w. It is built on the ModelScope architecture, was trained from the original weights using offset noise, and produces watermark-free video at 1024x576 resolution.

Implementation Details

The model runs through a Diffusers pipeline and is also designed to work with the 1111 text2video extension. Its training setup and the runtime options it supports are:

Trained on 24-frame sequences at 1024x576 resolution
Can be paired with DPMSolverMultistepScheduler for efficient sampling
Supports CPU offloading and VAE slicing for memory optimization
Can be loaded in torch float16 to reduce memory use and speed up inference
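
The options listed above can be combined roughly as follows. This is a minimal sketch, not the author's reference code. It assumes the `diffusers`, `torch`, and `accelerate` packages are installed, that the checkpoint is available under the Hugging Face ID `cerspense/zeroscope_v2_XL`, and that `VideoToVideoSDPipeline` is a suitable pipeline class for it. The helper names `load_upscaler` and `resize_frames` are hypothetical.

```python
MODEL_ID = "cerspense/zeroscope_v2_XL"  # assumed Hugging Face repository ID
TARGET_SIZE = (1024, 576)               # (width, height) stated in the text


def load_upscaler(model_id=MODEL_ID):
    """Build a video-to-video pipeline with the memory options listed above."""
    # Heavy imports stay inside the function so this module can be imported
    # on a machine without a GPU stack.
    import torch
    from diffusers import DPMSolverMultistepScheduler, VideoToVideoSDPipeline

    # float16 weights roughly halve memory use compared with float32.
    pipe = VideoToVideoSDPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    # Swap in the multistep DPM-Solver scheduler, reusing the existing config.
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
    # Keep submodules on the CPU until they are needed (requires accelerate).
    pipe.enable_model_cpu_offload()
    # Decode frames through the VAE in slices to lower peak memory.
    pipe.vae.enable_slicing()
    return pipe


def resize_frames(frames, size=TARGET_SIZE):
    """Resize PIL-style frames (objects with .resize) to the target resolution."""
    return [frame.resize(size) for frame in frames]
```

Frames produced by zeroscope_v2_576w would typically go through `resize_frames` before being handed to the pipeline returned by `load_upscaler`.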

Core Capabilities

High-resolution video generation (1024x576)
Watermark-free output
Efficient upscaling from lower resolution inputs
Supports batch processing of multiple frames
Optimized for 24+ frame sequences

Frequently Asked Questions

Q: What makes this model unique?

Its specialization sets it apart: it is trained to upscale zeroscope_v2_576w output, so a composition generated at the lower resolution can be re-rendered at 1024x576 instead of being generated at high resolution from scratch.

Q: What are the recommended use cases?

The model is best suited for upscaling pre-generated videos from zeroscope_v2_576w. The recommended denoise strength is between 0.66 and 0.85, and the upscaling pass should reuse the same prompt as the original generation.
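
The recommendation above can be sketched as a small wrapper around an upscaling call. The names `upscale`, `in_recommended_range`, and `denoising_steps` are hypothetical, and `pipe` is assumed to be a Diffusers-style video-to-video pipeline that accepts `prompt`, `video`, and `strength` and returns an object with a `frames` attribute. The step calculation follows the way Diffusers image-to-image style pipelines commonly truncate the schedule, which is an assumption here rather than something the model card states.

```python
import warnings

RECOMMENDED_STRENGTH = (0.66, 0.85)  # range quoted in the text


def in_recommended_range(strength):
    """Return True if strength lies inside the recommended denoise range."""
    low, high = RECOMMENDED_STRENGTH
    return low <= strength <= high


def denoising_steps(num_inference_steps, strength):
    """Number of denoising steps actually run for a given strength (assumed rule)."""
    return min(int(num_inference_steps * strength), num_inference_steps)


def upscale(pipe, prompt, frames, strength=0.75):
    """Run the upscaling pass, reusing the prompt from the original generation."""
    if not in_recommended_range(strength):
        warnings.warn(f"strength {strength} is outside {RECOMMENDED_STRENGTH}")
    # The layout of .frames differs between Diffusers releases, so it is
    # returned unchanged for the caller to unpack.
    return pipe(prompt, video=frames, strength=strength).frames
```

Under the assumed rule, a higher strength runs more denoising steps, so the result departs further from the low-resolution input; a lower strength stays closer to it.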