CogVideoX-2b

THUDM

CogVideoX-2b is an open-source text-to-video diffusion model offering 720x480 video generation at 8fps, optimized for low VRAM usage starting from 4GB with FP16 precision.

Property     Value
License      Apache 2.0
Paper        arXiv:2408.06072
Framework    Diffusers
Task         Text-to-Video Generation

What is CogVideoX-2b?

CogVideoX-2b is an entry-level text-to-video generation model designed to create video with modest computational requirements. It is the lightweight member of the CogVideoX family and generates 6-second videos at 720x480 resolution and 8 frames per second.

Implementation Details

The model is typically run in FP16 precision and needs as little as 4GB of VRAM when used through diffusers with memory optimizations enabled. It uses 3d_sincos_pos_embed positional encoding and supports FP16, BF16, FP32, and INT8 precision.
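As a rough illustration of what a 3D sine/cosine positional embedding computes, the sketch below builds one vector per (frame, row, column) position in pure Python. The function names, the frequency base of 10000, and the way channels are split between time, height, and width are assumptions made for illustration. This is not CogVideoX's actual implementation.

```python
import math


def sincos_1d(dim, position):
    """Sine/cosine embedding of a single scalar position; dim must be even."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    half = dim // 2
    # Geometrically spaced frequencies, from 1 down towards 1/10000.
    freqs = [1.0 / (10000 ** (i / half)) for i in range(half)]
    angles = [position * f for f in freqs]
    return [math.sin(a) for a in angles] + [math.cos(a) for a in angles]


def sincos_3d(dim, frames, height, width):
    """Return one embedding per (t, y, x) cell, nested in that order."""
    # Assumed split: a quarter of the channels encode time, and the rest
    # are shared equally between the vertical and horizontal axes.
    dim_t = dim // 4
    dim_h = dim_w = (dim - dim_t) // 2
    if dim_t % 2 or dim_h % 2 or dim_t + dim_h + dim_w != dim:
        raise ValueError("dim does not split into even parts; multiples of 16 work")
    grid = []
    for t in range(frames):
        emb_t = sincos_1d(dim_t, t)
        for y in range(height):
            emb_h = sincos_1d(dim_h, y)
            for x in range(width):
                # Concatenate the three per-axis embeddings into one vector.
                grid.append(emb_t + emb_h + sincos_1d(dim_w, x))
    return grid
```

Calling sincos_3d(16, 2, 3, 4) returns 24 vectors of length 16, one for each cell of a 2x3x4 grid. Neighbouring cells along the width axis differ only in the last block of channels.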

  • Inference speed: ~90 seconds on A100, ~45 seconds on H100 (50 steps)
  • VRAM usage: 18GB with SAT, 4GB with diffusers (FP16)
  • Supports English prompts up to 226 tokens
  • Compatible with PyTorchAO and Optimum-quanto for quantization
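The figures above can be tied together in a minimal diffusers sketch. The step count and frame rate come from the text. The model id "THUDM/CogVideoX-2b", the guidance scale, the seed, the extra frame added in frame_count, and the choice of memory-saving switches are assumptions, and the sketch does not guarantee the 4GB figure on any particular setup.

```python
# Sketch only: the model id and every setting not quoted in the text
# (guidance scale, seed, offloading choices) are assumptions.
MODEL_ID = "THUDM/CogVideoX-2b"
FPS = 8
NUM_INFERENCE_STEPS = 50


def frame_count(seconds=6, fps=FPS):
    # Assumption: the clip holds seconds * fps frames plus one initial frame.
    return seconds * fps + 1


def generate_video(prompt, output_path="output.mp4", seed=42):
    # Heavy imports stay inside the function so that importing this module
    # does not require torch or diffusers to be installed.
    import torch
    from diffusers import CogVideoXPipeline
    from diffusers.utils import export_to_video

    pipe = CogVideoXPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.float16)

    # Memory savers: stream sub-modules to the GPU only when they are needed,
    # and decode the video latents in slices and tiles.
    pipe.enable_sequential_cpu_offload()
    pipe.vae.enable_slicing()
    pipe.vae.enable_tiling()

    frames = pipe(
        prompt=prompt,
        num_inference_steps=NUM_INFERENCE_STEPS,
        num_frames=frame_count(),
        guidance_scale=6.0,
        generator=torch.Generator().manual_seed(seed),
    ).frames[0]

    export_to_video(frames, output_path, fps=FPS)
    return output_path
```

A call such as generate_video("A panda playing guitar in a bamboo forest") would download the weights on first use and write output.mp4. Sequential offloading trades speed for memory, so run times are likely to exceed the A100 and H100 figures listed above.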

Core Capabilities

  • High-quality video generation from text descriptions
  • Efficient memory management with multiple optimization options
  • Support for various precision formats and quantization methods
  • Multi-GPU inference support
  • Fine-tuning capabilities with LoRA and SFT options
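To give a feel for why the precision format matters, the following back-of-the-envelope sketch estimates the storage needed for the model weights alone at each supported precision. The parameter count of 2 billion is an assumption read off the "2b" in the model name. The estimate ignores the text encoder, the VAE, and activations, so it is not how the VRAM figures quoted on this page were obtained.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"FP32": 4, "FP16": 2, "BF16": 2, "INT8": 1}

# Assumption: "2b" is taken to mean a nominal 2 billion parameters.
NOMINAL_PARAMS = 2_000_000_000


def weight_gib(precision, params=NOMINAL_PARAMS):
    """Weights-only footprint in GiB for the given precision."""
    return params * BYTES_PER_PARAM[precision] / 2**30


for name in BYTES_PER_PARAM:
    print(f"{name}: {weight_gib(name):.2f} GiB")
```

Under these assumptions, halving the bytes per parameter halves the weight storage, which is the basic motivation for running in FP16 or quantizing to INT8.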

Frequently Asked Questions

Q: What makes this model unique?

CogVideoX-2b stands out for balancing generation quality against resource requirements: it runs with as little as 4GB of VRAM through diffusers, which makes it accessible to users with limited hardware.

Q: What are the recommended use cases?

The model suits standard text-to-video generation tasks. It is a good fit for development and testing environments, content creation, and scenarios where computational resources are limited but reasonable video quality is still required.
