Wan2.1-T2V-1.3B-Diffusers

Wan-AI

A powerful 1.3B parameter text-to-video diffusion model that runs on consumer GPUs, generates high-quality 480P videos, and supports multiple languages and tasks.

Property               Value
Parameter Count        1.3B
Model Type             Text-to-Video Diffusion
License                Apache 2.0
GPU Memory Required    8.19 GB VRAM
Supported Resolution   480P (optimal)

What is Wan2.1-T2V-1.3B-Diffusers?

Wan2.1-T2V-1.3B-Diffusers is a 1.3B-parameter text-to-video generation model packaged for use with the Diffusers library. It is designed to run on consumer-grade GPUs while producing video quality comparable to some closed-source solutions. On an RTX 4090, it generates a 5-second 480P video in approximately 4 minutes.
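
To make "a 5-second video" concrete, the sketch below converts a clip duration into a frame count. It assumes 16 frames per second and a frame count of the form 4k + 1 (reflecting 4x temporal compression in the video VAE). Neither value is stated on this page, and `frames_for_duration` is a hypothetical helper, not part of any library.

```python
def frames_for_duration(seconds: float, fps: int = 16, temporal_stride: int = 4) -> int:
    """Return the frame count of the form stride * k + 1 closest to seconds * fps.

    fps=16 and temporal_stride=4 are assumptions, not values from this page.
    """
    target = seconds * fps
    # Snap to the nearest count that leaves exactly one frame after dividing by the stride.
    k = max(round((target - 1) / temporal_stride), 0)
    return temporal_stride * k + 1


if __name__ == "__main__":
    print(frames_for_duration(5))  # prints 81 under the assumptions above
```

Under these assumptions, a 5-second request maps to 81 frames, slightly more than 5 seconds of footage at 16 fps.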

Implementation Details

The model uses a Flow Matching framework within the Diffusion Transformer (DiT) paradigm. It employs a T5 encoder for multilingual text processing and a novel spatio-temporal variational autoencoder (Wan-VAE) to encode and decode video efficiently. The transformer has a hidden dimension of 1536, 30 layers, and 12 attention heads:

  • Dimension: 1536
  • Input/Output Dimension: 16
  • Feedforward Dimension: 8960
  • Number of Layers: 30
  • Number of Heads: 12
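
As a sanity check on the figures above, the following sketch derives the per-head dimension and a rough weight count for the transformer. It assumes each layer contains one self-attention block, one cross-attention block (four projection matrices each), and a two-matrix feedforward network. It ignores biases, normalization, embeddings, and the text encoder and VAE, so treat the result as an order-of-magnitude estimate rather than the official count.

```python
# Values taken from the list above.
DIM = 1536
FFN_DIM = 8960
NUM_LAYERS = 30
NUM_HEADS = 12

# Each attention head works on an equal slice of the hidden dimension.
head_dim = DIM // NUM_HEADS

# Assumed layer layout (not confirmed by this page):
# self-attention + cross-attention, each with Q, K, V, and output projections,
# followed by a feedforward network with one up- and one down-projection.
attn_params = 4 * DIM * DIM
ffn_params = 2 * DIM * FFN_DIM
params_per_layer = 2 * attn_params + ffn_params
estimated_params = NUM_LAYERS * params_per_layer

print(f"head_dim = {head_dim}")
print(f"estimated transformer weights = {estimated_params / 1e9:.2f}B")
```

Under these assumptions the head dimension is 128 and the estimate comes to about 1.39B weights, in the same range as the nominal 1.3B parameter count.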

Core Capabilities

  • Text-to-Video generation with optimal 480P resolution
  • Generation of legible Chinese and English text within videos
  • Efficient video processing with low VRAM requirements (8.19 GB)
  • Support for prompt extension through Dashscope API or local models
  • Compatible with Diffusers pipeline and various inference methods
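
A minimal sketch of text-to-video generation through the Diffusers pipeline might look like the following. It assumes a recent `diffusers` release that ships `WanPipeline` and `AutoencoderKLWan`, a CUDA-capable GPU, and the Hub ID `Wan-AI/Wan2.1-T2V-1.3B-Diffusers` (inferred from the names on this page). The 832x480 frame size, 81 frames, guidance scale, and output frame rate are illustrative assumptions, and `generate_video` is a hypothetical wrapper.

```python
MODEL_ID = "Wan-AI/Wan2.1-T2V-1.3B-Diffusers"  # assumed Hub ID

# Illustrative settings; frame size, frame count, and guidance scale are assumptions.
GENERATION_KWARGS = {
    "height": 480,
    "width": 832,
    "num_frames": 81,
    "guidance_scale": 5.0,
}


def generate_video(prompt: str, output_path: str = "output.mp4", fps: int = 16) -> str:
    """Generate a short clip from a text prompt and write it to output_path."""
    # Heavy imports stay inside the function so this module loads without a GPU stack.
    import torch
    from diffusers import AutoencoderKLWan, WanPipeline
    from diffusers.utils import export_to_video

    # Load the VAE in float32 and the rest of the pipeline in bfloat16.
    vae = AutoencoderKLWan.from_pretrained(MODEL_ID, subfolder="vae", torch_dtype=torch.float32)
    pipe = WanPipeline.from_pretrained(MODEL_ID, vae=vae, torch_dtype=torch.bfloat16)

    # Move submodules to the GPU only while they are in use, to reduce peak VRAM.
    pipe.enable_model_cpu_offload()

    frames = pipe(prompt=prompt, **GENERATION_KWARGS).frames[0]
    export_to_video(frames, output_path, fps=fps)
    return output_path
```

Calling `generate_video("A cat walks on the grass, realistic")` would download the weights on first use and write `output.mp4`. Actual memory use and runtime depend on the GPU and the installed library versions.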

Frequently Asked Questions

Q: What makes this model unique?

What sets the model apart is that it runs on consumer GPUs while maintaining high-quality output. It requires only 8.19 GB of VRAM, which puts it within reach of many consumer graphics cards, and its output is described as comparable to that of larger models.

Q: What are the recommended use cases?

The model excels in generating short-form videos from text descriptions, particularly at 480P resolution. It's ideal for creative teams needing quick video generation capabilities without extensive computational resources.
