Wan2.1-T2V-1.3B-nf4
| Property | Value |
|---|---|
| Author | sarthak247 |
| Model Type | Text-to-Video Diffusion Model |
| Original Size | ~6GB |
| Optimized Size | ~1GB |
| Repository | Hugging Face |
What is Wan2.1-T2V-1.3B-nf4?
Wan2.1-T2V-1.3B-nf4 is a quantized version of the original Wan2.1-T2V-1.3B model, intended for GPUs with limited VRAM. It uses 4-bit NormalFloat (NF4) quantization to shrink the diffusion model's weights from about 6GB to about 1GB while keeping the text-to-video pipeline functional.
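As a rough illustration of what NF4 quantization does, the sketch below quantizes a list of weights block by block: each block is scaled by its absolute maximum, and every scaled value is replaced by the index of the nearest of 16 fixed levels. This is not the author's conversion code, which the model card does not show. The names `NF4_LEVELS`, `quantize`, `dequantize`, and the block size are illustrative assumptions, and the level values are the ones commonly published for NF4. Real libraries also pack two 4-bit indices per byte and run on the GPU, which this pure-Python version skips.

```python
# Minimal pure-Python sketch of blockwise NF4 quantization (illustrative only).

# The 16 NF4 code values in [-1, 1], as commonly published for NF4.
NF4_LEVELS = [
    -1.0, -0.6961928009986877, -0.5250730514526367, -0.39491748809814453,
    -0.28444138169288635, -0.18477343022823334, -0.09105003625154495, 0.0,
    0.07958029955625534, 0.16093020141124725, 0.24611230194568634,
    0.33791524171829224, 0.44070982933044434, 0.5626170039176941,
    0.7229568362236023, 1.0,
]


def quantize_block(values):
    """Return (4-bit indices, absmax scale) for one block of floats."""
    absmax = max(abs(v) for v in values) or 1.0  # all-zero block: avoid dividing by 0
    indices = []
    for v in values:
        scaled = v / absmax  # now in [-1, 1]
        nearest = min(range(16), key=lambda i: abs(NF4_LEVELS[i] - scaled))
        indices.append(nearest)
    return indices, absmax


def quantize(weights, block_size=64):
    """Split weights into blocks and quantize each block independently."""
    return [
        quantize_block(weights[start:start + block_size])
        for start in range(0, len(weights), block_size)
    ]


def dequantize(blocks):
    """Map indices back to approximate float values using each block's scale."""
    restored = []
    for indices, absmax in blocks:
        restored.extend(NF4_LEVELS[i] * absmax for i in indices)
    return restored


# Tiny usage example with a hypothetical weight vector and a small block size.
weights = [0.02, -0.5, 0.31, 0.0, 1.2, -1.2]
blocks = quantize(weights, block_size=4)
restored = dequantize(blocks)
print(restored)
```

Each weight is stored as a 4-bit index plus a share of one scale per block, which is where the memory saving comes from. The reconstruction is approximate: only values that land exactly on a level (such as the block maximum or zero) are recovered exactly.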
Implementation Details
The optimizations are applied per component:
- Diffusion Model: All Linear layers are converted from float32 to NF4, reducing the model size from about 6GB to approximately 1GB
- VAE: Left unchanged, since it contains no Linear layers to quantize
- UMT5 Text Encoder: Optimization for lower VRAM usage is still in progress
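The size reduction for the diffusion model can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes, for simplicity, that all 1.3B parameters (taken from the model name) sit in Linear layers, that NF4 stores one float32 scale per block of 64 weights, and that 1 GB means 10^9 bytes. The function name and these parameters are illustrative assumptions, not details confirmed by the model card.

```python
# Rough estimate of weight storage before and after NF4 quantization.

def linear_weight_bytes(num_params, bits_per_weight, block_size=None, scale_bytes=4):
    """Estimate storage for num_params weights.

    If block_size is given, add one scale of scale_bytes per block,
    as blockwise quantization schemes typically require.
    """
    weight_bytes = num_params * bits_per_weight / 8
    if block_size is None:
        return weight_bytes
    scale_overhead = (num_params / block_size) * scale_bytes
    return weight_bytes + scale_overhead


NUM_PARAMS = 1.3e9  # assumption: the "1.3B" in the model name, all in Linear layers

fp32_gb = linear_weight_bytes(NUM_PARAMS, 32) / 1e9
nf4_gb = linear_weight_bytes(NUM_PARAMS, 4, block_size=64) / 1e9

print(f"float32: ~{fp32_gb:.2f} GB, nf4: ~{nf4_gb:.2f} GB")
```

Under these assumptions the estimate comes to roughly 5.2 GB in float32 and 0.73 GB in NF4. The reported figures of ~6GB and ~1GB are somewhat higher, which would be consistent with non-Linear parameters staying in higher precision plus file overhead, though the model card does not break this down.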
Core Capabilities
- Text-to-video generation on consumer-grade GPUs
- Diffusion model weights roughly one-sixth the size of the original (~1GB versus ~6GB)
- Runs on GPUs with 8GB of VRAM, such as the RTX 4060
Frequently Asked Questions
Q: What makes this model unique?
It makes text-to-video generation practical on consumer-grade GPUs with limited VRAM by quantizing the diffusion model's Linear layers to NF4, which shrinks the weights from about 6GB to about 1GB.
Q: What are the recommended use cases?
The model suits users who want to run text-to-video generation locally on consumer-grade GPUs with limited VRAM (8GB or less), such as the RTX 4060.