Flux.1-Heavy-17B

city96

A massive 17B-parameter self-merge of the Flux.1-dev model, requiring 35-40 GB of VRAM for inference. Specialized in text-to-image generation, with extensive resource demands.

| Property | Value |
|---|---|
| Parameter Count | 17.17B |
| License | flux-1-dev-non-commercial-license |
| VRAM Requirements | 35-40 GB |
| Architecture | FluxTransformer2D (32 p-layers, 44 s-layers) |

What is Flux.1-Heavy-17B?

Flux.1-Heavy-17B is an ambitious self-merge experiment that expands the original 12B-parameter Flux.1-dev model to 17B parameters. Created by city96, the model is a notable step in large-scale text-to-image generation, though it comes with substantial computational requirements. It employs a layer-interweaving technique similar to those used in 70B->120B LLM merges, making it a unique proof of concept in the image generation space.
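The layer-interweaving idea can be illustrated with a short sketch. The function below builds an expanded layer order from overlapping slices of the source model, in the style of the passthrough "frankenmerges" used for 70B->120B LLMs. The slice length and overlap here are illustrative values, not the actual configuration used for Flux.1-Heavy-17B.

```python
def interleave_layers(n_layers: int, slice_len: int, overlap: int) -> list[int]:
    """Build a passthrough-style merged layer order from overlapping slices.

    Each slice of `slice_len` consecutive source layers is appended, and the
    next slice starts `overlap` layers before the previous one ended, so the
    merged model repeats (interweaves) some of the source layers.
    """
    merged: list[int] = []
    start = 0
    while start < n_layers:
        end = min(start + slice_len, n_layers)
        merged.extend(range(start, end))
        if end == n_layers:
            break
        start = end - overlap
    return merged

# Illustrative only: expand a 19-layer stack using 8-layer slices that overlap by 4.
order = interleave_layers(19, 8, 4)
print(len(order))  # 31 layers in the expanded stack
```

Because whole contiguous slices are reused rather than averaging weights, each repeated layer still sees activations in roughly the distribution it was trained on, which is why such merges can remain coherent without retraining.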

Implementation Details

The model's architecture consists of 32 p-layers and 44 s-layers, totaling 17.17B parameters. It requires approximately 35-40 GB of VRAM for inference, and system RAM usage can reach 80 GB on Windows. The model supports both Diffusers and ComfyUI implementations, though partial offloading is necessary for most practical setups.

  • Compatible with inference pipeline and custom FluxTransformer2DModel
  • Supports partial VRAM offloading with sufficient system RAM
  • Works with ostris/ai-toolkit for training purposes
  • Maintains BF16 precision for optimal performance
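The 35-40 GB VRAM figure can be sanity-checked against the parameter count: at BF16 (2 bytes per parameter), the transformer weights alone account for roughly 32 GiB, with activations, text encoders, and the VAE pushing the total into the stated range. A back-of-envelope calculation:

```python
def bf16_weight_gib(n_params: float) -> float:
    """Approximate weight memory in GiB, assuming BF16 (2 bytes per parameter)."""
    return n_params * 2 / 2**30

# 17.17B parameters -> ~32 GiB for the transformer weights alone; the rest of
# the 35-40 GB budget goes to activations, text encoders, and the VAE.
print(f"{bf16_weight_gib(17.17e9):.1f} GiB")  # 32.0 GiB
```

This is also why partial offloading works: moving the text encoders and idle transformer blocks to system RAM trades throughput for a lower peak VRAM footprint.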

Core Capabilities

  • Text-to-image generation with large-scale parameter count
  • Limited LoRA compatibility with existing models
  • Coherent image generation with occasional text rendering issues
  • Post-merge training capabilities for performance improvement

Frequently Asked Questions

Q: What makes this model unique?

This model stands out as possibly the first open-source 17B parameter image model capable of generating coherent images, albeit as a self-merge experiment. Its massive scale and unique architecture make it a significant technical achievement, even with its practical limitations.

Q: What are the recommended use cases?

The model is primarily recommended for research purposes or technical demonstrations. Due to its extensive resource requirements and experimental nature, it is not advised for production use unless you have access to high-end hardware and a specific need for a model of this scale.
