flux-mini

TencentARC

Flux-mini is an efficient 3.2B-parameter text-to-image model distilled from the larger 12B Flux-dev. It is optimized for consumer devices while maintaining strong generation capabilities.

Property	Value
Model Size	3.2B parameters
License	Flux-1-dev-non-commercial-license
Developer	TencentARC
Type	Text-to-Image Generation

What is flux-mini?

Flux-mini is a compact text-to-image generation model developed by TencentARC to make AI image generation more accessible. It is a distilled version of the 12B Flux-dev model, reduced to 3.2B parameters while maintaining strong generation capabilities. The smaller size makes it particularly suitable for consumer-level devices where computational resources are limited.

Implementation Details

The model is produced by a distillation process that reduces the original architecture from 19 double blocks and 38 single blocks to 5 double blocks and 10 single blocks. Distillation uses three objectives: denoise loss, output alignment loss, and feature alignment loss. Training was conducted in two stages: first on 512x512 LAION images recaptioned with Qwen-VL for 90k steps, then on 1024x1024 images generated from JourneyDB prompts for another 90k steps.
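The block counts and training schedule described above can be written down as plain data, which makes the size of the reduction easy to check. This is a minimal sketch only: the class and field names (`BlockLayout`, `TrainingStage`, `kept_fraction`) are hypothetical and do not come from the Flux-mini code base. Only the numbers are taken from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockLayout:
    """Number of transformer blocks of each kind (hypothetical container)."""
    double_blocks: int
    single_blocks: int

    @property
    def total(self) -> int:
        return self.double_blocks + self.single_blocks


@dataclass(frozen=True)
class TrainingStage:
    """One stage of the two-stage schedule (hypothetical container)."""
    resolution: int  # square image side length in pixels
    data: str        # short description of the training data
    steps: int       # number of training steps


# Block counts as stated in the text.
TEACHER = BlockLayout(double_blocks=19, single_blocks=38)
STUDENT = BlockLayout(double_blocks=5, single_blocks=10)

# Two-stage schedule as stated in the text.
STAGES = (
    TrainingStage(512, "LAION images recaptioned with Qwen-VL", 90_000),
    TrainingStage(1024, "images generated from JourneyDB prompts", 90_000),
)


def kept_fraction(student: BlockLayout, teacher: BlockLayout) -> float:
    """Fraction of the teacher's blocks that remain in the student."""
    return student.total / teacher.total


total_steps = sum(stage.steps for stage in STAGES)

if __name__ == "__main__":
    print(f"blocks kept: {STUDENT.total}/{TEACHER.total} "
          f"({kept_fraction(STUDENT, TEACHER):.1%})")
    print(f"total training steps: {total_steps}")
```

The student keeps 15 of the teacher's 57 blocks. Note that the block ratio is not the same thing as the parameter ratio, since embeddings and other layers outside the blocks also contribute to the parameter count.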

Efficient architecture reduction while preserving generation quality
Multi-objective distillation process
Two-stage training methodology with high-quality datasets
Feature alignment between student and teacher models
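As an illustration of how the three distillation objectives could be combined, the sketch below sums a denoise term, an output alignment term, and a feature alignment term. Several things here are assumptions rather than details from the source: the use of mean squared error for every term, the equal default weights, the pairing of student and teacher features by list position, and all function and parameter names. Plain Python lists stand in for tensors so the example runs without extra dependencies.

```python
from typing import Sequence

Vector = Sequence[float]


def mse(a: Vector, b: Vector) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(
    student_out: Vector,
    teacher_out: Vector,
    target: Vector,
    student_feats: Sequence[Vector],
    teacher_feats: Sequence[Vector],
    w_denoise: float = 1.0,   # assumed weight, not from the source
    w_output: float = 1.0,    # assumed weight, not from the source
    w_feature: float = 1.0,   # assumed weight, not from the source
) -> float:
    """Weighted sum of the three objectives (illustrative only)."""
    if len(student_feats) != len(teacher_feats) or not student_feats:
        raise ValueError("need one teacher feature per student feature")

    # Denoise loss: student prediction against the training target.
    denoise = mse(student_out, target)

    # Output alignment loss: student prediction against teacher prediction.
    output_align = mse(student_out, teacher_out)

    # Feature alignment loss: averaged over paired intermediate features.
    feature_align = sum(
        mse(s, t) for s, t in zip(student_feats, teacher_feats)
    ) / len(student_feats)

    return (w_denoise * denoise
            + w_output * output_align
            + w_feature * feature_align)
```

In a real training loop the teacher features would come from a chosen subset of the teacher's blocks, since the student has fewer blocks than the teacher. Which teacher block each student block is matched against is a design choice this sketch leaves to the caller.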

Core Capabilities

Generation of human and animal faces
Creation of landscape and fantasy scenes
Production of abstract artistic compositions
Support for high-resolution image generation
Optimized for prompt formats similar to those in JourneyDB

Frequently Asked Questions

Q: What makes this model unique?

Flux-mini compresses the 12B Flux-dev model to 3.2B parameters while maintaining generation quality, making it one of the few efficient text-to-image models suitable for consumer devices. The multi-objective distillation process and careful block selection contribute to its effectiveness despite the smaller size.

Q: What are the recommended use cases?

The model excels at generating common images including portraits, landscapes, and fantasy scenes. It's particularly effective when used with descriptive prompts that follow the JourneyDB format, combining nouns and adjectives with artistic style references. However, users should be aware of limitations in generating fine-grained details, text, and complex geometric structures.
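To make the prompt advice concrete, here is a small sketch of a helper that assembles a prompt from a noun, some adjectives, and artistic style references. The helper name `build_prompt` and the comma-separated layout are assumptions chosen for illustration. The source only says that prompts combine nouns and adjectives with style references and does not specify an exact format.

```python
from typing import Iterable


def build_prompt(
    subject: str,
    adjectives: Iterable[str] = (),
    styles: Iterable[str] = (),
) -> str:
    """Join adjectives and a subject, then append style references.

    Hypothetical helper: the output layout is an assumption, not a
    documented JourneyDB or Flux-mini requirement.
    """
    subject = subject.strip()
    if not subject:
        raise ValueError("subject must not be empty")

    # Adjectives come first, followed by the noun they describe.
    words = [a.strip() for a in adjectives if a.strip()]
    head = " ".join(words + [subject])

    # Style references are appended as comma-separated phrases.
    tail = [s.strip() for s in styles if s.strip()]
    return ", ".join([head] + tail)


if __name__ == "__main__":
    print(build_prompt("fox", ["red", "sleeping"],
                       ["watercolor", "soft lighting"]))
```

Keeping the subject, adjectives, and styles as separate arguments makes it easy to vary one part of the prompt at a time when comparing outputs.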