stable-diffusion-x4-upscaler

Maintained by: stabilityai

Stable Diffusion x4 Upscaler

Property        Value
License         OpenRAIL++
Training Data   10M subset of LAION (>2048x2048)
Paper           Latent Upscaling Diffusion Model
Authors         Robin Rombach, Patrick Esser

What is stable-diffusion-x4-upscaler?

The Stable Diffusion x4 Upscaler is a text-guided latent diffusion model that increases image resolution by 4x. It was trained for 1.25M steps on a 10M-image subset of LAION containing images larger than 2048x2048 pixels.

Implementation Details

The model was trained on 512x512 crops and uses a text-guided latent upscaling diffusion architecture. In addition to the text prompt, it takes a noise level parameter, which adds noise to the low-resolution input according to a predefined diffusion schedule.

  • Trained on images larger than 2048x2048 pixels
  • Uses the OpenCLIP-ViT/H text encoder
  • Trained with the v-objective
  • Accepts a custom noise level as input
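
The effect of the noise level input can be pictured as ordinary forward-diffusion noising of the low-resolution image: a higher level keeps less of the original signal and mixes in more Gaussian noise. The sketch below is a minimal illustration only. It assumes a linear beta schedule over 1000 steps; those schedule values and the names `alpha_bar` and `add_noise` are hypothetical and are not taken from the model's actual configuration.

```python
import math
import random


def alpha_bar(noise_level, num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for t = 0..noise_level.

    Assumes a linear beta schedule; the real model's schedule may differ.
    """
    product = 1.0
    for t in range(noise_level + 1):
        beta_t = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        product *= 1.0 - beta_t
    return product


def add_noise(pixels, noise_level, rng):
    """Noise a flat list of pixel values at the given noise level."""
    a = alpha_bar(noise_level)
    signal_scale = math.sqrt(a)        # how much of the input survives
    noise_scale = math.sqrt(1.0 - a)   # how much Gaussian noise is mixed in
    return [signal_scale * p + noise_scale * rng.gauss(0.0, 1.0) for p in pixels]


if __name__ == "__main__":
    rng = random.Random(0)
    low_res = [0.5, -0.25, 0.0, 1.0]   # toy "image" with values in [-1, 1]
    for level in (0, 20, 200):
        print(level, round(math.sqrt(alpha_bar(level)), 4), add_noise(low_res, level, rng))
```

Under this assumed schedule, the signal scale shrinks steadily as the noise level rises, which is why larger values give the model more freedom to invent detail and smaller values keep the output closer to the input.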

Core Capabilities

  • 4x resolution upscaling of images
  • Text-guided enhancement control
  • Noise level adjustment for tuning results
  • Efficient processing through latent space operations
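
As a rough sketch of how these capabilities are typically exercised, the example below uses the `StableDiffusionUpscalePipeline` class from the Hugging Face `diffusers` library. Several details are assumptions rather than statements from this page: the repository id `stabilityai/stable-diffusion-x4-upscaler`, the availability of a CUDA GPU, half-precision weights, and the helper names `upscaled_size` and `upscale`, which are hypothetical.

```python
def upscaled_size(size, factor=4):
    """Return the (width, height) expected after upscaling by `factor`."""
    width, height = size
    return (width * factor, height * factor)


def upscale(image, prompt, noise_level=20):
    """Upscale a PIL image 4x, guided by a text prompt.

    Assumes `torch` and `diffusers` are installed and a CUDA GPU is present.
    """
    # Imported lazily so the size helper above works without heavy dependencies.
    import torch
    from diffusers import StableDiffusionUpscalePipeline

    pipeline = StableDiffusionUpscalePipeline.from_pretrained(
        "stabilityai/stable-diffusion-x4-upscaler",  # assumed repository id
        torch_dtype=torch.float16,
    )
    pipeline = pipeline.to("cuda")

    # noise_level controls how much noise is added to the low-resolution input.
    result = pipeline(prompt=prompt, image=image, noise_level=noise_level)
    return result.images[0]


if __name__ == "__main__":
    print(upscaled_size((128, 128)))  # a 128x128 input maps to 512x512
```

Calling `upscale(low_res_image, "a white cat")` with a small PIL image would then return the enlarged image; the prompt here is only a placeholder.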

Frequently Asked Questions

Q: What makes this model unique?

The model combines text-guided generation with upscaling, so a natural-language prompt can steer the enhancement while the output stays faithful to the original image. The adjustable noise level gives additional control over the result.

Q: What are the recommended use cases?

The model is suited to enhancing low-resolution images, particularly in research, artistic, and educational contexts, where text prompts can be used to steer the upscaled result.
