sdxl-vae

Maintained By
stabilityai

SDXL-VAE

Property    Value
License     MIT
Paper       Latent Diffusion Models
Downloads   177,403
Framework   Diffusers

What is sdxl-vae?

SDXL-VAE is the variational autoencoder (VAE) used by the Stable Diffusion XL model. It improves on the original SD VAE in two ways: it was trained with a much larger batch size (256 vs 9), and its weights are additionally tracked with an exponential moving average (EMA).
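As a rough illustration of what EMA tracking means, the sketch below keeps a running average of a model's weights alongside the live training weights. The function name `ema_update`, the dictionary layout, and the decay values are assumptions made for this example. The source does not state the decay used to train SDXL-VAE.

```python
def ema_update(ema_weights, live_weights, decay=0.999):
    """Blend the live training weights into the running average, in place.

    `decay` close to 1.0 makes the average change slowly. The value here is
    illustrative only, not the setting used for SDXL-VAE.
    """
    for name, live in live_weights.items():
        ema_weights[name] = decay * ema_weights[name] + (1.0 - decay) * live
    return ema_weights


# Toy example with a single scalar "weight"; real models hold tensors.
ema = {"w": 0.0}
for live_value in (1.0, 1.0, 1.0):
    ema_update(ema, {"w": live_value}, decay=0.5)
print(ema["w"])  # 0.875
```

The averaged weights, rather than the final live weights, are typically the ones saved for inference, since they smooth out step-to-step noise from training.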

Implementation Details

SDXL is a latent diffusion model: the diffusion process runs in the pretrained, learned latent space of an autoencoder, and SDXL-VAE is that autoencoder. It encodes images into latents and decodes latents back into images. It can be loaded into existing diffusers pipelines through the AutoencoderKL class.

  • Improved local and high-frequency detail generation
  • Enhanced reconstruction metrics across all evaluation parameters
  • Seamless integration with Stable Diffusion workflows
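The integration described above might look like the following minimal sketch. It assumes the diffusers library is installed, that the VAE is published under the repository id `stabilityai/sdxl-vae`, and that `stabilityai/stable-diffusion-xl-base-1.0` is the base checkpoint you want to pair it with. The helper names `build_pipeline` and `latent_shape` are hypothetical. The 8x spatial downsampling and 4 latent channels reflect the published configuration as far as I know; check `vae.config` on the loaded model to confirm.

```python
VAE_REPO_ID = "stabilityai/sdxl-vae"
BASE_REPO_ID = "stabilityai/stable-diffusion-xl-base-1.0"  # assumed base checkpoint


def build_pipeline(base_repo_id=BASE_REPO_ID, vae_repo_id=VAE_REPO_ID):
    """Load the VAE and pass it to an SDXL pipeline (downloads weights)."""
    # Imported lazily so the helpers below work without diffusers installed.
    from diffusers import AutoencoderKL, StableDiffusionXLPipeline

    vae = AutoencoderKL.from_pretrained(vae_repo_id)
    return StableDiffusionXLPipeline.from_pretrained(base_repo_id, vae=vae)


def latent_shape(height, width, downsample=8, channels=4):
    """Shape (channels, h, w) of the latent for an image of the given size.

    The defaults are assumptions about the standard SDXL-VAE configuration.
    """
    if height % downsample or width % downsample:
        raise ValueError("height and width must be multiples of the downsample factor")
    return (channels, height // downsample, width // downsample)


print(latent_shape(1024, 1024))  # (4, 128, 128)
```

Calling `build_pipeline()` fetches both checkpoints on first use, after which `pipe(prompt).images[0]` returns a generated image. Under these assumptions the diffusion model works on a 128 by 128 latent for a 1024 by 1024 output, which is why the VAE's reconstruction quality bounds the detail of the final image.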

Core Capabilities

  • Reconstruction FID (rFID, lower is better): 4.42
  • Peak signal-to-noise ratio (PSNR, higher is better): 24.7 ± 3.9
  • Structural similarity (SSIM, higher is better): 0.73 ± 0.13
  • Perceptual similarity (PSIM, lower is better): 0.88 ± 0.27
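To make one of these metrics concrete, here is a minimal sketch of the standard PSNR formula, 10 · log10(MAX² / MSE), applied to flat sequences of 8-bit pixel values. The function name `psnr` and the flat-list input format are choices made for this example. It is not the evaluation script behind the figures above.

```python
import math


def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if not original or len(original) != len(reconstructed):
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error between corresponding pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)


# Every pixel is off by 10 levels, so MSE = 100.
print(round(psnr([100, 100, 100, 100], [110, 110, 110, 110]), 2))  # 28.13
```

Higher values mean the decoded image is numerically closer to the input, so a VAE with a higher average PSNR loses less information in the encode and decode round trip.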

Frequently Asked Questions

Q: What makes this model unique?

Two training changes set SDXL-VAE apart from the original VAE used in Stable Diffusion: a much larger batch size (256 vs 9) and EMA tracking of the weights. Together they yield better reconstruction quality.

Q: What are the recommended use cases?

This VAE is specifically designed for use with SDXL models and is recommended for applications requiring high-quality image generation with better local detail preservation and overall reconstruction fidelity.
