16ch-vae

AuraDiffusion

16ch-VAE is an open-source reproduction of the SD3 VAE. It encodes images with a reported PSNR of 31.5151, higher than that of the SD1.5 and SDXL VAEs.

Property      Value
License       Creative Commons
Framework     Diffusers
Paper         SD3 Paper
PSNR Score    31.5151

What is 16ch-vae?

16ch-VAE is a fully open-source Variational Autoencoder (VAE) built as a reproduction of the SD3 VAE architecture. It encodes images into latents and decodes them back, and it was trained natively in fp16 precision. Its reported PSNR of 31.5151 is higher than that of both the SD1.5 and SDXL VAEs.

Implementation Details

The model uses a 16-channel latent representation, which leaves more room to preserve image detail than the 4-channel latents of the SD1.5 and SDXL VAEs. It is built with the Diffusers library.
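To make the 16-channel design concrete, the sketch below computes the latent tensor shape and the resulting compression ratio for a given image size. The 8x spatial downsampling factor is an assumption based on the usual SD-family VAE layout and is not stated on this page; the function names are hypothetical.

```python
def latent_shape(height, width, latent_channels=16, downscale=8):
    """Return (channels, height, width) of the latent for an RGB image.

    downscale=8 is an ASSUMPTION (typical for SD-family VAEs),
    not a figure taken from the 16ch-VAE documentation.
    """
    if height % downscale or width % downscale:
        raise ValueError("image sides must be multiples of the downscale factor")
    return (latent_channels, height // downscale, width // downscale)


def compression_ratio(height, width, latent_channels=16, downscale=8, image_channels=3):
    """Number of image values stored per latent value."""
    c, h, w = latent_shape(height, width, latent_channels, downscale)
    return (image_channels * height * width) / (c * h * w)


# A 16-channel latent versus a 4-channel latent for the same 1024x1024 image.
shape_16 = latent_shape(1024, 1024, latent_channels=16)
ratio_16 = compression_ratio(1024, 1024, latent_channels=16)
ratio_4 = compression_ratio(1024, 1024, latent_channels=4)
```

Under these assumptions, the 16-channel latent holds four times as many values per image as a 4-channel one, which is one plausible reason for the lower reconstruction error.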

  • Native FP16 training implementation
  • Improved PSNR metrics compared to previous SD VAEs
  • Optimized for general image generation tasks
  • Compatible with the Diffusers library
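A minimal sketch of loading the VAE through Diffusers and running an encode/decode round trip. The repository id `AuraDiffusion/16ch-vae`, the assumption that the weights load through `AutoencoderKL`, and the helper names `load_vae` and `roundtrip` are illustrative and not taken from this page; check the model page for the exact loading instructions.

```python
def load_vae(repo_id="AuraDiffusion/16ch-vae", device="cuda"):
    """Load the VAE in fp16 (ASSUMED repo id and model class)."""
    # Imports are local so this file can be read without torch installed.
    import torch
    from diffusers import AutoencoderKL

    vae = AutoencoderKL.from_pretrained(repo_id, torch_dtype=torch.float16)
    return vae.to(device).eval()


def roundtrip(vae, images):
    """Encode a batch of images to latents, then decode them back.

    `images` is assumed to be a float tensor of shape (N, 3, H, W),
    scaled to [-1, 1] as is conventional for Diffusers VAEs.
    """
    latents = vae.encode(images).latent_dist.sample()
    reconstruction = vae.decode(latents).sample
    return latents, reconstruction
```

For inference, call `roundtrip` inside a `torch.no_grad()` block so that no gradients are tracked, and pass images with the same dtype and device as the model.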

Core Capabilities

  • High-fidelity image encoding with PSNR of 31.5151
  • Lower reconstruction loss compared to SD1.5/SDXL VAEs
  • Efficient 16-channel architecture
  • Support for both standard and FFT implementations
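For readers unfamiliar with the metric, PSNR is derived from the mean squared error (MSE) between an image and its reconstruction: PSNR = 10 * log10(MAX^2 / MSE), where MAX is the largest possible pixel value. The sketch below implements that standard definition in plain Python. The evaluation protocol behind the 31.5151 figure (dataset, resolution, pixel range) is not given on this page, so the function names and default pixel range here are assumptions for illustration only.

```python
import math


def mean_squared_error(original, reconstructed):
    """MSE between two equal-length sequences of pixel values."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB; higher means a closer reconstruction."""
    err = mean_squared_error(original, reconstructed)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)


def mse_from_psnr(psnr_db, max_value=1.0):
    """Invert the PSNR formula to recover the implied MSE."""
    return max_value ** 2 / (10.0 ** (psnr_db / 10.0))


# Every pixel off by exactly 1 on a 0-255 scale gives MSE = 1.
example_psnr = psnr([0, 0, 0, 0], [1, 1, 1, 1])
implied_mse = mse_from_psnr(31.5151, max_value=1.0)
```

If pixel values are assumed to lie in [0, 1], a PSNR of 31.5151 corresponds to an MSE of roughly 7e-4; a different pixel range would change that figure.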

Frequently Asked Questions

Q: What makes this model unique?

This model achieves a higher PSNR (31.5151) than the SD1.5 and SDXL VAEs while maintaining competitive LPIPS scores. It is fully open-source and designed for general-purpose image generation tasks.

Q: What are the recommended use cases?

The model is suited to researchers and developers who are building their own image generation models and need a high-quality, off-the-shelf VAE. It is not intended as a direct replacement for SD3's VAE, however, because the two models have different latent spaces.
