stable-diffusion-v-1-4-original

Maintained By: CompVis

Stable Diffusion v1.4

  • Developer: CompVis (Robin Rombach, Patrick Esser)
  • License: CreativeML OpenRAIL-M
  • Training Data: LAION-aesthetics v2 5+
  • Research Paper: High-Resolution Image Synthesis With Latent Diffusion Models (CVPR 2022)

What is stable-diffusion-v-1-4-original?

Stable Diffusion v1.4 is a latent text-to-image diffusion model. It was initialized from the v1.2 checkpoint and fine-tuned for a further 225k steps at 512x512 resolution, with the text conditioning dropped 10% of the time to improve classifier-free guidance sampling.

Implementation Details

The model uses a latent diffusion architecture: an autoencoder compresses images into a latent space, and the diffusion model is trained in that latent space rather than on pixels. Prompts are encoded with a CLIP ViT-L/14 text encoder. The autoencoder has a relative downsampling factor f = 8, mapping an image of shape H x W x 3 to a latent of shape H/f x W/f x 4.
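The shape mapping described above can be checked with a few lines of arithmetic. This is a minimal sketch, not code from the model's repository; the function name `latent_shape` and its parameters are hypothetical, chosen only for illustration.

```python
def latent_shape(height, width, f=8, latent_channels=4):
    """Return the latent shape (H/f, W/f, C) for an H x W x 3 input image.

    f and latent_channels default to the values stated for this model
    (downsampling factor 8, 4 latent channels).
    """
    if height % f or width % f:
        raise ValueError("height and width must be divisible by f")
    return (height // f, width // f, latent_channels)


# A 512 x 512 RGB image maps to a 64 x 64 x 4 latent.
h, w, c = latent_shape(512, 512)
print((h, w, c))  # (64, 64, 4)

# Ratio of pixel-space values to latent-space values.
pixel_values = 512 * 512 * 3
latent_values = h * w * c
print(pixel_values / latent_values)  # 48.0
```

At 512x512 the diffusion model therefore operates on 48 times fewer values than the raw image contains, which is the main reason for running diffusion in latent space.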

  • Training Infrastructure: 32 x 8 x A100 GPUs
  • Batch Size: 2048
  • Optimizer: AdamW with 0.0001 learning rate
  • Training Data: LAION-aesthetics v2 5+ dataset
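As a rough sanity check on the figures above, the per-GPU share of the global batch can be worked out directly. This sketch assumes that "32 x 8" means 32 nodes with 8 GPUs each; how each GPU's share is split between micro-batches and gradient accumulation is not stated here, so it is left out.

```python
# Assumption: "32 x 8 x A100" is read as 32 nodes with 8 GPUs per node.
nodes = 32
gpus_per_node = 8
global_batch_size = 2048

total_gpus = nodes * gpus_per_node
samples_per_gpu = global_batch_size // total_gpus

print(total_gpus)       # 256
print(samples_per_gpu)  # 8
```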

Core Capabilities

  • High-quality image generation at 512x512 resolution
  • Text-guided image synthesis
  • Artistic and creative content generation
  • Research and educational applications
  • Design and artistic workflow integration

Frequently Asked Questions

Q: What makes this model unique?

Its distinguishing feature is the training recipe: 10% text-conditioning dropout, which supports classifier-free guidance at sampling time, and fine-tuning on the aesthetics-filtered LAION-aesthetics v2 5+ subset. This recipe is intended to improve image quality over earlier checkpoints.
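To illustrate how text-conditioning dropout and classifier-free guidance fit together, here is a minimal sketch in plain Python. It is not the model's training or sampling code: the function names, the use of an empty string as the "no text" condition, and the list-based noise predictions are all assumptions made for illustration.

```python
import random


def maybe_drop_text(prompt, p_drop=0.10, rng=random):
    """Training time: replace the prompt with an empty one with probability p_drop.

    Assumption: the empty string stands in for the unconditional input.
    """
    return "" if rng.random() < p_drop else prompt


def guided_noise(eps_uncond, eps_cond, guidance_scale):
    """Sampling time: combine unconditional and conditional noise predictions.

    Implements eps = eps_uncond + s * (eps_cond - eps_uncond), element by element.
    """
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Roughly 10% of training prompts are replaced by the empty prompt.
rng = random.Random(0)
dropped = sum(maybe_drop_text("a cat", rng=rng) == "" for _ in range(10_000))
print(dropped / 10_000)

# Scale 1.0 recovers the conditional prediction; larger scales push further
# in the direction of the text condition.
print(guided_noise([0.0, 1.0], [1.0, 3.0], 1.0))  # [1.0, 3.0]
print(guided_noise([0.0, 1.0], [1.0, 3.0], 7.5))  # [7.5, 16.0]
```

Because the network has seen both prompted and unprompted inputs during training, a single set of weights can produce both predictions needed for the guided combination.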

Q: What are the recommended use cases?

The model is intended primarily for research: studying the safe deployment of generative models, investigating their limitations and biases, creating artwork, building educational and creative tools, and researching generative models more broadly. It is not intended for generating harmful content or for other forms of misuse.
