sd-controlnet-canny

Maintained by: lllyasviel

SD-ControlNet-Canny

Property            Value
Base Model          Stable Diffusion v1-5
License             OpenRAIL
Training Data       3M edge-image pairs
Training Duration   600 GPU-hours on A100
Paper               Adding Conditional Control to Text-to-Image Diffusion Models

What is sd-controlnet-canny?

SD-ControlNet-Canny is a ControlNet model that conditions Stable Diffusion image generation on Canny edge maps. Developed by Lvmin Zhang and Maneesh Agrawala, it lets users specify the structural composition of a generated image by supplying an edge map alongside the text prompt.

Implementation Details

The model integrates with Stable Diffusion v1-5. Input images are first run through Canny edge detection, which produces monochrome images with white edges on a black background. These edge maps are then passed as conditioning inputs that guide the image generation process. The model was trained on 3 million edge-image pairs, using 600 GPU-hours on NVIDIA A100 80G hardware.
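To make the role of the two edge detection thresholds concrete, here is a minimal sketch of the double-threshold (hysteresis) step that ends the Canny algorithm, written with the standard library only. It is a simplified illustration, not the preprocessing code used for this model: the function name `hysteresis_edges`, the toy grid, and the threshold values are assumptions chosen for the example. In practice a library routine such as OpenCV's `cv2.Canny(image, low, high)` would typically be used instead.

```python
from collections import deque


def hysteresis_edges(magnitude, low, high):
    """Turn a 2D grid of gradient magnitudes into a white-on-black edge map.

    Pixels >= high are strong edges. Pixels >= low are kept only when they
    connect (8-neighbourhood) to a strong edge, directly or through a chain
    of other kept pixels.
    """
    rows, cols = len(magnitude), len(magnitude[0])
    out = [[0] * cols for _ in range(rows)]  # black background
    queue = deque()

    # Seed the search with every strong pixel.
    for r in range(rows):
        for c in range(cols):
            if magnitude[r][c] >= high:
                out[r][c] = 255  # white edge
                queue.append((r, c))

    # Grow edges outward through neighbouring pixels above the low threshold.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and out[nr][nc] == 0
                        and magnitude[nr][nc] >= low):
                    out[nr][nc] = 255
                    queue.append((nr, nc))
    return out


# Hypothetical toy input: one strong pixel flanked by two weak ones,
# plus an isolated weak pixel in the last row.
toy = [
    [0,   0,   0,   0, 0],
    [0, 120, 250, 120, 0],
    [0,   0,   0,   0, 0],
    [0, 120,   0,   0, 0],
]

edges = hysteresis_edges(toy, low=100, high=200)
```

With these illustrative thresholds, the weak pixels next to the strong one become part of the edge, while the isolated weak pixel stays black. Lowering the high threshold below 120 would turn that isolated pixel into an edge as well, which is why threshold choice changes how much structure the edge map carries.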

  • Seamless integration with Stable Diffusion pipeline
  • Support for custom edge detection thresholds
  • Efficient memory handling with xformers support
  • Compatible with UniPC scheduler for improved inference
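The integration points listed above can be sketched with the Hugging Face `diffusers` library. This is a hedged example rather than a definitive setup: the helper names `build_pipeline` and `generate` are hypothetical, the base-model Hub id is an assumption (substitute whichever Stable Diffusion v1-5 copy you use), and the code assumes `torch`, `diffusers`, and `accelerate` are installed, with `xformers` needed only when that option is enabled.

```python
CONTROLNET_ID = "lllyasviel/sd-controlnet-canny"
# Assumption: a Hub id hosting Stable Diffusion v1-5 weights.
BASE_MODEL_ID = "stable-diffusion-v1-5/stable-diffusion-v1-5"


def build_pipeline(base_model_id=BASE_MODEL_ID, use_xformers=False):
    """Load the ControlNet and attach it to a Stable Diffusion v1-5 pipeline."""
    # Heavy imports stay inside the function so this module can be
    # imported on machines without the GPU stack installed.
    import torch
    from diffusers import (
        ControlNetModel,
        StableDiffusionControlNetPipeline,
        UniPCMultistepScheduler,
    )

    controlnet = ControlNetModel.from_pretrained(
        CONTROLNET_ID, torch_dtype=torch.float16
    )
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        base_model_id, controlnet=controlnet, torch_dtype=torch.float16
    )

    # Swap in the UniPC scheduler, reusing the existing scheduler config.
    pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)

    if use_xformers:
        # Requires the optional xformers package.
        pipe.enable_xformers_memory_efficient_attention()

    # Moves submodules to the GPU only while they are needed (needs accelerate).
    pipe.enable_model_cpu_offload()
    return pipe


def generate(pipe, prompt, edge_map, steps=20):
    """Generate one image guided by a white-on-black edge map (PIL image)."""
    return pipe(prompt, image=edge_map, num_inference_steps=steps).images[0]
```

A typical call would be `generate(build_pipeline(), "a prompt", edge_map)`, where `edge_map` is a Canny edge image prepared beforehand. The step count of 20 is an illustrative default, not a value taken from this page.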

Core Capabilities

  • Edge-guided image generation
  • Precise structural control over output images
  • Support for both high and low threshold edge detection
  • Flexible integration with existing Stable Diffusion workflows

Frequently Asked Questions

Q: What makes this model unique?

This model's distinguishing feature is edge-based conditioning: users supply an edge map, and the generated image follows the structural outlines it defines. It was trained on 3M edge-image pairs.

Q: What are the recommended use cases?

The model is ideal for applications requiring precise control over image structure, such as architectural visualization, character design, and artistic recreation with structural guidance. It's particularly useful when you need to maintain specific edge-defined shapes in your generated images.
