# ControlNet for Waifu Diffusion 1.5
| Property | Value |
|---|---|
| License | OpenRAIL |
| Base Model | Waifu Diffusion 1.5 Beta2 |
| Training Type | Differenced ControlNet |
## What is ControlNet?
ControlNet is a neural-network architecture that conditions a pretrained diffusion model on spatial inputs such as edge maps, depth maps, and pose skeletons. This implementation is designed specifically for anime-style image generation, built on top of the Waifu Diffusion 1.5 Beta2 model. It provides three distinct control mechanisms: edge detection, depth perception, and pose control, each trained on a specialized dataset.
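Below is a minimal loading sketch using the diffusers library. The ControlNet checkpoint path is a placeholder rather than a published repository id, and the base-model id is assumed from the model name; substitute the actual locations for this release.

```python
# Minimal loading sketch (diffusers). "path/to/wd15-controlnet-canny" is a
# placeholder, not a published repo id; swap in the actual checkpoint.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

controlnet = ControlNetModel.from_pretrained(
    "path/to/wd15-controlnet-canny",   # placeholder: canny checkpoint
    torch_dtype=torch.float16,
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "waifu-diffusion/wd-1-5-beta2",    # assumed id of the base model this ControlNet targets
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")
```

The depth and pose variants load the same way; only the ControlNet checkpoint changes.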
## Implementation Details
The model consists of three components, each trained with different parameters: a canny edge detector trained on 71k images across multiple epochs, a depth model trained on 71k images, and a pose model initially trained on 14k Umamusume images and then fine-tuned on 52k general images.
- Training used bfloat16 automatic mixed precision (AMP) for all components; a schematic training step is sketched after this list
- Batch sizes of 8-16 were used across the different training phases
- Learning rates varied between 5e-5 and 1e-5 across training phases
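As a concrete illustration of that setup, here is a schematic PyTorch training step under bfloat16 autocast. `controlnet`, `unet`, `dataloader`, and `compute_loss` are hypothetical stand-ins for the actual training code, which is not part of this release; only the AMP and optimizer configuration mirrors the notes above.

```python
# Schematic bfloat16-AMP training step (PyTorch). All model/data names are
# hypothetical stand-ins; only the autocast/optimizer setup follows the notes.
import torch

optimizer = torch.optim.AdamW(controlnet.parameters(), lr=5e-5)  # within the 5e-5 to 1e-5 range

for batch in dataloader:                       # batches of 8-16 images, per the notes above
    optimizer.zero_grad()
    with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
        loss = compute_loss(controlnet, unet, batch)  # hypothetical diffusion-loss helper
    loss.backward()                            # bfloat16 keeps float32's exponent range,
    optimizer.step()                           # so no GradScaler is needed
```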
## Core Capabilities
- Edge Detection: Precise control over image outlines and boundaries from canny edge maps (see the preprocessing sketch after this list)
- Depth Perception: Accurate handling of spatial relationships from depth maps
- Pose Control: Specialized control over character poses, with particular strength in anime-style characters
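For the edge-detection branch, the conditioning image is a canny edge map. The OpenCV sketch below shows one common way to produce it; the 100/200 thresholds are typical starting values, not tuned recommendations from this release.

```python
# Produce a 3-channel canny edge map to use as the ControlNet conditioning image.
import cv2
import numpy as np
from PIL import Image

image = np.array(Image.open("input.png").convert("RGB"))
edges = cv2.Canny(image, 100, 200)        # low/high hysteresis thresholds (typical defaults)
edges = np.stack([edges] * 3, axis=-1)    # replicate single channel to RGB
conditioning = Image.fromarray(edges)     # pass as the pipeline's `image` argument
```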
## Frequently Asked Questions
**Q: What makes this model unique?**

A: This model is specifically optimized for anime-style image generation, with dedicated training on anime character datasets. It offers three distinct control mechanisms in a single package, making it versatile for a range of anime art generation tasks.
**Q: What are the recommended use cases?**

A: The model is ideal for generating anime-style images with precise control over edge details, depth, and character poses. It is particularly well-suited to character illustrations with specific pose requirements or detailed edge work.