3DKX_1.0b
| Property | Value |
|---|---|
| Author | unvailai |
| Dataset Size | 140-180 images |
| Model Type | Latent Diffusion |
| Base Architecture | UNet with Cross-Attention |
What is 3DKX_1.0b?
3DKX_1.0b is a latent diffusion model specialized for generating high-quality 3D renders in two styles. It ships with an embedded VAE and produces either realistic 3D renders or cartoon-style images, depending on the prefix used in the prompt.
Implementation Details
The model is built on a Latent Diffusion architecture. Its UNet backbone has 320 model channels and 8 attention heads, and uses spatial transformer blocks with a transformer depth of 1 and a context dimension of 768. The first stage is an AutoencoderKL with an embedding dimension of 4, and conditioning is provided by a frozen CLIP embedder.
- Image size: 64x64 (in latent space)
- Attention resolutions: [4, 2, 1]
- Channel multipliers: [1, 2, 4, 4]
- Learning rate: 1.0e-04
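To make the numbers above concrete, here is a minimal sketch that derives the per-level channel widths from the model channels and channel multipliers. It rests on two assumptions that the card does not state: that attention resolutions are expressed as downsample rates with level `i` running at rate `2**i` (the convention in the CompVis latent diffusion code), and that the VAE downsamples by a factor of 8 (as in Stable Diffusion v1.x configs). The helper names are hypothetical.

```python
# Values quoted from the implementation details above.
MODEL_CHANNELS = 320
CHANNEL_MULT = [1, 2, 4, 4]
ATTENTION_RESOLUTIONS = [4, 2, 1]
LATENT_SIZE = 64

# Assumption (not stated in the card): the VAE downsamples by 8x.
ASSUMED_VAE_FACTOR = 8


def level_channels(model_channels, channel_mult):
    """Channel width at each UNet level: base channels times the multiplier."""
    return [model_channels * m for m in channel_mult]


def attention_levels(channel_mult, attention_resolutions):
    """Levels that get attention, assuming level i runs at downsample rate 2**i."""
    return [
        level
        for level in range(len(channel_mult))
        if 2 ** level in attention_resolutions
    ]


widths = level_channels(MODEL_CHANNELS, CHANNEL_MULT)
attn = attention_levels(CHANNEL_MULT, ATTENTION_RESOLUTIONS)
pixel_size = LATENT_SIZE * ASSUMED_VAE_FACTOR

print(widths)      # [320, 640, 1280, 1280]
print(attn)        # [0, 1, 2]
print(pixel_size)  # 512
```

Under these assumptions, the three shallowest levels carry attention and a 64x64 latent corresponds to a 512x512 output image.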
Core Capabilities
- High-quality SFW portraits and full body poses
- Versatile landscape, cyberpunk, steampunk, and sci-fi generation
- Dual-style rendering (realistic vs cartoon)
- Varied body types and ethnicities
- Limited suggestive NSFW capabilities
Frequently Asked Questions
Q: What makes this model unique?
What sets the model apart is its ability to switch between realistic 3D renders and cartoon styles through the prompt prefix: "3d render of" for realistic output and "3d cartoon of" for cartoon output. It also includes an embedded VAE, so no external VAE model is needed.
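The prefix-based style switch can be sketched as a small prompt helper. Only the two prefix strings come from the model description; the function name `build_prompt`, the style keys, and the example subject are hypothetical choices for illustration.

```python
# Prefixes quoted from the model description; the keys are illustrative labels.
STYLE_PREFIXES = {
    "realistic": "3d render of",
    "cartoon": "3d cartoon of",
}


def build_prompt(subject, style="realistic"):
    """Prepend the style prefix to a subject description."""
    if style not in STYLE_PREFIXES:
        raise ValueError(f"unknown style: {style!r}")
    return f"{STYLE_PREFIXES[style]} {subject.strip()}"


print(build_prompt("a steampunk airship over a city"))
# 3d render of a steampunk airship over a city
print(build_prompt("a steampunk airship over a city", style="cartoon"))
# 3d cartoon of a steampunk airship over a city
```

Keeping the prefixes in one mapping means the same subject text can be rendered in either style by changing a single argument.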
Q: What are the recommended use cases?
The model excels at portraits, full-body poses, landscapes, and sci-fi artwork. It is particularly effective for 3D character renders and environmental scenes, and it handles certain game characters, such as 2B from Nier Automata, especially well.