marigold-normals-v0-1

Maintained by: prs-eth

Marigold Normals v0-1

Property | Value
License | Apache License 2.0
Model Type | Generative Latent Diffusion
Resolution | 768px (optimal)
Authors | Bingxin Ke, et al. (PRS-ETH)

What is marigold-normals-v0-1?

Marigold Normals v0-1 is a model for monocular surface normal estimation: it predicts a surface normal map from a single image. It is fine-tuned from stable-diffusion-2 and has been superseded by v1-1. The normal map it generates describes surface orientation at each pixel, which conveys the geometry of the objects in the image.

Implementation Details

The model is designed for use with the DDIM scheduler and 10 to 50 denoising steps. It can process images of various sizes, but it performs best when the input is resized so that its longer side is 768 pixels, a constraint inherited from its base diffusion model.
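
The resizing recommendation above amounts to scaling the image so that its longer side becomes 768 pixels while keeping the aspect ratio. A minimal sketch of that arithmetic follows; the function name `resize_to_longer_side` is hypothetical and not part of any Marigold API.

```python
from typing import Tuple


def resize_to_longer_side(width: int, height: int, target: int = 768) -> Tuple[int, int]:
    """Return (new_width, new_height) with the longer side equal to `target`.

    The aspect ratio is preserved up to rounding to whole pixels.
    """
    if width <= 0 or height <= 0 or target <= 0:
        raise ValueError("width, height, and target must be positive")
    longer = max(width, height)
    # Scale both sides by target / longer; never return a zero-sized side.
    new_width = max(1, round(width * target / longer))
    new_height = max(1, round(height * target / longer))
    return new_width, new_height
```

For example, a 1920x1080 input would be resized to 768x432 before being passed to the model, and the predicted normal map can be resized back to the original resolution afterwards if needed.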

  • Outputs a 3-dimensional unit vector per pixel, in screen-space camera coordinates
  • Supports uncertainty map generation through ensemble predictions
  • Built on stable-diffusion-2 architecture
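
The unit-vector output and the ensemble-based uncertainty listed above can be illustrated for a single pixel. The sketch below assumes one plausible way to combine ensemble members: average the unit normals, renormalize the result, and report the mean angular deviation from that average, scaled to the range [0, 1]. This is an assumption made for illustration, not the model's documented procedure, and the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def normalize(v: Sequence[float]) -> Vec3:
    """Scale a 3-vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return (v[0] / norm, v[1] / norm, v[2] / norm)


def ensemble_normal(predictions: Sequence[Sequence[float]]) -> Tuple[Vec3, float]:
    """Combine several normal predictions for one pixel.

    Returns the renormalized mean normal and an uncertainty in [0, 1]:
    the mean angle between each prediction and the mean normal, divided by pi.
    Raises ValueError if the predictions cancel out to a zero vector.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    units = [normalize(p) for p in predictions]
    mean = normalize([sum(u[i] for u in units) / len(units) for i in range(3)])
    angles = []
    for u in units:
        cosine = sum(a * b for a, b in zip(u, mean))
        # Clamp to guard against rounding slightly outside [-1, 1].
        angles.append(math.acos(max(-1.0, min(1.0, cosine))))
    return mean, sum(angles) / len(angles) / math.pi
```

With this definition, identical predictions give an uncertainty of 0, and the value grows as the ensemble members disagree more about the surface orientation.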

Core Capabilities

  • Single-image normal map estimation
  • Ensemble-based uncertainty quantification
  • Efficient processing with the DDIM scheduler (10 to 50 denoising steps)
  • Interactive demo available via Hugging Face Spaces

Frequently Asked Questions

Q: What makes this model unique?

The model repurposes a diffusion-based image generator (stable-diffusion-2) for normal estimation, inferring 3D surface geometry from a single 2D image.

Q: What are the recommended use cases?

The model suits tasks that need surface normals from a single image, such as 3D reconstruction, augmented reality applications, and computer vision research. Note that this version is deprecated; for production use, consider the newer v1-1 version instead.
