DUSt3R_ViTLarge_BaseDecoder_512_dpt

naver

DUSt3R is a geometric 3D vision model developed by NAVER. This variant has 571M parameters and pairs a ViT-Large encoder with a ViT-Base decoder for image-to-3D tasks.

Property          Value
Parameter Count   571M
Model Type        Image-to-3D
Architecture      ViT-Large encoder with ViT-Base decoder
License           CC BY-NC-SA 4.0
Paper             arXiv:2312.14132
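
As a rough guide to what the parameter count means in practice, the sketch below estimates the memory needed just to hold the weights. It assumes 4 bytes per parameter (32-bit floats) and uses the rounded 571M figure; the function name weight_memory_gib is hypothetical, and activations, buffers, and framework overhead are not counted.

```python
# Rough weight-memory estimate for a 571M-parameter model.
# Assumption: every parameter is stored as a 32-bit float (4 bytes).
PARAM_COUNT = 571_000_000  # rounded figure from the property table


def weight_memory_gib(num_params: int, bytes_per_param: int = 4) -> float:
    """Return the memory needed to hold the weights alone, in GiB."""
    if num_params < 0 or bytes_per_param <= 0:
        raise ValueError("num_params must be >= 0 and bytes_per_param > 0")
    return num_params * bytes_per_param / 2**30


if __name__ == "__main__":
    print(f"{weight_memory_gib(PARAM_COUNT):.2f} GiB")  # 2.13 GiB
```

Under these assumptions the weights alone take a little over 2 GiB, so actual inference needs noticeably more than that once activations are included.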

What is DUSt3R_ViTLarge_BaseDecoder_512_dpt?

DUSt3R is a geometric 3D vision model developed by NAVER Labs. This variant pairs a ViT-Large encoder with a ViT-Base decoder and a DPT (Dense Prediction Transformer) head, and it targets inputs whose longer side is 512 px.

Implementation Details

The model was trained at multiple resolutions (512x384, 512x336, 512x288, 512x256, 512x160) and uses an asymmetric encoder-decoder design, implemented as AsymmetricCroCo3DStereo. It is built on PyTorch, and its weights are stored as F32 tensors.

  • DPT head for dense predictions
  • Asymmetric design combining a ViT-Large encoder with a ViT-Base decoder
  • Training at multiple resolutions
  • PyTorch-based implementation with weights available in safetensors format
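
To make the multi-resolution point concrete, here is a minimal sketch of one way to choose which listed training resolution best matches an arbitrary input image. It is an illustration only, not the library's own preprocessing: it assumes the resolutions are written as width x height, treats a portrait input by swapping the two sides, and the function name closest_training_resolution is hypothetical.

```python
from typing import Tuple

# Training resolutions listed for this model, read as (width, height).
TRAIN_RESOLUTIONS = [(512, 384), (512, 336), (512, 288), (512, 256), (512, 160)]


def closest_training_resolution(width: int, height: int) -> Tuple[int, int]:
    """Pick the listed resolution whose aspect ratio is nearest to the input's."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    portrait = height > width
    # Compare aspect ratios as long side / short side so portrait inputs work too.
    target = max(width, height) / min(width, height)
    best_w, best_h = min(
        TRAIN_RESOLUTIONS, key=lambda wh: abs(wh[0] / wh[1] - target)
    )
    # For portrait inputs, return the same resolution with the sides swapped.
    return (best_h, best_w) if portrait else (best_w, best_h)


if __name__ == "__main__":
    print(closest_training_resolution(1920, 1080))  # (512, 288)
    print(closest_training_resolution(640, 480))    # (512, 384)
```

A 16:9 frame maps to 512x288 and a 4:3 frame to 512x384, which matches the aspect ratios of those two entries exactly.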

Core Capabilities

  • 3D geometric vision processing from images
  • Handling of inputs at several aspect ratios, with a longer side of 512 px
  • Dense, per-pixel predictions through the DPT head
  • A lighter ViT-Base decoder to balance accuracy against compute cost

Frequently Asked Questions

Q: What makes this model unique?

Its distinguishing feature is the asymmetric design: a ViT-Large encoder paired with a smaller ViT-Base decoder, tuned for 512 px inputs. The smaller decoder keeps compute below what a symmetric ViT-Large design would need, while the model remains aimed at geometric 3D vision tasks.

Q: What are the recommended use cases?

The model suits applications that need geometric 3D vision, including 3D reconstruction, depth estimation, and stereo vision. It is best matched to inputs whose longer side is up to 512 px.
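
For the depth estimation use case, the following sketch shows how a depth map could be read off a dense 3D prediction. It assumes the prediction is available as a per-pixel grid of (x, y, z) points expressed in the camera's own frame with z pointing forward; that representation, the toy data, and the function name depth_from_pointmap are assumptions of this sketch and are not described in the text above.

```python
from typing import List, Sequence


def depth_from_pointmap(
    pointmap: Sequence[Sequence[Sequence[float]]],
) -> List[List[float]]:
    """Return the z coordinate of every pixel's 3D point as a depth map."""
    return [[float(point[2]) for point in row] for row in pointmap]


# Hypothetical 2x2 grid of (x, y, z) points in arbitrary units.
toy_pointmap = [
    [(0.0, 0.0, 1.5), (0.1, 0.0, 1.6)],
    [(0.0, 0.1, 2.0), (0.1, 0.1, 2.2)],
]

if __name__ == "__main__":
    print(depth_from_pointmap(toy_pointmap))  # [[1.5, 1.6], [2.0, 2.2]]
```

With real model output the same idea would apply to a tensor rather than nested lists, typically by slicing the last axis.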
