OpenLRM-mix-base-1.1

Property	Value
Parameter Count	260M
License	CC-BY-NC-4.0
Research Paper	LRM Paper
Training Data	Objaverse + MVImgNet
Input Resolution	336x336

What is OpenLRM-mix-base-1.1?

OpenLRM-mix-base-1.1 is an open-source implementation of the LRM (Large Reconstruction Model) architecture for reconstructing a 3D object from a single image. The base variant trades reconstruction quality against compute cost, using 12 transformer layers with 768-dimensional features and 12 attention heads.

Implementation Details

The model combines a DINOv2 image encoder with a triplane decoder. It takes input images at 336x336 resolution and renders at 288x288 resolution in 96x96 patches, with 96 samples per ray.

12 transformer layers with 768-dimensional feature space
48-dimensional triplane representation
DINOv2 vision transformer (ViT-B/14) with register tokens as image encoder
4-layer triplane decoder
Random background colors during training for improved generalization
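
The figures above imply a few derived sizes that are useful when budgeting memory. The sketch below is illustrative only: the BaseConfig class and its field names are hypothetical and not taken from the OpenLRM codebase. It assumes the encoder patch size is 14 pixels (the "/14" in ViT-B/14) and that each 96x96 render patch casts one ray per pixel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaseConfig:
    """Hypothetical container for the hyperparameters listed above."""
    num_layers: int = 12
    feat_dim: int = 768
    num_heads: int = 12
    triplane_dim: int = 48
    input_res: int = 336
    encoder_patch: int = 14   # assumption: patch size implied by ViT-B/14
    render_res: int = 288
    render_patch: int = 96
    ray_samples: int = 96

    @property
    def head_dim(self) -> int:
        # Feature channels handled by each attention head.
        return self.feat_dim // self.num_heads

    @property
    def image_patch_tokens(self) -> int:
        # Patch tokens only; class and register tokens are not counted.
        side = self.input_res // self.encoder_patch
        return side * side

    @property
    def rays_per_patch(self) -> int:
        # Assumption: one ray per pixel of the rendered patch.
        return self.render_patch ** 2

    @property
    def samples_per_patch(self) -> int:
        # Points evaluated along all rays of one patch.
        return self.rays_per_patch * self.ray_samples


cfg = BaseConfig()
print(cfg.head_dim)            # 64
print(cfg.image_patch_tokens)  # 576
print(cfg.rays_per_patch)      # 9216
print(cfg.samples_per_patch)   # 884736
```

Under these assumptions, one rendered patch requires roughly 0.88 million point evaluations, which suggests why rendering is described in patches rather than as full 288x288 frames.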

Core Capabilities

Single-image 3D object reconstruction
Processing of 336x336 input images
Mid-sized configuration suited to research experiments
Support for diverse 3D object generation tasks

Frequently Asked Questions

Q: What makes this model unique?

This model is the base-size implementation of the LRM architecture. It departs from the original paper in two ways: it uses a DINOv2 image encoder, and it is trained with random background colors. It targets research use while keeping compute requirements moderate.
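
As a rough illustration of what random background colors can mean in practice, the sketch below alpha-composites a foreground pixel over a randomly drawn color. It is not OpenLRM's training code: the function names are hypothetical, and drawing one uniform RGB color is an assumed sampling scheme.

```python
import random


def composite_over(fg_rgb, alpha, bg_rgb):
    """Alpha-composite one foreground pixel over a background color.

    Channel values and alpha are floats in [0, 1].
    """
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg_rgb, bg_rgb))


def random_background(rng):
    """Draw one uniform random RGB color (assumed sampling scheme)."""
    return (rng.random(), rng.random(), rng.random())


rng = random.Random(0)  # seeded so the sketch is repeatable
bg = random_background(rng)

# Opaque pixels keep the object color; transparent pixels take the background.
opaque = composite_over((0.2, 0.4, 0.6), 1.0, bg)
transparent = composite_over((0.2, 0.4, 0.6), 0.0, bg)
```

Because the background changes from sample to sample, a model trained this way cannot rely on a fixed background color to separate the object from its surroundings.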

Q: What are the recommended use cases?

The model is best suited to research on 3D object reconstruction from single images. Its CC-BY-NC-4.0 license restricts it to non-commercial use, so it fits academic and other non-commercial work.