colsmolvlm-alpha

vidore

A visual language model for efficient document retrieval that combines SmolVLM with the ColBERT strategy to produce multi-vector representations of text and images.

Property       Value
Base Model     vidore/ColSmolVLM-base
License        Apache 2.0 (backbone) / MIT (adapters)
Paper          ColPali: Efficient Document Retrieval with Vision Language Models
Training Data  127,460 query-page pairs

What is colsmolvlm-alpha?

ColSmolVLM-alpha is a visual language model designed for efficient document retrieval. It extends SmolVLM with ColBERT-style multi-vector representations for both text and images. This version was trained for 3 epochs with a batch size of 128, using parameter-efficient fine-tuning (PEFT) with LoRA adapters.

Implementation Details

The model is trained in bfloat16 and applies LoRA adapters with alpha=32 and r=32 to the transformer layers. Training used the paged_adamw_8bit optimizer on a 4-GPU setup with data parallelism, with a peak learning rate of 5e-4, linear decay, and warmup over 2.5% of the training steps.
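To make the learning-rate description concrete, here is a minimal sketch of a linear-warmup, linear-decay schedule with a peak of 5e-4 and 2.5% warmup. The function name and the total of 1,000 steps are hypothetical, chosen only for illustration; the source does not state the actual step count or the exact scheduler implementation.

```python
def linear_warmup_decay_lr(step, total_steps, peak_lr=5e-4, warmup_fraction=0.025):
    """Learning rate at a 0-indexed optimizer step.

    Assumed shape (not confirmed by the source): ramp linearly from 0 to
    peak_lr over the warmup steps, then decay linearly to 0 at total_steps.
    """
    warmup_steps = max(1, round(total_steps * warmup_fraction))
    if step < warmup_steps:
        # Warmup phase: fraction of the peak proportional to progress.
        return peak_lr * step / warmup_steps
    # Decay phase: fraction of the peak proportional to remaining steps.
    decay_steps = max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, (total_steps - step) / decay_steps)


# Hypothetical run length, used only to illustrate the shape of the curve.
TOTAL_STEPS = 1000
lr_start = linear_warmup_decay_lr(0, TOTAL_STEPS)
lr_mid_warmup = linear_warmup_decay_lr(10, TOTAL_STEPS)
lr_peak = linear_warmup_decay_lr(25, TOTAL_STEPS)
lr_end = linear_warmup_decay_lr(TOTAL_STEPS, TOTAL_STEPS)
```

With 1,000 steps, warmup covers the first 25 steps, the rate peaks at step 25, and it reaches zero at the final step.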

  • Trained on 127,460 query-page pairs (63% academic datasets, 37% synthetic data)
  • Uses FlashAttention-2 for faster, more memory-efficient attention
  • Implements the ColBERT late-interaction mechanism
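The late-interaction mechanism mentioned above can be sketched in a few lines. In ColBERT-style scoring (often called MaxSim), each query vector is compared against every page vector, the best match per query vector is kept, and those maxima are summed. The function name and the toy 2-dimensional vectors below are illustrative assumptions; real embeddings would come from the model and have many more dimensions.

```python
def maxsim_score(query_vecs, page_vecs):
    """ColBERT-style late-interaction score between one query and one page.

    Both arguments are lists of equal-length vectors. Vectors are assumed
    to be L2-normalized, so the dot product acts as cosine similarity.
    """
    total = 0.0
    for q in query_vecs:
        # Best-matching page vector for this query vector.
        best = max(sum(a * b for a, b in zip(q, p)) for p in page_vecs)
        total += best
    return total


# Toy unit vectors standing in for model embeddings (hypothetical values).
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[1.0, 0.0], [0.0, 1.0]]  # covers both query vectors
page_b = [[1.0, 0.0], [1.0, 0.0]]  # covers only the first query vector

score_a = maxsim_score(query, page_a)
score_b = maxsim_score(query, page_b)
```

Here page_a scores higher than page_b because every query vector finds a close match in it, which is the behavior that lets multi-vector retrieval reward pages covering all parts of a query.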

Core Capabilities

  • Multi-vector representation generation for both text and images
  • Efficient document indexing from visual features
  • Zero-shot generalization to non-English languages
  • PDF document processing and retrieval
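As a rough illustration of how indexing and retrieval fit together, the sketch below stores one list of vectors per page and ranks pages for a query by late-interaction score. The page identifiers, the in-memory dictionary index, and the function names are all hypothetical; a production system would obtain embeddings from the model and would likely use a dedicated vector store.

```python
def maxsim(query_vecs, page_vecs):
    """Sum over query vectors of the best dot product with any page vector."""
    return sum(
        max(sum(a * b for a, b in zip(q, p)) for p in page_vecs)
        for q in query_vecs
    )


def rank_pages(query_vecs, index, top_k=3):
    """Return the top_k (page_id, score) pairs, highest score first.

    `index` maps a page identifier to that page's list of embedding vectors.
    """
    scored = [(page_id, maxsim(query_vecs, vecs)) for page_id, vecs in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy index with hypothetical page ids and 2-dimensional stand-in embeddings.
index = {
    "page-1": [[1.0, 0.0], [0.0, 1.0]],
    "page-2": [[1.0, 0.0]],
    "page-3": [[0.0, 1.0]],
}
query = [[1.0, 0.0], [0.6, 0.8]]

top_pages = rank_pages(query, index, top_k=2)
top_ids = [page_id for page_id, _ in top_pages]
```

Page embeddings are computed once at indexing time, so only the query needs to be embedded when a search is run.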

Frequently Asked Questions

Q: What makes this model unique?

The model pairs the compact SmolVLM backbone with ColBERT's multi-vector representation strategy. Queries and pages are each represented by multiple vectors rather than a single one, which allows finer-grained matching during retrieval while keeping the model small.

Q: What are the recommended use cases?

The model is particularly suited to PDF document retrieval, especially in academic and professional settings where precise document matching matters. It is designed to handle both the textual and the visual elements of a page.
