EVA-CLIP

QuanSun

EVA-CLIP is a high-performance vision-language model series with state-of-the-art zero-shot classification capabilities, trained on large-scale datasets including LAION-400M and Merged-2B.

Property    Value
License     MIT
Paper       arXiv:2303.15389
Author      QuanSun

What is EVA-CLIP?

EVA-CLIP is a series of vision-language models that achieve state-of-the-art performance on zero-shot classification tasks. The model family spans several sizes, from the compact EVA02_CLIP_B_psz16_s8B (149M parameters) to the largest variant, EVA02_CLIP_E_psz14_plus_s9B (5.0B parameters).

Implementation Details

Models in the EVA-CLIP series were trained in either fp16 or bf16 precision, depending on the variant, on large datasets including LAION-400M, LAION-2B, and a custom Merged-2B dataset. Training used between 64 and 256 A100 GPUs depending on the model variant.

  • Multiple architecture variants available (Base, Large, and Enormous)
  • Training batch sizes ranging from 41K to 144K
  • Interpolation of patch-embedding and position-embedding weights, so that pretrained weights can be reused when the patch size changes
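
Interpolation is needed because the number of patches depends on the patch size: assuming a 224-pixel input, a patch size of 14 yields a 16x16 grid of position embeddings, while a patch size of 16 yields a 14x14 grid. The sketch below illustrates the general idea by bilinearly resampling a 2D grid of embedding vectors in plain Python. It is not the EVA-CLIP code; the function name `resize_pos_embed`, the choice of bilinear interpolation, and the toy values are assumptions made for illustration.

```python
def resize_pos_embed(grid, new_h, new_w):
    """Bilinearly resample a 2D grid of embedding vectors (illustrative only).

    grid: list of rows; each row is a list of vectors (lists of floats).
    Returns a new grid with new_h rows and new_w columns.
    """
    old_h, old_w = len(grid), len(grid[0])
    dim = len(grid[0][0])

    def src_coord(i, new_n, old_n):
        # Map an output index to a source coordinate so that corners align.
        if new_n == 1:
            return 0.0
        return i * (old_n - 1) / (new_n - 1)

    out = []
    for y in range(new_h):
        fy = src_coord(y, new_h, old_h)
        y0 = int(fy)
        y1 = min(y0 + 1, old_h - 1)
        wy = fy - y0
        row = []
        for x in range(new_w):
            fx = src_coord(x, new_w, old_w)
            x0 = int(fx)
            x1 = min(x0 + 1, old_w - 1)
            wx = fx - x0
            vec = []
            for d in range(dim):
                # Blend horizontally along the two source rows, then vertically.
                top = grid[y0][x0][d] * (1 - wx) + grid[y0][x1][d] * wx
                bottom = grid[y1][x0][d] * (1 - wx) + grid[y1][x1][d] * wx
                vec.append(top * (1 - wy) + bottom * wy)
            row.append(vec)
        out.append(row)
    return out


# Toy example: a 2x2 grid of 1-dimensional embeddings resized to 3x3.
small = [[[0.0], [2.0]],
         [[4.0], [6.0]]]
large = resize_pos_embed(small, 3, 3)
```

In this toy case the corner values are preserved and the new centre cell is the average of the four original cells. A real checkpoint would apply the same kind of resampling to high-dimensional embeddings, typically with a tensor library rather than nested lists.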

Core Capabilities

  • State-of-the-art zero-shot classification performance on ImageNet (up to 82.0% top-1)
  • Strong MSCOCO text-to-image retrieval (up to 75.0% R@5)
  • Scalable architecture supporting various model sizes for different requirements
  • Efficient training through a masked image modeling (MIM) teacher-student framework
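
The two headline metrics above, top-1 accuracy and recall at 5 (R@5), can both be computed from a matrix of similarity scores. The following is a minimal sketch in plain Python, assuming one correct class per image and one matching image per text query; the function names and toy scores are hypothetical and are not taken from the EVA-CLIP evaluation code.

```python
def top1_accuracy(scores, labels):
    """scores[i][c] is the similarity of image i to class c;
    labels[i] is the index of the correct class for image i."""
    correct = 0
    for row, label in zip(scores, labels):
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += predicted == label
    return correct / len(labels)


def recall_at_k(scores, relevant, k):
    """scores[q][j] is the similarity of text query q to image j;
    relevant[q] is the index of the image that matches query q."""
    hits = 0
    for row, target in zip(scores, relevant):
        # Rank candidate images from most to least similar.
        ranked = sorted(range(len(row)), key=row.__getitem__, reverse=True)
        hits += target in ranked[:k]
    return hits / len(relevant)


# Toy similarity matrix: 3 queries (rows) against 3 candidates (columns).
scores = [
    [0.9, 0.1, 0.3],
    [0.2, 0.5, 0.7],
    [0.4, 0.8, 0.6],
]
targets = [0, 2, 2]

acc = top1_accuracy(scores, targets)      # 2 of 3 rows rank the target first
r_at_2 = recall_at_k(scores, targets, 2)  # every target is within the top 2
```

Recall at k is never lower than top-1 accuracy on the same scores, because a target that ranks first also ranks within the top k.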

Frequently Asked Questions

Q: What makes this model unique?

According to its authors, the EVA-CLIP series was, at the time of release, the most performant family of open-source CLIP models at every scale, with particularly strong zero-shot classification results on mainstream benchmarks such as ImageNet and its variants.

Q: What are the recommended use cases?

The model is particularly well-suited for zero-shot image classification, text-to-image retrieval, and general vision-language tasks. Different model sizes allow for deployment in various scenarios, from resource-constrained environments to high-performance requirements.
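
As a rough illustration of how zero-shot classification works with any CLIP-style model, the sketch below compares one image embedding against one text embedding per class name and turns the cosine similarities into probabilities. The embeddings here are hand-written toy vectors rather than outputs of an EVA-CLIP model, and the function names and the temperature of 100 are assumptions made for the example.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def zero_shot_classify(image_embedding, text_embeddings, temperature=100.0):
    """Return one probability per class from scaled cosine similarities."""
    img = normalize(image_embedding)
    logits = []
    for text in text_embeddings:
        txt = normalize(text)
        logits.append(temperature * sum(a * b for a, b in zip(img, txt)))
    # Numerically stable softmax over the class logits.
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy data: in practice each text embedding would come from encoding a prompt
# for that class, and the image embedding from the vision encoder.
class_names = ["cat", "dog", "car"]
text_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
image_embedding = [0.2, 0.9, 0.1]

probs = zero_shot_classify(image_embedding, text_embeddings)
best = class_names[probs.index(max(probs))]
```

Because no classifier is trained, changing the label set only requires encoding a new list of class names. Text-to-image retrieval uses the same similarity computation with the roles reversed: one text embedding is scored against many image embeddings.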
