e5-large

intfloat

E5-large is a 335M parameter English text embedding model trained via weakly-supervised contrastive learning, optimized for semantic similarity and retrieval tasks.

  • Parameter Count: 335M
  • Architecture: 24-layer Transformer with 1024d embeddings
  • License: MIT
  • Paper: Text Embeddings by Weakly-Supervised Contrastive Pre-training

What is e5-large?

E5-large is a text embedding model designed for semantic search and similarity tasks. Trained via weakly-supervised contrastive pre-training, it produces 1024-dimensional embeddings for English text. Every input must be prefixed with "query: " or "passage: ", and the model can process sequences of up to 512 tokens.

Implementation Details

The model utilizes a 24-layer Transformer architecture and implements contrastive learning with a low temperature of 0.01 for the InfoNCE loss. It supports both PyTorch and Sentence-Transformers frameworks, making it versatile for different application scenarios.
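The contrastive objective mentioned above can be sketched as a standard InfoNCE loss with in-batch negatives and the 0.01 temperature (an illustrative implementation with synthetic tensors, not the model's actual training code):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

def info_nce_loss(query_emb, passage_emb, temperature=0.01):
    # Cosine similarity between every query and every passage in the batch.
    q = F.normalize(query_emb, dim=-1)
    p = F.normalize(passage_emb, dim=-1)
    logits = q @ p.T / temperature
    # In-batch negatives: the positive for query i is passage i.
    labels = torch.arange(q.size(0))
    return F.cross_entropy(logits, labels)

# Synthetic 1024-d embeddings: each "passage" is a near-duplicate of its
# query, so the loss should be close to zero.
queries = torch.randn(8, 1024)
passages = queries + 0.01 * torch.randn(8, 1024)
loss = info_nce_loss(queries, passages)
```

The low temperature sharpens the softmax, so even small cosine-similarity gaps between the positive and the in-batch negatives produce a strong training signal.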

  • Optimized for both symmetric (semantic similarity) and asymmetric (retrieval) tasks
  • Supports batch processing with automatic padding and truncation
  • Implements efficient average pooling for embedding generation
  • Achieves strong performance on BEIR and MTEB benchmarks
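The average-pooling step listed above can be sketched as follows (a minimal stand-alone version over synthetic tensors; in the real pipeline the inputs would be the model's last hidden states and the tokenizer's attention mask):

```python
import torch

def average_pool(last_hidden, attention_mask):
    # Zero out padded positions, then average over the real tokens only.
    mask = attention_mask.unsqueeze(-1).float()
    summed = (last_hidden * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1e-9)
    return summed / counts

# Synthetic "hidden states": batch of 2 sequences, 4 positions, 8 dims.
hidden = torch.ones(2, 4, 8)
hidden[0, 2:] = 100.0                # junk values in padded positions
mask = torch.tensor([[1, 1, 0, 0],   # sequence 0: 2 real tokens
                     [1, 1, 1, 1]])  # sequence 1: no padding
pooled = average_pool(hidden, mask)
```

Masking before summing ensures padding tokens never leak into the embedding, which is why batch processing with automatic padding is safe.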

Core Capabilities

  • Text Retrieval and Semantic Search
  • Semantic Similarity Assessment
  • Classification and Clustering
  • Passage Ranking and Reranking
  • Cross-document Similarity Analysis
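For the retrieval and ranking tasks above, scoring reduces to cosine similarity over L2-normalized embeddings. A synthetic sketch (random vectors stand in for real encoder outputs):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
# Pretend these came from the encoder: 1 query, 3 passages (1024-d).
query = F.normalize(torch.randn(1, 1024), dim=-1)
passages = F.normalize(torch.randn(3, 1024), dim=-1)
passages[0] = query[0]  # make passage 0 an exact match

scores = (query @ passages.T).squeeze(0)          # cosine similarities
ranking = torch.argsort(scores, descending=True)  # best passage first
```

Because the vectors are normalized, the dot product equals cosine similarity, so the same scores work for ranking, reranking, and clustering.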

Frequently Asked Questions

Q: What makes this model unique?

E5-large's distinguishing feature is its weakly-supervised contrastive pre-training on large-scale text pairs, which yields strong performance across a wide range of text similarity tasks while keeping inference efficient. Applying the "query:" and "passage:" prefixes consistently at inference time is essential to getting the best results for each use case.

Q: What are the recommended use cases?

The model excels at information retrieval, semantic search, and text similarity tasks. Prepend "query: " to both inputs for symmetric tasks such as semantic similarity, and use "query: " for queries and "passage: " for documents in asymmetric tasks such as passage retrieval. It is particularly effective for applications that need high-quality text embeddings for search or classification.
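A small helper can make these prefix rules explicit (illustrative only; `format_inputs` is a hypothetical name, not part of the model's API):

```python
def format_inputs(texts, task="retrieval", role="query"):
    """Prepend the prefix E5 expects on every input.

    Symmetric tasks (semantic similarity): "query: " on both sides.
    Asymmetric tasks (retrieval): "query: " for queries,
    "passage: " for documents.
    """
    if task == "similarity":
        prefix = "query: "
    else:
        prefix = "query: " if role == "query" else "passage: "
    return [prefix + t for t in texts]
```

Embedding unprefixed text is a common mistake that silently degrades retrieval quality, so centralizing the rule in one place is worthwhile.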
