# E5-large Text Embedding Model
| Property | Value |
|---|---|
| Parameter Count | 335M |
| Architecture | 24-layer Transformer with 1024-dimensional embeddings |
| License | MIT |
| Paper | Text Embeddings by Weakly-Supervised Contrastive Pre-training |
## What is E5-large?
E5-large is a powerful text embedding model designed for semantic search and similarity tasks. Developed through weakly-supervised contrastive pre-training, it generates high-quality 1024-dimensional embeddings for English text. The model requires specific prefix formatting ("query:" or "passage:") and can process sequences up to 512 tokens in length.
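As a quick orientation, here is a minimal sketch of encoding prefixed texts with Sentence-Transformers. It assumes the Hugging Face checkpoint id `intfloat/e5-large`; substitute whichever checkpoint you actually use.

```python
from sentence_transformers import SentenceTransformer

# Assumed checkpoint id; adjust if your copy of the model lives elsewhere.
model = SentenceTransformer("intfloat/e5-large")

# E5 expects a "query:" or "passage:" prefix on every input text.
texts = [
    "query: how much protein should a female eat",
    "passage: The recommended daily protein intake for adult women is about 46 grams.",
]

# Inputs longer than 512 tokens are truncated by the underlying tokenizer.
embeddings = model.encode(texts, normalize_embeddings=True)

# With unit-length vectors, cosine similarity reduces to a dot product.
print(embeddings[0] @ embeddings[1])
```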
## Implementation Details
The model uses a 24-layer Transformer architecture and was trained with a contrastive InfoNCE objective at a low temperature of 0.01. It can be used through both plain PyTorch (Hugging Face Transformers) and Sentence-Transformers, making it easy to integrate into different application stacks.
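To make that objective concrete, the following is a minimal sketch of an InfoNCE loss with in-batch negatives and the 0.01 temperature mentioned above. It assumes L2-normalized query and passage embeddings and is purely illustrative, not the authors' training code.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(query_emb: torch.Tensor,
                  passage_emb: torch.Tensor,
                  temperature: float = 0.01) -> torch.Tensor:
    """In-batch-negative InfoNCE: row i of query_emb pairs with row i of passage_emb."""
    # Similarity matrix between every query and every passage in the batch,
    # sharpened by the low temperature.
    logits = query_emb @ passage_emb.T / temperature
    # The matching passage for each query sits on the diagonal.
    labels = torch.arange(query_emb.size(0), device=query_emb.device)
    return F.cross_entropy(logits, labels)
```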
- Optimized for both symmetric (semantic similarity) and asymmetric (retrieval) tasks
- Supports batch processing with automatic padding and truncation
- Implements efficient average pooling for embedding generation (see the sketch after this list)
- Achieves strong performance on BEIR and MTEB benchmarks
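The snippet below sketches the masked average-pooling pattern with Hugging Face Transformers, close to the usage pattern typically shown for E5 models; the checkpoint id `intfloat/e5-large` is again an assumption.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

def average_pool(last_hidden_state: torch.Tensor,
                 attention_mask: torch.Tensor) -> torch.Tensor:
    # Zero out padding positions, then average over the real tokens only.
    hidden = last_hidden_state.masked_fill(~attention_mask[..., None].bool(), 0.0)
    return hidden.sum(dim=1) / attention_mask.sum(dim=1)[..., None]

tokenizer = AutoTokenizer.from_pretrained("intfloat/e5-large")  # assumed checkpoint id
model = AutoModel.from_pretrained("intfloat/e5-large")

texts = [
    "query: what is the capital of France",
    "passage: Paris is the capital and largest city of France.",
]

# Batched tokenization with padding and truncation to the 512-token limit.
batch = tokenizer(texts, max_length=512, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    outputs = model(**batch)

embeddings = average_pool(outputs.last_hidden_state, batch["attention_mask"])
embeddings = F.normalize(embeddings, p=2, dim=1)  # unit-length vectors for cosine scoring
print((embeddings[0] @ embeddings[1]).item())
```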
## Core Capabilities
- Text Retrieval and Semantic Search
- Semantic Similarity Assessment
- Classification and Clustering
- Passage Ranking and Reranking
- Cross-document Similarity Analysis
## Frequently Asked Questions
### Q: What makes this model unique?
E5-large's distinctive feature is its weakly-supervised contrastive pre-training, which yields strong performance across a wide range of text similarity tasks while keeping inference efficient. Because the model was trained with "query:" and "passage:" prefixes, supplying the right prefix for each input is important for getting the best results.
### Q: What are the recommended use cases?
The model excels at information retrieval, semantic search, and text similarity tasks. Use the "query:" prefix on both texts for symmetric tasks such as semantic similarity, and the "query:"/"passage:" prefixes for asymmetric tasks such as passage retrieval. It is particularly effective for applications that need high-quality text embeddings for search or classification.
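For the symmetric case, both sides take the "query:" prefix. A small sketch, again assuming the `intfloat/e5-large` checkpoint id:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("intfloat/e5-large")  # assumed checkpoint id

# Symmetric task: the same "query:" prefix on both sentences.
sentences = [
    "query: A man is playing a guitar on stage.",
    "query: Someone performs music with a guitar.",
]
emb = model.encode(sentences, normalize_embeddings=True)
print(util.cos_sim(emb[0], emb[1]).item())
```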