bge-large-en

BAAI

BGE Large English is an embedding model (335M parameters) optimized for semantic search and text similarity. It was reported as state-of-the-art on the MTEB benchmark at the time of its release.

Property: Value
Parameter Count: 335M
License: MIT
Paper: C-Pack: Packaged Resources To Advance General Chinese Embedding
Framework: PyTorch with Transformers

What is bge-large-en?

BGE-large-en is a state-of-the-art text embedding model developed by BAAI that maps text to dense vector representations. It achieves top performance on the MTEB benchmark, making it particularly effective for semantic search, similarity comparison, and retrieval tasks.

Implementation Details

The model utilizes a transformer-based architecture with 335M parameters and generates 1024-dimensional embeddings. It's trained using contrastive learning on large-scale paired data and supports sequence lengths up to 512 tokens.
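Because input beyond the 512-token limit is typically truncated, long documents are often split into smaller pieces before embedding. The following is a minimal sketch of that idea, not part of the model itself: `chunk_words` is a hypothetical helper, and it counts whitespace-separated words, which only approximates the tokenizer's count. The default of 300 words is an assumed conservative margin, not a documented value.

```python
from typing import List


def chunk_words(text: str, max_words: int = 300, overlap: int = 30) -> List[str]:
    """Split text into overlapping windows of whitespace-separated words.

    Word counts only approximate token counts, so max_words is kept well
    below the 512-token limit (an assumed safety margin).
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")

    words = text.split()
    if not words:
        return []

    step = max_words - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once this window has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks
```

Each chunk can then be embedded separately. For an exact limit, the word count would need to be replaced with the model's own tokenizer.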

  • Optimized for both retrieval and semantic similarity tasks
  • Supports efficient batched processing with FP16 computation
  • Provides specialized query instruction handling for improved retrieval performance
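The points above can be sketched with the Sentence-Transformers library. This is an illustrative sketch under a few assumptions: the model is loaded from the Hugging Face Hub as `BAAI/bge-large-en`, the instruction string is the one published on the BAAI model card for English retrieval queries, and the helper names `add_query_instruction` and `embed` are hypothetical. The library is imported inside `embed`, so the weights are only downloaded when that function is called.

```python
from typing import List

# Instruction string from the BAAI model card (assumed unchanged);
# it is prepended to search queries only, never to passages.
QUERY_INSTRUCTION = "Represent this sentence for searching relevant passages: "


def add_query_instruction(queries: List[str]) -> List[str]:
    """Prefix each query with the retrieval instruction."""
    return [QUERY_INSTRUCTION + q for q in queries]


def embed(texts: List[str], use_fp16: bool = False, batch_size: int = 32):
    """Embed texts in batches and return L2-normalized vectors.

    Requires the sentence-transformers package and downloads the
    model weights on first use.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("BAAI/bge-large-en")
    if use_fp16:
        model.half()  # FP16 mainly helps on a GPU
    return model.encode(
        texts,
        batch_size=batch_size,
        normalize_embeddings=True,
    )
```

In a retrieval setup, queries would go through `embed(add_query_instruction(queries))` and passages through `embed(passages)` without the prefix. With normalized embeddings, the dot product of a query vector and a passage vector equals their cosine similarity.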

Core Capabilities

  • High-performance text embedding generation for retrieval tasks
  • Excellent performance on classification, clustering, and reranking
  • Trained for English text (cross-lingual use is not its target)
  • Efficient integration with popular frameworks like Sentence-Transformers and Langchain

Frequently Asked Questions

Q: What makes this model unique?

The model was reported as state-of-the-art on the MTEB benchmark at release. It also supports a query instruction: a short prefix added to search queries, but not to passages, that improves retrieval performance. At 335M parameters it balances accuracy against inference cost.

Q: What are the recommended use cases?

The model excels in semantic search, document retrieval, similarity comparison, and text classification tasks. It's particularly well-suited for production environments requiring high-quality text embeddings.
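To make the semantic search use case concrete, here is a minimal sketch of ranking passages by cosine similarity to a query. It uses plain Python and small toy vectors in place of the model's 1024-dimensional embeddings; the function names `cosine_similarity` and `rank_passages` are hypothetical, and a production system would more likely use a vector index.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(
    query_vec: Sequence[float],
    passage_vecs: Sequence[Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (passage index, score) pairs, best match first."""
    scored = [
        (i, cosine_similarity(query_vec, p)) for i, p in enumerate(passage_vecs)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy example: three-dimensional stand-ins for real embeddings.
query = [1.0, 0.0, 0.0]
passages = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
ranking = rank_passages(query, passages)
```

In the toy example the passage identical to the query ranks first, the partially aligned one second, and the orthogonal one last.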
