bge-base-en-v1.5-41-keys-phase-2-v1

Maintained By
RishuD7

Property            Value
Parameter Count     109M
Model Type          Sentence Transformer
Base Model          BAAI/bge-base-en-v1.5
License             Apache 2.0
Output Dimensions   768, 512, 256, 128, 64

What is bge-base-en-v1.5-41-keys-phase-2-v1?

This is a sentence embedding model fine-tuned from BAAI/bge-base-en-v1.5. It is trained with MatryoshkaLoss so that a single 768-dimensional embedding can also be truncated to 512, 256, 128, or 64 dimensions. Shorter embeddings typically trade some accuracy for lower storage and compute cost, which lets one model serve applications with different resource budgets.
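As a minimal sketch of what truncation means in practice, the snippet below keeps the first components of a vector and rescales the result to unit length. The helper name `truncate_embedding` and the toy input vector are illustrative assumptions, not part of the model or any library.

```python
import math

# Output dimensions listed for this model.
MATRYOSHKA_DIMS = (768, 512, 256, 128, 64)


def truncate_embedding(vector, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"unsupported dimension: {dim}")
    if len(vector) < dim:
        raise ValueError("vector is shorter than the requested dimension")
    prefix = vector[:dim]
    norm = math.sqrt(sum(x * x for x in prefix))
    if norm == 0.0:
        return list(prefix)  # nothing to rescale
    return [x / norm for x in prefix]


# Toy stand-in for a 768-dimensional model output (not a real embedding).
full = [math.sin(i + 1) for i in range(768)]
small = truncate_embedding(full, 64)
print(len(small))
```

Rescaling after truncation matters when the downstream index compares vectors by dot product and expects unit-length inputs.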

Implementation Details

The model uses the BERT-based encoder of BAAI/bge-base-en-v1.5. Training combined MatryoshkaLoss with MultipleNegativesRankingLoss, so the ranking objective is applied at the reduced output dimensions as well as at full size. The training set contained 4,894 samples and focused on sentence similarity tasks.
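The following is a simplified pure-Python sketch of how the two losses can combine, assuming the usual formulation: an in-batch-negatives ranking loss over scaled cosine similarities, summed over truncated prefixes of the embeddings. The function names, the scale of 20, the equal weights, and the toy 4-dimensional vectors are assumptions for illustration. This is not the training code used for this model.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranking_loss(anchors, positives, scale=20.0):
    """Cross-entropy where positive i is the target for anchor i and
    every other positive in the batch acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        scores = [scale * cosine(anchor, p) for p in positives]
        top = max(scores)
        log_sum_exp = top + math.log(sum(math.exp(s - top) for s in scores))
        total += log_sum_exp - scores[i]
    return total / len(anchors)


def matryoshka_loss(anchors, positives, dims, weights=None):
    """Sum the ranking loss over each truncated prefix length in `dims`."""
    weights = weights or [1.0] * len(dims)
    return sum(
        w * ranking_loss([a[:d] for a in anchors], [p[:d] for p in positives])
        for d, w in zip(dims, weights)
    )


# Toy batch of two (anchor, positive) pairs with 4-dimensional vectors.
anchors = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3]]
positives = [[0.9, 0.1, 0.1, 0.0], [0.1, 0.8, 0.0, 0.2]]

matched = matryoshka_loss(anchors, positives, dims=(4, 2))
mismatched = matryoshka_loss(anchors, positives[::-1], dims=(4, 2))
print(matched < mismatched)
```

Because every prefix length contributes to the total, the optimizer is pushed to place the most useful information in the leading components.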

  • Maximum sequence length: 512 tokens
  • Primary embedding dimension: 768
  • Supported reduced dimensions: 512, 256, 128, 64
  • Similarity function used in training: cosine similarity
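The settings listed above might be applied with the sentence-transformers library roughly as follows. The Hub path in `MODEL_ID` is an assumption inferred from the maintainer and model names on this page, and the `load_model` and `embed` helpers are hypothetical. The `truncate_dim` argument requires a sentence-transformers release that supports Matryoshka truncation.

```python
# Assumed Hugging Face Hub path (maintainer name + model name); verify before use.
MODEL_ID = "RishuD7/bge-base-en-v1.5-41-keys-phase-2-v1"
SUPPORTED_DIMS = (768, 512, 256, 128, 64)
MAX_SEQ_LENGTH = 512


def load_model(dim=768):
    """Load the model so that encode() returns `dim`-dimensional vectors."""
    if dim not in SUPPORTED_DIMS:
        raise ValueError(f"dim must be one of {SUPPORTED_DIMS}, got {dim}")
    # Imported lazily so the module can be inspected without the dependency.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(MODEL_ID, truncate_dim=dim)
    model.max_seq_length = MAX_SEQ_LENGTH
    return model


def embed(model, sentences):
    """Encode sentences, asking the library for unit-length vectors."""
    return model.encode(sentences, normalize_embeddings=True)
```

A typical call would be `embed(load_model(256), ["first sentence", "second sentence"])`, which downloads the weights on first use.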

Core Capabilities

  • Sentence similarity computation
  • Semantic textual search
  • Feature extraction at multiple dimensions
  • Text matching (English-focused; the base model is English-only)
  • Document embedding generation
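Sentence similarity and semantic search both reduce to comparing embedding vectors. Here is a minimal sketch of ranking documents against a query by cosine similarity. The 3-dimensional vectors are toy values standing in for real embeddings, and the `rank` helper is hypothetical.

```python
import math


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query_vec, doc_vecs, top_k=3):
    """Return (document index, score) pairs, best match first."""
    scored = [
        (cosine_similarity(query_vec, vec), idx)
        for idx, vec in enumerate(doc_vecs)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(idx, score) for score, idx in scored[:top_k]]


# Toy vectors; real inputs would be embeddings produced by the model.
query = [1.0, 0.0, 0.0]
documents = [
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.5, 0.5, 0.0],
]
results = rank(query, documents)
print([idx for idx, _ in results])
```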

Frequently Asked Questions

Q: What makes this model unique?

What sets the model apart is its MatryoshkaLoss training: a single 768-dimensional embedding can be truncated to 512, 256, 128, or 64 dimensions, so the same model can serve different storage and latency budgets while retaining much of its semantic accuracy.
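To make the cost difference concrete, here is a back-of-the-envelope calculation of raw vector storage at each dimension. It assumes 32-bit floats and ignores index overhead. Both are assumptions, since the page does not state a storage format.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors stored as 32-bit floats
DIMS = (768, 512, 256, 128, 64)


def index_size_bytes(num_vectors, dim):
    """Raw storage for `num_vectors` embeddings of size `dim`."""
    return num_vectors * dim * BYTES_PER_FLOAT32


for dim in DIMS:
    mib = index_size_bytes(1_000_000, dim) / 2**20
    print(f"{dim:>4} dims: {mib:,.1f} MiB per million vectors")
```

Under these assumptions a 768-dimensional vector takes 3,072 bytes and a 64-dimensional one takes 256 bytes, a 12x reduction.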

Q: What are the recommended use cases?

The model is suited to semantic search, document similarity analysis, and content recommendation. It is most useful when different parts of an application need embeddings of different sizes, for example a compact index for fast lookup and full-size vectors for final scoring.
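One way to use several dimensions in one pipeline is a two-stage search: build a shortlist with a short prefix of each vector, then rerank the shortlist with the full vectors. The sketch below uses toy 4-dimensional vectors with a 2-dimensional coarse stage. The function names and parameters are hypothetical, and a production system would use an approximate-nearest-neighbor index rather than a full sort.

```python
import math


def _unit(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0.0 else [x / norm for x in vec]


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def two_stage_search(query, docs, coarse_dim, shortlist_size, top_k):
    """Shortlist with truncated vectors, then rerank with full vectors."""
    q_small = _unit(query[:coarse_dim])
    shortlist = sorted(
        range(len(docs)),
        key=lambda i: _dot(q_small, _unit(docs[i][:coarse_dim])),
        reverse=True,
    )[:shortlist_size]

    q_full = _unit(query)
    reranked = sorted(
        shortlist,
        key=lambda i: _dot(q_full, _unit(docs[i])),
        reverse=True,
    )
    return reranked[:top_k]


# Toy vectors; document 0 looks ideal in the first two dimensions only.
query = [1.0, 0.0, 1.0, 0.0]
docs = [
    [1.0, 0.0, -1.0, 0.0],
    [0.9, 0.1, 0.9, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.8, 0.3, 0.5, 0.1],
]
best = two_stage_search(query, docs, coarse_dim=2, shortlist_size=3, top_k=2)
print(best)
```

In this toy data the coarse stage ranks document 0 first, and the full-vector rerank demotes it, which is the correction the second stage exists to make.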
