SFR-Embedding-2_R

Maintained by: Salesforce

Property         Value
Parameter Count  7.11B
License          CC-BY-NC-4.0
Tensor Type      BF16
Language         English

What is SFR-Embedding-2_R?

SFR-Embedding-2_R is a text embedding model developed by Salesforce Research and intended for research applications. It builds on the earlier SFR-Embedding work and uses a multi-stage training approach to improve performance across a range of natural language processing tasks.

Implementation Details

The model generates text embeddings for inputs of up to 4096 tokens and uses BF16 precision for efficient computation. It can be loaded with either the Transformers library or the Sentence Transformers framework. Its main implementation features are:

  • Instruction-based embedding generation with task-specific prompts
  • Last-token pooling strategy for embedding extraction
  • Normalized embeddings with cosine similarity scoring
  • Support for both query and passage embedding generation
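The pooling and scoring steps listed above can be sketched in a few lines. This is a minimal, dependency-free illustration on toy vectors, not the model's actual code: the `Instruct: ... Query: ...` template and the helper names (`get_detailed_instruct`, `last_token_pool`, `l2_normalize`) are assumptions made for illustration, so check the official model card for the exact prompt format. In practice the hidden states would come from the model's final layer rather than hand-written lists.

```python
import math


def get_detailed_instruct(task_description: str, query: str) -> str:
    # Assumed prompt template: prepend a task instruction to the query text.
    return f"Instruct: {task_description}\nQuery: {query}"


def last_token_pool(hidden_states, attention_mask):
    """Return, for each sequence, the hidden state of its last real token.

    hidden_states:  [batch][seq_len][dim] nested lists of floats
    attention_mask: [batch][seq_len], 1 for real tokens and 0 for padding
    Works for left- or right-padded batches; assumes every sequence has
    at least one real token.
    """
    pooled = []
    for states, mask in zip(hidden_states, attention_mask):
        last_index = max(i for i, m in enumerate(mask) if m == 1)
        pooled.append(states[last_index])
    return pooled


def l2_normalize(vector):
    # Scale the vector to unit length so a dot product equals cosine similarity.
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


# Toy batch: two sequences, three token positions, two-dimensional states.
hidden_states = [
    [[1.0, 0.0], [3.0, 4.0], [9.0, 9.0]],  # right-padded: last real token at index 1
    [[9.0, 9.0], [0.0, 2.0], [4.0, 3.0]],  # left-padded: last real token at index 2
]
attention_mask = [
    [1, 1, 0],
    [0, 1, 1],
]

pooled = last_token_pool(hidden_states, attention_mask)
query_vec, passage_vec = (l2_normalize(v) for v in pooled)

# Cosine similarity of unit vectors is just their dot product.
score = sum(q * p for q, p in zip(query_vec, passage_vec))
print(round(score, 2))  # 0.96
```

Queries would typically be wrapped with `get_detailed_instruct` before encoding, while passages are encoded as plain text; that asymmetry is also an assumption here rather than something the text above states.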

Core Capabilities

  • Strong performance on MTEB benchmark tasks
  • Excellent results in retrieval tasks (demonstrated by high MAP and MRR scores)
  • Robust classification capabilities (90%+ accuracy on various tasks)
  • Advanced semantic textual similarity assessment
  • Effective clustering and pair classification
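For readers unfamiliar with the retrieval metrics named above, the following sketch shows how MRR (mean reciprocal rank) and MAP (mean average precision) are conventionally computed from ranked results. The document IDs and rankings are made up for illustration, the function names are hypothetical, and this is not the benchmark's own evaluation code.

```python
def reciprocal_rank(ranked_ids, relevant_ids):
    # 1 / rank of the first relevant document, or 0 if none is retrieved.
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def average_precision(ranked_ids, relevant_ids):
    # Mean of precision@k taken at each rank k where a relevant document appears.
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Each entry: (ranking returned for a query, set of truly relevant documents).
runs = [
    (["d3", "d1", "d2"], {"d1"}),        # first relevant hit at rank 2
    (["d5", "d4", "d6"], {"d5", "d6"}),  # relevant hits at ranks 1 and 3
]

mrr = mean(reciprocal_rank(ranking, relevant) for ranking, relevant in runs)
map_score = mean(average_precision(ranking, relevant) for ranking, relevant in runs)
print(f"MRR={mrr:.3f} MAP={map_score:.3f}")  # MRR=0.750 MAP=0.667
```

Both metrics reward placing relevant documents near the top of the ranking; MRR looks only at the first relevant hit, while MAP accounts for every relevant document.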

Frequently Asked Questions

Q: What makes this model unique?

The model is distinguished by its multi-stage training approach and its support for instruction-based embedding generation, which makes it particularly effective for research applications. Its large parameter count (7.11B) contributes to strong performance across a wide range of NLP tasks.
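As a rough sense of what that parameter count means in practice, the memory needed just to hold the weights can be estimated from the parameter count and the BF16 tensor type (2 bytes per value). This is a back-of-the-envelope sketch: it ignores activations, framework overhead, and batch size, and the function name is hypothetical.

```python
PARAMETER_COUNT = 7.11e9  # parameter count stated for the model
BYTES_PER_BF16 = 2        # bfloat16 is a 16-bit (2-byte) format


def weight_memory_bytes(parameter_count: float, bytes_per_parameter: int) -> float:
    # Weights only: one stored value per parameter.
    return parameter_count * bytes_per_parameter


weight_bytes = weight_memory_bytes(PARAMETER_COUNT, BYTES_PER_BF16)
weight_gb = weight_bytes / 1e9       # decimal gigabytes
weight_gib = weight_bytes / 2**30    # binary gibibytes
print(f"{weight_gb:.2f} GB ({weight_gib:.2f} GiB) for weights alone")
# 14.22 GB (13.24 GiB) for weights alone
```

Actual memory use during inference will be higher than this figure, especially with long sequences near the 4096-token limit.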

Q: What are the recommended use cases?

The model excels in research applications including text retrieval, semantic similarity analysis, document classification, and clustering tasks. It's particularly well-suited for applications requiring high-quality text embeddings with instruction-based customization.
