ember-v1

Maintained by: llmrails

Property               Value
Parameter Count        335M
Architecture           BERT-based with RetroMAE and SetFit enhancements
Dimensions             1024
Max Sequence Length    512 tokens
License                MIT

What is ember-v1?

ember-v1 is a text embedding model that reports state-of-the-art performance on the Massive Text Embedding Benchmark (MTEB), with an average score of 63.54 across 56 tasks. It combines techniques from RetroMAE and SetFit research to produce embeddings for applications such as similarity search, clustering, and classification.

Implementation Details

The model generates 1024-dimensional embeddings and can handle sequences up to 512 tokens in length. It has been trained on a diverse corpus spanning finance, science, medicine, law, and other domains, making it versatile for different applications.
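Inputs longer than the 512-token limit have to be truncated or split before embedding. The sketch below shows one possible way to split a list of token IDs into windows that fit the limit. The helper name `chunk_tokens`, the `overlap` parameter, and the budget of 510 tokens (leaving room for two special tokens, as BERT-style tokenizers typically add) are assumptions made for illustration and are not part of the model's documentation.

```python
from typing import List


def chunk_tokens(token_ids: List[int], max_len: int = 512, overlap: int = 0) -> List[List[int]]:
    """Split token IDs into windows of at most max_len tokens.

    Consecutive windows share `overlap` tokens. This is a hypothetical helper,
    not an API of the model or of any library.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")

    step = max_len - overlap
    chunks: List[List[int]] = []
    start = 0
    while start < len(token_ids):
        chunks.append(token_ids[start:start + max_len])
        # Stop once the current window reaches the end of the input.
        if start + max_len >= len(token_ids):
            break
        start += step
    return chunks


# Example: 1200 placeholder token IDs, assumed budget of 510 tokens per window.
example_ids = list(range(1200))
windows = chunk_tokens(example_ids, max_len=510, overlap=10)
window_lengths = [len(w) for w in windows]
```

Each window can then be embedded separately. How the per-window embeddings are combined (for example, by averaging) is an application choice the source does not specify.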

  • Achieves superior performance compared to competitors like bge-large-en-v1.5 and OpenAI's text-embedding-ada-002
  • Uses an average (mean) pooling strategy to generate embeddings
  • Works with both the transformers and sentence-transformers libraries
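To illustrate the average pooling mentioned in the list above, here is a minimal pure-Python sketch of masked mean pooling: the token vectors are averaged while padding positions are skipped. The function name `mean_pool` and the toy 2-dimensional vectors are assumptions for illustration. In practice the token vectors would be 1024-dimensional outputs of the model, and the pooling would normally be done on tensors.

```python
from typing import List, Sequence


def mean_pool(token_embeddings: Sequence[Sequence[float]], attention_mask: Sequence[int]) -> List[float]:
    """Average the token vectors whose mask value is 1, ignoring padding."""
    if not token_embeddings:
        raise ValueError("token_embeddings must not be empty")
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("token_embeddings and attention_mask must have equal length")

    dim = len(token_embeddings[0])
    sums = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                sums[i] += value
    if count == 0:
        raise ValueError("attention_mask masks out every token")
    return [s / count for s in sums]


# Toy example: two real tokens and one padding token (mask value 0).
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)  # the padding vector does not affect the result
```

Skipping masked positions matters because padded batches would otherwise pull the sentence embedding toward the padding vectors.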

Core Capabilities

  • Text Classification (91.98% accuracy on Amazon Polarity)
  • Semantic Similarity (87.77% Spearman correlation on STSBenchmark)
  • Information Retrieval (85.51% MAP@10 on Quora Retrieval)
  • Clustering (65.54% V-measure on StackExchange)

Frequently Asked Questions

Q: What makes this model unique?

The model combines training techniques from RetroMAE and SetFit to achieve state-of-the-art performance while keeping a relatively compact 335M-parameter architecture. It scores higher than OpenAI's text-embedding-ada-002 on the MTEB benchmark.

Q: What are the recommended use cases?

ember-v1 is suited to semantic search, document clustering, text classification, and similarity assessment. It is particularly effective for English-language tasks in professional domains such as finance, science, and law.
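For the semantic search use case, embeddings are commonly compared with cosine similarity. The sketch below ranks documents against a query using small hand-written vectors in place of real 1024-dimensional embeddings. The helper names `cosine_similarity` and `rank_documents` are hypothetical, and cosine similarity is an assumed (though common) choice of metric rather than one the source prescribes.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for query and document embeddings.
query = [1.0, 0.0, 0.0]
documents = [
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [0.9, 0.1, 0.0],   # points in nearly the same direction as the query
    [-1.0, 0.0, 0.0],  # points in the opposite direction
]
ranking = rank_documents(query, documents)
ranked_indices = [index for index, _ in ranking]
```

For large collections, a vector index would usually replace this exhaustive comparison, which scores every document against the query.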
