stella_en_400M_v5

Maintained By
dunzhang

| Property | Value |
|---|---|
| Parameter Count | 435M |
| License | MIT |
| Paper | MRL Paper |
| Base Model | Alibaba-NLP/gte-large-en-v1.5 |

What is stella_en_400M_v5?

stella_en_400M_v5 is an advanced sentence embedding model trained with Matryoshka Representation Learning (MRL), offering flexible embedding dimensions from 512 to 8192. Built on Alibaba-NLP's gte-large-en-v1.5 architecture, it simplifies prompt engineering by providing two main prompts: one for sentence-to-passage (s2p) tasks such as retrieval, and one for sentence-to-sentence (s2s) tasks such as semantic similarity.

Implementation Details

The model supports multiple embedding dimensions through separate linear projection layers applied to a shared backbone. It achieves state-of-the-art performance on the MTEB benchmark, with the 1024-dimension version performing nearly as well as the 8192-dimension version.

  • Supports multiple embedding dimensions: 512, 768, 1024, 2048, 4096, 6144, 8192
  • Maximum sequence length of 512 tokens
  • Implements memory-efficient attention mechanisms
  • Compatible with both SentenceTransformers and Transformers libraries
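The dimension flexibility above works Matryoshka-style: a smaller embedding is effectively a prefix of a larger one, re-normalized to unit length. The sketch below illustrates that mechanism with random stand-in vectors (no model download; the vectors are illustrative, not real stella outputs):

```python
import numpy as np

def truncate_embedding(emb: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and re-normalize to unit length,
    mimicking how a smaller MRL dimension is used."""
    head = emb[:dim]
    return head / np.linalg.norm(head)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
# Stand-in "full" embeddings; real vectors would come from the model.
full_a = rng.normal(size=8192)
full_b = full_a + 0.1 * rng.normal(size=8192)  # a near-duplicate text

for dim in (512, 1024, 8192):
    a = truncate_embedding(full_a, dim)
    b = truncate_embedding(full_b, dim)
    print(dim, round(cosine(a, b), 3))
```

The point of the demo is that similarity scores stay stable as the dimension shrinks, which is why the smaller dimensions remain usable for retrieval.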

Core Capabilities

  • Semantic text similarity assessment
  • Passage retrieval and ranking
  • Document clustering
  • Pair classification tasks
  • Cross-encoder capabilities for various NLP tasks

Frequently Asked Questions

Q: What makes this model unique?

The model's key strength lies in its use of Matryoshka Representation Learning, allowing flexible dimension choices while maintaining high performance. The simplified prompting system, with just two main prompts, makes it particularly user-friendly.

Q: What are the recommended use cases?

The model excels in retrieval tasks, semantic similarity matching, and document classification. The 1024-dimension version is recommended for most applications, offering an optimal balance between performance and computational efficiency.
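To make the recommended retrieval (s2p) flow concrete, here is a minimal sketch of query-to-passage ranking at the recommended 1024 dimensions. The `encode()` function is a hypothetical stand-in (a bag-of-words hashing trick) so the example runs without downloading the model; a real pipeline would replace it with calls to stella via SentenceTransformers:

```python
import numpy as np

DIM = 1024  # recommended dimension per the card

def encode(text: str) -> np.ndarray:
    """Stand-in for the model's encoder: hashes words into a unit-length
    bag-of-words vector so lexically similar texts get similar embeddings.
    Replace with real stella embeddings in practice."""
    vec = np.zeros(DIM)
    for word in text.lower().split():
        vec[hash(word) % DIM] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def retrieve(query: str, passages: list[str]) -> list[tuple[float, str]]:
    """Rank passages by cosine similarity to the query (the s2p flow)."""
    q = encode(query)
    scored = [(float(np.dot(q, encode(p))), p) for p in passages]
    return sorted(scored, reverse=True)

passages = [
    "The cat sat on the mat.",
    "Embedding models map text to dense vectors.",
    "Paris is the capital of France.",
]
results = retrieve("what do embedding models do", passages)
print(results[0][1])  # highest-scoring passage
```

Because all vectors are unit-normalized, the dot product equals cosine similarity, which is the standard scoring function for bi-encoder retrieval like this.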
