# UAE-Large-V1
| Property | Value |
|---|---|
| Parameter Count | 335M |
| License | MIT |
| Paper | AnglE-optimized Text Embeddings |
| Framework | PyTorch (Transformers) |
## What is UAE-Large-V1?
UAE-Large-V1 is a state-of-the-art sentence embedding model that achieves exceptional performance on the MTEB benchmark with an average score of 64.64. Developed by WhereIsAI, this model specializes in generating high-quality text embeddings for various NLP tasks including retrieval, classification, and clustering.
## Implementation Details

The model implements the AnglE architecture and can be used through either the angle_emb package or the sentence-transformers framework. It supports multiple pooling strategies, with CLS pooling recommended for optimal performance.
- Optimized for both retrieval and non-retrieval tasks
- Supports batch processing and GPU acceleration
- Implements efficient prompt-based retrieval using the Prompts.C format
- Compatible with multiple deployment options including Docker through Infinity
## Core Capabilities
- Semantic Textual Similarity (STS) with high correlation scores (>85% on multiple benchmarks)
- Classification tasks with impressive accuracy (>90% on various datasets)
- Information Retrieval with strong MAP and MRR metrics
- Clustering capabilities with robust v-measure scores
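STS and similarity tasks with these embeddings typically reduce to cosine similarity between vectors; a minimal, model-independent helper (the toy vectors below stand in for real 1024-dimensional sentence embeddings):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 3-d vectors standing in for real sentence embeddings.
u = np.array([1.0, 0.0, 1.0])
v = np.array([1.0, 0.0, 0.0])
print(round(cosine_similarity(u, v), 4))  # → 0.7071
```

Scores near 1.0 indicate semantically similar sentences; scores near 0.0 indicate unrelated ones.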
## Frequently Asked Questions

**Q: What makes this model unique?**
The model's distinctive feature is its universal applicability across different NLP tasks while maintaining state-of-the-art performance, particularly in sentence embedding generation. Its AnglE optimization technique provides superior results compared to traditional cosine-based training approaches.
**Q: What are the recommended use cases?**
The model excels in semantic search, document similarity comparison, text classification, and clustering applications. It's particularly effective when used with the provided prompt templates for retrieval tasks.
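A semantic search pipeline with this model amounts to ranking document embeddings by cosine similarity to a query embedding; a self-contained sketch with toy vectors standing in for real model outputs (the `search` helper and its dimensions are illustrative, not part of the angle_emb API):

```python
import numpy as np

def search(query_vec: np.ndarray, doc_vecs: np.ndarray, top_k: int = 2):
    """Rank documents by cosine similarity to the query embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q
    order = np.argsort(-scores)[:top_k]
    return [(int(i), float(scores[i])) for i in order]

# Toy 3-d embeddings standing in for 1024-d UAE-Large-V1 vectors.
docs = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
])
query = np.array([1.0, 0.0, 0.0])
results = search(query, docs)
print(results)  # best match first: document 0, then document 2
```

In a real deployment, `query_vec` would come from encoding the query with the Prompts.C template and `doc_vecs` from encoding documents without a prompt, as described above.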