# Snowflake Arctic-embed-m-v1.5
| Property | Value |
|---|---|
| Parameter Count | 109M |
| License | Apache 2.0 |
| Paper | arXiv:2407.18887 |
| MTEB Retrieval Score | 55.14 (NDCG@10) |
## What is snowflake-arctic-embed-m-v1.5?

Snowflake's arctic-embed-m-v1.5 is a text embedding model designed to produce highly compressible embedding vectors without sacrificing retrieval quality. It improves markedly on its predecessor, retaining strong retrieval performance even when each vector is compressed to as little as 128 bytes through the dimensionality-reduction and quantization techniques described below.
## Implementation Details
The model combines Matryoshka Representation Learning (MRL) with uniform scalar quantization to achieve its compression. It outputs 768-dimensional vectors that can be truncated to their first 256 dimensions (and renormalized) while preserving most of their retrieval quality; a code sketch of this pipeline follows the list below.
- Achieves roughly 98% of the uncompressed model's retrieval performance even at 128 bytes per vector (a 24x reduction versus 768-dimensional float32 vectors)
- Supports both 4-bit and 8-bit uniform scalar quantization
- Optimized ranges: -0.18 to +0.18 for 4-bit, -0.3 to +0.3 for 8-bit quantization
- Compatible with popular frameworks including Sentence Transformers and Hugging Face Transformers
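As a concrete illustration, here is a minimal sketch of the compression pipeline described above, using Sentence Transformers and NumPy. The truncation dimension (256) and the 4-bit quantization range (-0.18 to +0.18) come from the points above; the Hugging Face model id and the nibble-packing scheme are assumptions for illustration, not necessarily the model card's exact recipe.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Model id assumed from the Hugging Face Hub naming convention.
model = SentenceTransformer("Snowflake/snowflake-arctic-embed-m-v1.5")

# 1. Embed: the model outputs 768-dimensional vectors.
texts = ["Arctic-embed targets storage-efficient retrieval."]  # sample text, illustrative
emb = model.encode(texts, normalize_embeddings=True)  # shape (1, 768)

# 2. MRL truncation: keep the first 256 dimensions, then renormalize.
emb256 = emb[:, :256]
emb256 = emb256 / np.linalg.norm(emb256, axis=1, keepdims=True)

# 3. 4-bit uniform scalar quantization over the recommended range.
LO, HI = -0.18, 0.18
codes = np.round((np.clip(emb256, LO, HI) - LO) / (HI - LO) * 15).astype(np.uint8)

# 4. Pack two 4-bit codes per byte: 256 dims x 4 bits = 128 bytes per vector.
packed = (codes[:, 0::2] << 4) | codes[:, 1::2]
assert packed.shape[1] == 128  # the "maximum compression" setting
```

At search time the packed codes can be unpacked and mapped back to approximate float values before computing dot products; treat the packing above as one reasonable implementation rather than the authoritative one.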
## Core Capabilities
- State-of-the-art retrieval quality: 55.14 NDCG@10 on the MTEB Retrieval benchmark
- Retains roughly 99% of retrieval quality when truncated to 256 dimensions
- Efficient storage: up to 7.8M vectors per GB at maximum compression (128 bytes per vector)
- Cross-framework compatibility and easy integration
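To see where the 7.8M figure comes from: at maximum compression each vector occupies 256 dimensions x 4 bits = 128 bytes, and 10^9 bytes / 128 bytes per vector ≈ 7.8 million vectors per (decimal) gigabyte.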
## Frequently Asked Questions
### Q: What makes this model unique?
Its ability to maintain high retrieval quality under extreme compression sets it apart: when truncated to 256 dimensions, it achieves better retrieval quality than much larger models such as Google's Gecko (roughly 1.2B parameters).
### Q: What are the recommended use cases?
The model is ideal for large-scale retrieval systems where storage efficiency is crucial. It's particularly well-suited for applications requiring high-quality semantic search with minimal storage overhead.
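For instance, a compact semantic-search setup might look like the sketch below. The query prefix shown follows the convention documented for the arctic-embed family (documents are embedded without a prefix); treat it, along with the model id and sample texts, as assumptions.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Snowflake/snowflake-arctic-embed-m-v1.5")

# Arctic-embed models apply a prefix to queries only (assumed from the model card).
QUERY_PREFIX = "Represent this sentence for searching relevant passages: "

docs = [
    "Snowflake is a cloud data platform.",
    "Matryoshka embeddings can be truncated with little quality loss.",
]
query = "What is MRL truncation?"

doc_emb = model.encode(docs, normalize_embeddings=True)
query_emb = model.encode([QUERY_PREFIX + query], normalize_embeddings=True)

# Truncate both sides to 256 dimensions and renormalize before scoring.
def truncate(x, dims=256):
    x = x[:, :dims]
    return x / np.linalg.norm(x, axis=1, keepdims=True)

scores = truncate(query_emb) @ truncate(doc_emb).T  # cosine similarities
best = int(np.argmax(scores))
print(docs[best], float(scores[0, best]))
```

Because the truncated vectors are renormalized, plain dot products remain valid cosine similarities, so the same scoring code works at 768 or 256 dimensions.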