snowflake-arctic-embed-s

Maintained By: Snowflake

Snowflake Arctic Embed S

  • Parameter Count: 33.2M
  • Embedding Dimension: 384
  • License: Apache 2.0
  • Paper: Technical Report

What is snowflake-arctic-embed-s?

Snowflake-arctic-embed-s is a compact yet powerful text embedding model designed for enterprise-grade retrieval tasks. Based on the intfloat/e5-small-unsupervised architecture, it achieves state-of-the-art performance in its size category with an MTEB Retrieval Score (NDCG@10) of 51.98, surpassing competitors like bge-small-en-v1.5 and text-embedding-3-small.

Implementation Details

The model employs a multi-stage training pipeline, leveraging roughly 400M samples of mixed public datasets and proprietary web search data. It is further refined on about 1M carefully curated query–positive–negative triplets, using hard negative mining techniques; a conceptual sketch of this triplet objective follows the list below.

  • 384-dimensional embeddings for efficient storage and computation
  • 33.2M parameters balancing performance and resource usage
  • Optimized for both accuracy and inference speed
  • Supports up to 512 tokens context length
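
The second training stage described above optimizes a contrastive objective over query–positive–negative triplets. The snippet below is an illustrative sketch of an InfoNCE-style loss with in-batch negatives plus one mined hard negative per query; it is not the model's actual training code, and the temperature value is a placeholder.

```python
import torch
import torch.nn.functional as F

def infonce_triplet_loss(q, pos, neg, temperature=0.02):
    """Illustrative InfoNCE-style loss over (query, positive, hard-negative) triplets.

    q, pos, neg: [batch, dim] L2-normalized embeddings. Each query is scored
    against its own positive, every other in-batch positive, and its mined
    hard negative; the loss pushes the true positive to the top.
    """
    # Similarities between each query and all in-batch positives: [batch, batch]
    sim_pos = q @ pos.T / temperature
    # Similarity between each query and its own hard negative: [batch, 1]
    sim_neg = (q * neg).sum(dim=-1, keepdim=True) / temperature
    logits = torch.cat([sim_pos, sim_neg], dim=1)       # [batch, batch + 1]
    labels = torch.arange(q.size(0), device=q.device)   # diagonal = true positive
    return F.cross_entropy(logits, labels)

# Toy check with random, normalized 384-dimensional embeddings
batch, dim = 8, 384
q, pos, neg = (F.normalize(torch.randn(batch, dim), dim=-1) for _ in range(3))
print(infonce_triplet_loss(q, pos, neg).item())
```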

Core Capabilities

  • State-of-the-art retrieval performance in small model category
  • Efficient encoding of both queries and documents
  • Seamless integration with popular frameworks (Sentence Transformers, Hugging Face)
  • Specialized query prefixing for enhanced retrieval quality (see the usage sketch below)
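
A minimal retrieval sketch with Sentence Transformers is shown below. It assumes the model is published on the Hugging Face Hub as Snowflake/snowflake-arctic-embed-s and that queries take the retrieval prefix shown; the documents and scores are illustrative, so check the official model card for the exact prompt.

```python
from sentence_transformers import SentenceTransformer

# Load the 33.2M-parameter model from the Hugging Face Hub
model = SentenceTransformer("Snowflake/snowflake-arctic-embed-s")

# Queries are prefixed for retrieval; documents are encoded as-is.
query_prefix = "Represent this sentence for searching relevant passages: "
queries = ["what is snowflake arctic embed"]
documents = [
    "Snowflake Arctic Embed is a family of text embedding models for retrieval.",
    "The Snowflake Data Cloud supports SQL-based analytics.",
]

query_emb = model.encode([query_prefix + q for q in queries], normalize_embeddings=True)
doc_emb = model.encode(documents, normalize_embeddings=True)

# Cosine similarity reduces to a dot product on normalized 384-d vectors
scores = query_emb @ doc_emb.T
print(scores)  # higher score = better match
```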

Frequently Asked Questions

Q: What makes this model unique?

The model achieves exceptional retrieval performance despite its compact size, making it ideal for production deployments where both accuracy and efficiency are crucial.

Q: What are the recommended use cases?

The model excels at semantic search, document retrieval, and similarity matching, particularly in enterprise settings that need to scale to large datasets while maintaining high accuracy.
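
For a semantic-search use case without Sentence Transformers, the sketch below ranks a tiny corpus with the plain Hugging Face Transformers API. The Hub identifier, CLS-token pooling, and query prefix are assumptions based on how the arctic-embed family is typically used; verify them against the official usage example.

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "Snowflake/snowflake-arctic-embed-s"   # assumed Hub identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id).eval()

def embed(texts):
    # Truncate to the model's 512-token context and take the [CLS] vector
    # (pooling choice assumed here; verify against the official usage example).
    batch = tokenizer(texts, padding=True, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)
    cls = out.last_hidden_state[:, 0]
    return torch.nn.functional.normalize(cls, dim=-1)

query_prefix = "Represent this sentence for searching relevant passages: "
corpus = [
    "How to rotate credentials for a data warehouse.",
    "Quarterly sales figures by region.",
    "Steps to reset a user password in the admin console.",
]
query_emb = embed([query_prefix + "reset my password"])
doc_emb = embed(corpus)

# Rank the corpus by cosine similarity and print the best match
scores = (query_emb @ doc_emb.T).squeeze(0)
best = scores.argmax().item()
print(corpus[best], scores[best].item())
```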
