nomic-embed-text-v2-moe

Maintained By
nomic-ai

Property                 Value
Total Parameters         475M (305M active during inference)
Architecture Type        Mixture of Experts (MoE)
Embedding Dimensions     768 (reducible to 256)
Maximum Sequence Length  512 tokens
Paper                    arXiv:2502.07972

What is nomic-embed-text-v2-moe?

nomic-embed-text-v2-moe is a multilingual text embedding model that uses a Mixture of Experts (MoE) architecture to deliver state-of-the-art performance across approximately 100 languages. Trained on 1.6 billion high-quality text pairs, it supports Matryoshka representation learning, so its embeddings can be shortened from 768 to 256 dimensions without significant performance loss.

Implementation Details

The model employs an 8-expert architecture with top-2 routing: each token is processed by only its two best-scoring experts, which is why only 305M of the 475M total parameters are active during inference. For good results, every input needs a task instruction prefix, 'search_query: ' for queries and 'search_document: ' for corpus text (a usage sketch follows the feature list below).
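
To make top-2 routing concrete, below is a generic MoE feed-forward layer in PyTorch. This is a sketch, not the model's released code: the expert width (d_ff) and layer layout are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoELayer(nn.Module):
    """Generic top-2 Mixture-of-Experts feed-forward layer (illustrative sketch)."""

    def __init__(self, d_model: int = 768, d_ff: int = 3072, n_experts: int = 8):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)  # scores each token per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Each token is processed by only its 2 best
        # experts, which is why only a fraction of parameters is active.
        weights, idx = self.router(x).topk(2, dim=-1)  # top-2 routing
        weights = F.softmax(weights, dim=-1)           # mixing weights for the pair
        out = torch.zeros_like(x)
        for k in range(2):                             # each of the two chosen slots
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                  # tokens whose k-th pick is e
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * expert(x[mask])
        return out

layer = Top2MoELayer()
print(layer(torch.randn(10, 768)).shape)  # torch.Size([10, 768])
```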

  • Supports flexible embedding dimensions (768 to 256) through Matryoshka representation learning
  • Implements 8 experts with top-2 routing for efficient processing
  • Achieves SOTA performance on BEIR (52.86) and MIRACL (65.80) benchmarks
  • Includes fully open-source weights, code, and training data
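
A minimal retrieval sketch with SentenceTransformers showing the prefixes in use. It assumes the Hugging Face model ID nomic-ai/nomic-embed-text-v2-moe and a recent sentence-transformers release that accepts trust_remote_code:

```python
from sentence_transformers import SentenceTransformer

# trust_remote_code is needed because the MoE architecture ships custom modeling code.
model = SentenceTransformer("nomic-ai/nomic-embed-text-v2-moe", trust_remote_code=True)

# Prepend the task instruction prefixes by hand:
# 'search_query: ' for queries, 'search_document: ' for corpus text.
query = "search_query: how do mixture of experts models work?"
docs = [
    "search_document: Mixture of Experts layers route each token to a few experts.",
    "search_document: Paris is the capital of France.",
]

q_emb = model.encode([query], normalize_embeddings=True)  # shape (1, 768)
d_embs = model.encode(docs, normalize_embeddings=True)    # shape (2, 768)

# Unit-normalized vectors, so the dot product is cosine similarity.
print((q_emb @ d_embs.T).ravel())  # the MoE document should score higher
```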

Core Capabilities

  • Multilingual support for ~100 languages
  • High-performance text embeddings with competitive results against larger models
  • Storage efficiency through flexible dimension reduction (see the truncation sketch after this list)
  • Robust performance in multilingual retrieval tasks
  • Easy integration with both Transformers and SentenceTransformers frameworks
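
As a sketch of the dimension-reduction step referenced above: Matryoshka-trained embeddings can be truncated to their leading components and re-normalized after encoding. The helper below is illustrative, not part of any library API (newer sentence-transformers releases also expose a truncate_dim load-time option).

```python
import numpy as np

def truncate_matryoshka(embeddings: np.ndarray, dim: int = 256) -> np.ndarray:
    """Keep the first `dim` components and re-normalize to unit length."""
    truncated = embeddings[:, :dim]
    return truncated / np.linalg.norm(truncated, axis=1, keepdims=True)

# Stand-in vectors at the model's full 768 dimensions.
full = np.random.randn(4, 768).astype(np.float32)
small = truncate_matryoshka(full)
print(small.shape)  # (4, 256)
```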

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its combination of a Mixture of Experts architecture with Matryoshka embeddings, which allows flexible dimension reduction while maintaining high performance across multiple languages. It achieves this while being fully open source and competitive with models twice its size.

Q: What are the recommended use cases?

The model excels at multilingual text retrieval, semantic search, and document similarity. It is particularly well suited to applications that need to reduce vector-storage costs through dimension reduction while preserving quality across languages.
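
As a sketch of the multilingual retrieval case: the same pipeline works when the query and documents are in different languages. This reuses model from the earlier example; the documents and expected ranking are illustrative.

```python
# English query scored against French and German documents.
query = "search_query: renewable energy sources"
docs = [
    "search_document: L'énergie solaire est une source d'énergie renouvelable.",
    "search_document: Die Hauptstadt von Deutschland ist Berlin.",
]

q_emb = model.encode([query], normalize_embeddings=True)
d_embs = model.encode(docs, normalize_embeddings=True)
print((q_emb @ d_embs.T).ravel())  # the solar-energy document should rank first
```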
