dewey_en_beta

Property	Value
Parameter Count	395M
Model Type	Embedding Model
Max Context Length	128k tokens
Embedding Dimension	2048
Model URL	https://huggingface.co/infgrad/dewey_en_beta

What is dewey_en_beta?

dewey_en_beta is an English embedding model developed by infgrad in collaboration with Richinfo. It is built on the answerdotai/ModernBERT-large architecture and targets long-form text in particular. The model supports both single-vector and multi-vector embeddings; the multi-vector mode follows a ColBERT-like approach but uses significantly fewer vectors.
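To make the "ColBERT-like" idea concrete, the sketch below shows late-interaction (MaxSim) scoring over a handful of vectors per document. It is only an illustration under stated assumptions: the function names (`maxsim_score`, `normalize`, `dot`) and the toy 2-dimensional vectors are hypothetical, and the source does not confirm that dewey_en_beta scores its multi-vector output exactly this way.

```python
import math

def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def maxsim_score(query_vectors, doc_vectors):
    """ColBERT-style MaxSim: for each query vector, take its best cosine
    match among the document vectors, then sum those maxima."""
    q = [normalize(v) for v in query_vectors]
    d = [normalize(v) for v in doc_vectors]
    return sum(max(dot(qv, dv) for dv in d) for qv in q)

# Toy vectors standing in for real model output (assumption: 2-D for readability).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 2.0]]   # two vectors, each aligned with one query vector
doc_b = [[1.0, 1.0]]               # one vector, partially aligned with both

score_a = maxsim_score(query, doc_a)
score_b = maxsim_score(query, doc_b)
print(score_a, score_b)
```

With fewer vectors per document, the inner `max` runs over a much shorter list than in token-level late interaction, which is where the storage and compute savings would come from.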

Implementation Details

According to its developers, the model uses a novel training approach and reports strong results on the LongEmbed and MTEB benchmarks. Its multi-vector combination method is flexible: each vector can represent a span or chunk rather than a single token, so chunking can be customized for a specific use case.

395M parameters with 2048-dimensional embeddings
128k token context window
Support for both single and multi-vector embeddings
Fast encoding speed thanks to the ModernBERT architecture
State-of-the-art performance on the LongEmbed benchmark (0.86 vs. the previous SOTA of 0.79)
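The chunk-level view of multi-vector embeddings described above can be sketched as follows. This is a minimal illustration, assuming fixed-size chunks and mean pooling; both choices, along with the helper names `mean_pool` and `chunk_vectors`, are hypothetical and are not confirmed by the source as the model's actual combination method.

```python
def mean_pool(vectors):
    """Average a non-empty list of equal-length vectors component-wise."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def chunk_vectors(token_vectors, chunk_size):
    """Group token vectors into consecutive chunks of `chunk_size`
    (the last chunk may be shorter) and return one pooled vector per chunk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        mean_pool(token_vectors[start:start + chunk_size])
        for start in range(0, len(token_vectors), chunk_size)
    ]

# Five toy token vectors (assumption: 2-D instead of the model's 2048-D).
tokens = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0], [5.0, 5.0]]
multi = chunk_vectors(tokens, chunk_size=2)
print(multi)  # three chunk-level vectors instead of five token-level ones
```

Changing `chunk_size`, or replacing the fixed-size split with sentence or section boundaries, is the kind of use-case-specific chunking the text refers to.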

Core Capabilities

Long-form text embedding with superior performance
Flexible chunk-based multi-vector representations
Competitive performance on MTEB benchmark
Instruction-tuned embedding generation
Efficient processing of documents up to 128k tokens

Frequently Asked Questions

Q: What makes this model unique?

The model combines single- and multi-vector embeddings with a 128k-token context length and state-of-the-art results on long-text tasks, a combination that sets it apart from other embedding models.

Q: What are the recommended use cases?

The model excels in long-document retrieval, semantic search, and document similarity tasks. It's particularly well-suited for applications requiring processing of long documents such as legal texts, academic papers, or technical documentation.
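For the retrieval and similarity use cases, single-vector embeddings are typically compared with cosine similarity. The sketch below ranks documents against a query this way. The vectors are made-up 3-dimensional placeholders rather than real dewey_en_beta output, and the names `cosine` and `rank` are hypothetical; in practice the vectors would come from encoding the texts with the model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Placeholder embeddings (assumption: stand-ins for one vector per long document).
docs = {
    "contract": [0.9, 0.1, 0.0],
    "paper": [0.1, 0.9, 0.1],
    "manual": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.0]

ranking = rank(query, docs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```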
