efficient-splade-VI-BT-large-doc

Maintained By
naver

Property | Value
License | cc-by-nc-sa-4.0
Paper | View Paper
MRR@10 (MS MARCO dev) | 38.0
Inference Latency | 0.7 ms

What is efficient-splade-VI-BT-large-doc?

This is a specialized document encoder that forms one half of a two-model architecture for efficient passage retrieval. It is the document-side component of a SPLADE (Sparse Lexical and Expansion) model, with a separate encoder handling queries. The model balances retrieval quality against computational cost: it reports an MRR@10 of 38.0 on the MS MARCO dev set with an inference latency of 0.7 ms.
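
For readers unfamiliar with the headline metric, the following is a minimal sketch of how MRR@10 is typically computed. The function name, the toy rankings, and the relevance judgments are all hypothetical and are not taken from the model's evaluation code. The reported 38.0 presumably corresponds to 0.380 on the 0-1 scale used here.

```python
def mrr_at_k(ranked_lists, relevant, k=10):
    """Mean reciprocal rank of the first relevant passage within the top k.

    ranked_lists: dict mapping query id -> list of passage ids, best first.
    relevant:     dict mapping query id -> set of relevant passage ids.
    """
    total = 0.0
    for qid, ranking in ranked_lists.items():
        for rank, pid in enumerate(ranking[:k], start=1):
            if pid in relevant[qid]:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(ranked_lists)


# Hypothetical toy data: three queries with their ranked results.
runs = {"q1": ["p3", "p1", "p7"], "q2": ["p9", "p2"], "q3": ["p5"]}
qrels = {"q1": {"p1"}, "q2": {"p9"}, "q3": {"p4"}}

score = mrr_at_k(runs, qrels)
print(f"MRR@10 = {score:.3f}")
```

A query with no relevant passage in its top 10 contributes zero, which is why the metric rewards placing a relevant passage as early as possible.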

Implementation Details

The model uses a modified DistilBERT architecture optimized for document encoding. It combines several efficiency-focused techniques, including L1 regularization, FLOPS-regularized middle-training, and separate document and query encoders, to reach near state-of-the-art retrieval quality at competitive latency.
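
The source names the two regularizers but does not give their formulas. The sketch below uses the commonly cited definitions, which are an assumption here: the L1 penalty is the mean L1 norm of the sparse vectors in a batch, and the FLOPS penalty is the sum over vocabulary terms of the squared mean activation. The function names and the toy batch are hypothetical.

```python
def l1_penalty(batch):
    """Mean L1 norm over the batch (assumed definition)."""
    return sum(sum(abs(w) for w in vec) for vec in batch) / len(batch)


def flops_penalty(batch):
    """Sum over vocabulary terms of the squared mean activation (assumed definition).

    Terms that are active in many vectors are penalized most, which
    discourages long posting lists in the resulting inverted index.
    """
    n = len(batch)
    vocab_size = len(batch[0])
    return sum((sum(vec[j] for vec in batch) / n) ** 2 for j in range(vocab_size))


# Hypothetical batch of two vectors over a three-term vocabulary.
batch = [[1.0, 0.0, 2.0], [3.0, 0.0, 0.0]]

l1 = l1_penalty(batch)
flops = flops_penalty(batch)
print(l1, flops)
```

In training, penalties like these would be added to the ranking loss with a weighting coefficient; the coefficient values used for this model are not stated in the source.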

  • Achieves 97.8% R@1000 on MS MARCO dev set
  • Optimized for sparse representation learning
  • Implements a bag-of-words representation with neurally learned term weights
  • Utilizes knowledge distillation for improved efficiency
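
The sparse bag-of-words representation mentioned above can be illustrated with a small sketch. It assumes the max-pooling formulation commonly used by SPLADE-style models, where each vocabulary term's weight is the maximum over input tokens of log(1 + ReLU(logit)). The vocabulary, the logits, and the helper names are hypothetical; a real encoder would produce the logits from its masked-language-model head.

```python
import math


def splade_pool(logits):
    """Collapse per-token vocabulary logits into one weight per vocabulary term.

    logits: list of rows, one per input token, each of vocabulary length.
    Weight for term j = max over tokens of log(1 + relu(logit[j])).
    """
    vocab_size = len(logits[0])
    return [
        max(math.log1p(max(row[j], 0.0)) for row in logits)
        for j in range(vocab_size)
    ]


def to_sparse(weights, vocab_terms):
    """Keep only the terms with non-zero weight."""
    return {term: w for term, w in zip(vocab_terms, weights) if w > 0.0}


# Hypothetical four-term vocabulary and logits for a two-token document.
vocab_terms = ["cat", "dog", "pet", "car"]
logits = [
    [2.0, -1.0, 0.5, -3.0],
    [0.1, -0.5, 1.5, -2.0],
]

doc_vec = to_sparse(splade_pool(logits), vocab_terms)
print(doc_vec)
```

Terms whose logits are negative for every token receive zero weight and drop out of the vector, which is what makes the representation sparse.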

Core Capabilities

  • Fast document encoding with 0.7ms inference latency
  • Efficient passage retrieval using sparse representations
  • Latency comparable to traditional BM25 with competitive retrieval performance
  • Scalable document indexing for large-scale applications
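
To make the indexing and retrieval capabilities concrete, here is a minimal sketch of how sparse document vectors can be stored in an inverted index and scored against a sparse query vector with a dot product. All document ids, terms, and weights are hypothetical, and a production system would use a dedicated retrieval engine rather than in-memory dictionaries.

```python
from collections import defaultdict


def build_index(doc_vectors):
    """Map each term to a posting list of (doc_id, weight) pairs."""
    index = defaultdict(list)
    for doc_id, vec in doc_vectors.items():
        for term, weight in vec.items():
            index[term].append((doc_id, weight))
    return index


def search(index, query_vec, top_k=10):
    """Score documents by the dot product of query and document weights."""
    scores = defaultdict(float)
    for term, q_weight in query_vec.items():
        for doc_id, d_weight in index.get(term, []):
            scores[doc_id] += q_weight * d_weight
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


# Hypothetical sparse vectors, as a document encoder might produce offline.
docs = {
    "d1": {"cat": 1.2, "pet": 0.8},
    "d2": {"car": 1.5, "engine": 0.9},
    "d3": {"pet": 1.1, "dog": 1.4},
}
index = build_index(docs)

# Hypothetical sparse query vector, as a query encoder might produce online.
results = search(index, {"pet": 1.0, "dog": 0.5})
print(results)
```

Only the posting lists for the query's terms are visited, so the cost of a search depends on how sparse the vectors are, not on the size of the vocabulary.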

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its efficiency-performance trade-off: it achieves latency similar to traditional BM25 systems while maintaining competitive retrieval performance. Separating the document and query encoders allows inference speed to be optimized on each side independently.

Q: What are the recommended use cases?

The model is specifically designed for large-scale passage retrieval tasks where both efficiency and effectiveness are crucial. It's particularly well-suited for applications requiring fast document indexing and retrieval with near state-of-the-art performance.
