efficient-splade-VI-BT-large-query

Maintained By
naver

Efficient SPLADE VI-BT Large Query Model

Property              Value
License               CC-BY-NC-SA-4.0
Paper                 View Paper
Performance (MRR@10)  38.0 on MS MARCO dev
Inference Latency     0.7 ms

What is efficient-splade-VI-BT-large-query?

This is a specialized query encoder that forms one half of an efficient two-encoder system for passage retrieval. It is the query component of a SPLADE (SParse Lexical AnD Expansion) model pair, designed for high retrieval quality at minimal query latency.

Implementation Details

The model performs passage retrieval with sparse representations. It works together with a separate document encoder (efficient-splade-VI-BT-large-doc) and is trained with knowledge distillation to balance efficiency and effectiveness. It reaches an MRR@10 of 38.0 on the MS MARCO dev set with a reported inference latency of 0.7 ms.
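As a rough illustration of what a sparse representation means here, the sketch below turns per-token vocabulary logits into one weight per vocabulary term and scores a query against a document with a dot product. It assumes the max-pooled log(1 + ReLU(logit)) activation described in the SPLADE literature. The toy logits, the document vector, and the function names `splade_pool` and `sparse_dot` are made up for illustration and are not taken from the model's code.

```python
import math

def splade_pool(token_logits):
    """Collapse per-token vocabulary logits into one sparse term-weight dict.

    token_logits: list of rows, one per input token, each row holding one
    logit per vocabulary entry. Assumes max pooling over tokens of
    log(1 + relu(logit)); zero weights are dropped to keep the vector sparse.
    """
    vocab_size = len(token_logits[0])
    weights = {}
    for term_id in range(vocab_size):
        best = max(math.log1p(max(row[term_id], 0.0)) for row in token_logits)
        if best > 0.0:
            weights[term_id] = best
    return weights

def sparse_dot(query_vec, doc_vec):
    """Dot product over the terms the two sparse vectors share."""
    return sum(w * doc_vec[t] for t, w in query_vec.items() if t in doc_vec)

# Toy example: 2 tokens, vocabulary of 4 terms (hypothetical values).
query_logits = [[1.0, -2.0, 0.0, 0.5],
                [0.2, -1.0, 3.0, -0.1]]
query_vec = splade_pool(query_logits)

# Stand-in for the output of the document encoder (hypothetical values).
doc_vec = {0: 0.8, 2: 1.5}
score = sparse_dot(query_vec, doc_vec)
```

Term 1 never receives a positive logit, so it is absent from `query_vec`; only terms with non-zero weight on both sides contribute to the score.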

  • Uses a BERT-based architecture with specialized modifications
  • Applies L1 regularization to query representations
  • Features FLOPS-regularized middle-training
  • Employs a bag-of-words representation
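The two regularizers named in the list can be sketched on toy data. This assumes the usual definitions from the SPLADE literature: the L1 penalty is the mean over the batch of each vector's summed absolute term weights, and the FLOPS penalty is the sum over vocabulary terms of the squared mean weight across the batch. The function names and numbers are illustrative only, not the model's training code.

```python
def l1_penalty(batch):
    """Mean over the batch of each vector's summed absolute weights."""
    return sum(sum(abs(w) for w in vec) for vec in batch) / len(batch)

def flops_penalty(batch):
    """Sum over vocabulary terms of the squared mean weight across the batch."""
    n = len(batch)
    vocab_size = len(batch[0])
    return sum((sum(vec[j] for vec in batch) / n) ** 2 for j in range(vocab_size))

# Two toy weight vectors over a 3-term vocabulary (hypothetical values).
batch = [[1.0, 0.0, 2.0],
         [3.0, 0.0, 0.0]]
l1 = l1_penalty(batch)        # (3 + 3) / 2
flops = flops_penalty(batch)  # 2**2 + 0**2 + 1**2
```

Both penalties push weights toward zero. The FLOPS form additionally penalizes terms that are active across many inputs, which would tend to keep posting lists short; in training, such a penalty would typically be added to the main loss with a weighting factor.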

Core Capabilities

  • Fast query processing with 0.7 ms inference time
  • Achieves 97.8% R@1000 on MS MARCO dev set
  • Optimized for production deployment
  • Latency comparable to traditional BM25 systems
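For readers unfamiliar with the two metrics quoted on this page, here is a minimal sketch of how MRR@10 and R@1000 are commonly computed from a ranked run and a set of relevance judgments. The `run` and `qrels` data and the function names are hypothetical; official evaluations would normally use an established evaluation script rather than this code.

```python
def mrr_at_k(ranked_ids, relevant_ids, k=10):
    """Reciprocal rank of the first relevant id in the top k, else 0.0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

def recall_at_k(ranked_ids, relevant_ids, k=1000):
    """Fraction of the relevant ids that appear in the top k."""
    if not relevant_ids:
        return 0.0
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

# Hypothetical run: query id -> ranked doc ids, and query id -> relevant ids.
run = {"q1": ["d3", "d1", "d7"], "q2": ["d5", "d2", "d9"]}
qrels = {"q1": {"d1"}, "q2": {"d4"}}

mean_mrr = sum(mrr_at_k(run[q], qrels[q]) for q in run) / len(run)
mean_recall = sum(recall_at_k(run[q], qrels[q]) for q in run) / len(run)
```

Figures such as 38.0 and 97.8% are presumably these per-query values averaged over the dev queries and scaled by 100.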

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its efficiency-effectiveness trade-off: it achieves near-BM25 latency while maintaining competitive retrieval quality. Because the query and document encoders are separate, documents can be encoded ahead of time and only the query encoder needs to run at search time.
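One way to picture the benefit of separate encoders is an inverted index: document vectors are computed and indexed offline, and a query only touches the posting lists of its own non-zero terms. The sketch below is a simplified assumption of such a pipeline, with hypothetical term ids, weights, and function names; it is not the retrieval engine used to produce the reported numbers.

```python
from collections import defaultdict

def build_inverted_index(doc_vectors):
    """Map each term id to a list of (doc_id, weight) postings."""
    index = defaultdict(list)
    for doc_id, vec in doc_vectors.items():
        for term_id, weight in vec.items():
            index[term_id].append((doc_id, weight))
    return index

def search(index, query_vec, top_k=10):
    """Accumulate dot-product scores by walking only the query's postings."""
    scores = defaultdict(float)
    for term_id, q_weight in query_vec.items():
        for doc_id, d_weight in index.get(term_id, []):
            scores[doc_id] += q_weight * d_weight
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

# Offline: sparse vectors that a document encoder would have produced.
doc_vectors = {
    "d1": {10: 1.2, 42: 0.5},
    "d2": {42: 2.0, 77: 0.3},
    "d3": {5: 0.9},
}
index = build_inverted_index(doc_vectors)

# Online: only the query encoder runs; its output is hypothetical here.
query_vec = {42: 1.0, 10: 0.5}
results = search(index, query_vec)
```

Here `d3` shares no term with the query and is never scored, which is why sparser query vectors translate directly into less work per search.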

Q: What are the recommended use cases?

The model is ideal for large-scale information retrieval systems where query latency is critical. It's particularly well-suited for applications requiring real-time search capabilities while maintaining high-quality results.
