Efficient SPLADE VI-BT Large Query Model
Property | Value
---|---
License | CC-BY-NC-SA-4.0
Paper | View Paper
Performance (MRR@10) | 38.0 on MS MARCO dev
Inference Latency | 0.7 ms
What is efficient-splade-VI-BT-large-query?
This is a specialized query encoder that forms one half of an efficient dual-encoder system for passage retrieval. It is the query component of the SPLADE (SParse Lexical AnD Expansion) architecture, designed for high-quality information retrieval at minimal query-encoding latency.
Implementation Details
The model encodes queries into sparse lexical representations for passage retrieval. It operates in conjunction with a separate document encoder (efficient-splade-VI-BT-large-doc) and is trained with knowledge distillation to balance efficiency and effectiveness. It reaches an MRR@10 of 38.0 on the MS MARCO dev set while keeping inference latency at 0.7 ms per query.
- Utilizes BERT-based architecture with specialized modifications
- Implements L1 regularization for queries
- Features FLOPS-regularized middle-training
- Employs bag-of-words representation
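The bag-of-words representation above can be sketched numerically. SPLADE collapses a transformer's per-token MLM logits into a single vocabulary-sized sparse vector via a log-saturated ReLU followed by max pooling over token positions. The snippet below is a minimal NumPy sketch of that aggregation step only, using a toy logits matrix rather than the real model's output:

```python
import numpy as np

def splade_aggregate(logits: np.ndarray) -> np.ndarray:
    """Collapse per-token MLM logits (seq_len x vocab_size) into one
    sparse vocabulary vector: w_j = max_i log(1 + relu(logits[i, j]))."""
    return np.log1p(np.maximum(logits, 0.0)).max(axis=0)

# Toy example: 3 token positions over a 6-term vocabulary.
logits = np.array([
    [2.0, -1.0, 0.0, 0.5, -3.0, 0.0],
    [0.0,  1.0, 0.0, 4.0, -0.5, 0.0],
    [1.0, -2.0, 0.0, 0.0, -1.0, 0.0],
])
weights = splade_aggregate(logits)
print(weights)                     # non-zero only where some logit > 0
print(np.count_nonzero(weights))  # 3 of 6 vocabulary terms survive
```

Because the ReLU zeroes out negative logits before the log, most vocabulary entries end up exactly zero, which is what makes the representation sparse and indexable.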
Core Capabilities
- Fast query processing with 0.7ms inference time
- Achieves 97.8% R@1000 on MS MARCO dev set
- Optimized for production deployment
- Query latency competitive with traditional BM25 systems
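The retrieval metrics above come from ranking passages by the inner product of the sparse query vector and each sparse document vector. With sparse vectors, that inner product only involves the terms the two sides share; the sketch below illustrates this with made-up term weights (not actual model output):

```python
def sparse_dot(query: dict[str, float], doc: dict[str, float]) -> float:
    """Inner product of two sparse term-weight vectors; only terms
    present with non-zero weight on both sides contribute."""
    # Iterate over the smaller vector for efficiency.
    small, large = (query, doc) if len(query) <= len(doc) else (doc, query)
    return sum(w * large.get(term, 0.0) for term, w in small.items())

# Hypothetical expanded query and two candidate passages.
q = {"splade": 1.6, "retrieval": 1.1, "sparse": 0.7}
d1 = {"splade": 1.2, "sparse": 0.9, "bert": 0.4}
d2 = {"retrieval": 1.3, "latency": 0.8}
print(sparse_dot(q, d1))  # 1.6*1.2 + 0.7*0.9 ≈ 2.55
print(sparse_dot(q, d2))  # 1.1*1.3 ≈ 1.43, so d1 ranks first
```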
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its exceptional efficiency-performance trade-off, achieving near BM25 latency while maintaining competitive retrieval quality. The separation of query and document encoders allows for optimized inference in production environments.
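Part of why sparse representations serve production well is that they drop into a standard inverted index: at query time only the posting lists of the query's non-zero terms are touched. The toy index below is purely an illustration of that idea, not the engine used for the reported latency numbers:

```python
from collections import defaultdict

def build_index(docs: dict[str, dict[str, float]]) -> dict[str, list[tuple[str, float]]]:
    """Map each term to a posting list of (doc_id, weight) pairs."""
    index: dict[str, list[tuple[str, float]]] = defaultdict(list)
    for doc_id, vec in docs.items():
        for term, w in vec.items():
            index[term].append((doc_id, w))
    return index

def search(index: dict[str, list[tuple[str, float]]],
           query: dict[str, float]) -> list[tuple[str, float]]:
    """Accumulate dot-product scores by walking only the query's postings."""
    scores: dict[str, float] = defaultdict(float)
    for term, qw in query.items():
        for doc_id, dw in index.get(term, []):
            scores[doc_id] += qw * dw
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical sparse document vectors.
docs = {
    "d1": {"splade": 1.2, "sparse": 0.9},
    "d2": {"retrieval": 1.3, "latency": 0.8},
    "d3": {"splade": 0.5, "retrieval": 0.6},
}
index = build_index(docs)
results = search(index, {"splade": 1.6, "retrieval": 1.1})
print(results)  # d1 scores highest (1.6 * 1.2)
```

Documents sharing no terms with the query are never scored at all, which is where the near-BM25 latency comes from.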
Q: What are the recommended use cases?
The model is ideal for large-scale information retrieval systems where query latency is critical. It's particularly well-suited for applications requiring real-time search capabilities while maintaining high-quality results.