ms-marco-TinyBERT-L2

cross-encoder

A lightweight cross-encoder model trained on MS Marco passage ranking. It processes 9000 docs/sec and reaches an NDCG@10 of 67.43 on TREC DL19 and an MRR@10 of 30.15 on MS Marco Dev.

Property | Value
Author | cross-encoder
Processing Speed | 9000 docs/sec
NDCG@10 (TREC DL 19) | 67.43
MRR@10 (MS Marco Dev) | 30.15
Model Hub | Hugging Face
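
For readers unfamiliar with the two ranking metrics in the table, the following is a minimal sketch of how MRR@10 and NDCG@10 can be computed for a single query. It assumes a linear-gain DCG and takes the ideal ordering from the supplied list only; official evaluation tools may use a different gain function or the full set of judged documents. The function names are illustrative and not part of any library. Reported figures such as 30.15 and 67.43 would correspond to per-query values averaged over all queries and scaled by 100.

```python
import math
from typing import Sequence


def mrr_at_k(relevance: Sequence[int], k: int = 10) -> float:
    """Reciprocal rank of the first relevant item in the top k, else 0.0.

    `relevance` lists the labels of the ranked results, best-ranked first.
    """
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def dcg_at_k(gains: Sequence[float], k: int = 10) -> float:
    """Discounted cumulative gain with linear gain (an assumption of this sketch)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains[:k], start=1))


def ndcg_at_k(gains: Sequence[float], k: int = 10) -> float:
    """DCG normalised by the DCG of the best possible ordering of `gains`."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0
```

With graded labels such as `[3, 2, 0, 1]`, `ndcg_at_k` rewards placing the highest-graded passages first, while `mrr_at_k` only looks at the position of the first relevant one.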

What is ms-marco-TinyBERT-L2?

ms-marco-TinyBERT-L2 is a lightweight cross-encoder model designed for passage ranking tasks. It trades some ranking accuracy for computational efficiency and belongs to the first generation of MS Marco cross-encoders. The model suits information retrieval scenarios where speed is crucial and reasonable accuracy is sufficient.

Implementation Details

The model can be used through either the SentenceTransformers or the Transformers library. It takes query-passage pairs as input and produces a relevance score for each pair, which makes it well suited to re-ranking applications. It is optimized for high-throughput scenarios and reaches 9000 documents per second on a V100 GPU.

  • Compatible with both SentenceTransformers and Transformers libraries
  • Optimized for query-passage pair scoring
  • High-throughput processing capability
  • Efficient resource utilization
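
The query-passage scoring described above can be sketched with the SentenceTransformers `CrossEncoder` class. This is a minimal sketch under stated assumptions: the hub ID `cross-encoder/ms-marco-TinyBERT-L-2` and the `max_length=512` setting are assumptions to verify against the model hub, and `load_scorer` and `rerank` are illustrative helper names, not part of either library. The model import is deferred so that the re-ranking logic can be exercised with any scoring function.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Pair = Tuple[str, str]
Scorer = Callable[[List[Pair]], Sequence[float]]


def load_scorer(model_name: str = "cross-encoder/ms-marco-TinyBERT-L-2") -> Scorer:
    """Return a function that scores (query, passage) pairs.

    The default model ID is an assumption; check it on the model hub.
    """
    # Imported lazily so the rest of the module works without the dependency.
    from sentence_transformers import CrossEncoder

    model = CrossEncoder(model_name, max_length=512)
    return model.predict


def rerank(
    query: str,
    passages: Sequence[str],
    scorer: Scorer,
    top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair and sort passages by descending score."""
    pairs = [(query, passage) for passage in passages]
    scores = scorer(pairs)
    ranked = sorted(
        zip(passages, (float(s) for s in scores)),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked[:top_k]  # top_k=None keeps the full ranking
```

A call such as `rerank("How many people live in Berlin?", candidate_passages, load_scorer(), top_k=3)` would then return the three highest-scoring passages with their scores.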

Core Capabilities

  • Fast passage re-ranking for information retrieval
  • Query-passage relevance scoring
  • Integration with existing search systems
  • Efficient processing of large document collections
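
For large candidate sets, pairs are usually scored in batches. The sketch below shows one way to do this with the Transformers library directly. It assumes the hub ID `cross-encoder/ms-marco-TinyBERT-L-2`, assumes the model emits a single logit per pair, and uses an arbitrary batch size of 64; `batched` and `score_with_transformers` are illustrative names, not library functions.

```python
from typing import Iterator, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def score_with_transformers(
    pairs: Sequence[Tuple[str, str]],
    model_name: str = "cross-encoder/ms-marco-TinyBERT-L-2",  # assumed hub ID
    batch_size: int = 64,  # arbitrary choice for this sketch
) -> List[float]:
    """Return one relevance score per (query, passage) pair."""
    # Imported lazily so `batched` is usable without these dependencies.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    scores: List[float] = []
    with torch.no_grad():
        for batch in batched(pairs, batch_size):
            queries = [query for query, _ in batch]
            passages = [passage for _, passage in batch]
            features = tokenizer(
                queries, passages, padding=True, truncation=True, return_tensors="pt"
            )
            logits = model(**features).logits
            # Assumes shape (batch, 1): one relevance logit per pair.
            scores.extend(logits.squeeze(-1).tolist())
    return scores
```

Loading the tokenizer and model once and reusing them across requests, rather than inside the function as here, would be the more practical design in a long-running service.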

Frequently Asked Questions

Q: What makes this model unique?

The model's main strength lies in its exceptional processing speed while maintaining reasonable performance metrics. At 9000 documents per second, it's one of the fastest models in the MS Marco suite, making it ideal for large-scale applications where processing speed is crucial.
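
To put the 9000 documents per second figure in context, the arithmetic below estimates re-ranking time for a candidate list and the number of candidates that fit a latency budget. It is a back-of-the-envelope sketch that assumes throughput scales linearly with the number of candidates and ignores tokenization, batching overhead, and hardware other than the V100 the figure was measured on. The function names are illustrative.

```python
DOCS_PER_SEC = 9000.0  # throughput quoted in this model card (V100 GPU)


def rerank_seconds(num_candidates: int, docs_per_sec: float = DOCS_PER_SEC) -> float:
    """Estimated time to score `num_candidates` passages for one query."""
    return num_candidates / docs_per_sec


def max_candidates(budget_ms: int, docs_per_sec: float = DOCS_PER_SEC) -> int:
    """Largest candidate list that fits within a latency budget in milliseconds."""
    return int(budget_ms * docs_per_sec / 1000)
```

Under these assumptions, re-ranking 100 candidates takes about 11 ms, and a 50 ms budget allows roughly 450 candidates per query.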

Q: What are the recommended use cases?

This model is best suited for applications requiring quick passage re-ranking, such as search engines, document retrieval systems, and any scenario where fast processing of query-passage pairs is needed. It's particularly valuable in production environments where speed is prioritized over maximum accuracy.
