jina-reranker-v1-turbo-en

jinaai

Fast and efficient text reranking model with 37.8M parameters, capable of processing up to 8,192 tokens using the JinaBERT architecture

Property	Value
Parameter Count	37.8M parameters
Architecture	JinaBERT with ALiBi
License	Apache 2.0
Paper	JinaBERT Paper
Maximum Sequence Length	8,192 tokens

What is jina-reranker-v1-turbo-en?

jina-reranker-v1-turbo-en is a text reranking model designed for fast inference while maintaining competitive accuracy. Built on the JinaBERT architecture, it trades some model capacity for speed, using a 6-layer structure with 37.8M parameters.

Implementation Details

The model employs knowledge distillation techniques, learning from a larger teacher model (jina-reranker-v1-base-en) to maintain high accuracy while significantly improving inference speed. It utilizes a symmetric bidirectional variant of ALiBi, enabling it to process sequences up to 8,192 tokens in length.
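To make the symmetric ALiBi idea concrete, here is a minimal sketch of how such an attention bias could be built. Standard ALiBi penalizes attention scores in proportion to how far back a key lies; the symmetric variant uses the absolute distance |i - j|, so tokens on both sides of the query are penalized equally. The function names and the head count of 8 are illustrative assumptions, not values taken from the model's configuration, and the model's actual implementation may differ.

```python
from typing import List


def alibi_slopes(num_heads: int) -> List[float]:
    """Per-head slopes following the geometric schedule of the ALiBi paper.

    Assumes num_heads is a power of two: m_h = 2^(-8h / num_heads), h = 1..num_heads.
    """
    return [2.0 ** (-8.0 * h / num_heads) for h in range(1, num_heads + 1)]


def symmetric_alibi_bias(seq_len: int, slope: float) -> List[List[float]]:
    """Bias matrix added to attention logits for one head.

    Entry [i][j] is -slope * |i - j|, so the penalty grows linearly with
    distance in both directions (bidirectional, unlike causal ALiBi).
    """
    return [
        [0.0 - slope * abs(i - j) for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Illustrative only: 8 heads is an assumed value, not read from the model config.
slopes = alibi_slopes(8)
bias = symmetric_alibi_bias(4, slopes[0])
```

Because the bias depends only on relative distance and has no learned position table, the same formula extends to any sequence length, which is what makes long inputs such as 8,192 tokens feasible.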

6-layer architecture with 384 hidden size
Knowledge distillation from the larger jina-reranker-v1-base-en teacher model
BF16 tensor type for efficient processing
Supports multiple integration methods including API, sentence-transformers, and transformers.js
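As a rough illustration of the sentence-transformers route, the sketch below wraps the model in a small reranking helper. The Hugging Face model id "jinaai/jina-reranker-v1-turbo-en" is assumed from the title and author above, and `trust_remote_code=True` is assumed to be needed because JinaBERT is a custom architecture. `load_scorer` and `rerank` are hypothetical helper names introduced here; the scorer is passed in as a function so the ranking logic can be tried without downloading the model.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]
Scorer = Callable[[Sequence[Pair]], List[float]]


def load_scorer(model_name: str = "jinaai/jina-reranker-v1-turbo-en") -> Scorer:
    """Build a scorer backed by the real model (downloads weights on first use)."""
    # Imported lazily so the rest of this sketch runs without the library installed.
    from sentence_transformers import CrossEncoder

    model = CrossEncoder(model_name, trust_remote_code=True)

    def score(pairs: Sequence[Pair]) -> List[float]:
        # CrossEncoder.predict scores each (query, document) pair jointly.
        return [float(s) for s in model.predict(list(pairs))]

    return score


def rerank(
    query: str,
    documents: Sequence[str],
    score_pairs: Scorer,
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every document against the query and return the top_k, best first."""
    pairs = [(query, doc) for doc in documents]
    scores = score_pairs(pairs)
    ranked = sorted(zip(documents, scores), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]
```

With the library installed, a call might look like `rerank("organic skincare", docs, load_scorer())`, where `docs` is a list of candidate passages returned by a first-stage retriever.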

Core Capabilities

Extended sequence length processing (up to 8,192 tokens)
Competitive NDCG@10 score of 49.60 on BEIR datasets
85.13% Hit Rate on LlamaIndex RAG tasks
English-language support (the model targets English content, as the -en suffix indicates)
Compatible with various deployment environments including browser-based applications
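For readers unfamiliar with the two metrics quoted above, the following sketch shows how NDCG@10 and Hit Rate are commonly computed. It is a simplified illustration with hypothetical function names, not the evaluation code behind the reported numbers; in particular, the ideal ranking here is derived only from the relevance labels passed in, whereas benchmark tooling uses all judged documents for a query.

```python
import math
from typing import List, Sequence, Set


def dcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """Discounted cumulative gain: each gain is divided by log2(position + 1)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """DCG of the given ranking divided by the DCG of the best possible ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def hit_rate(
    ranked_ids_per_query: Sequence[Sequence[str]],
    relevant_ids_per_query: Sequence[Set[str]],
    k: int = 10,
) -> float:
    """Fraction of queries with at least one relevant document in the top k."""
    if not ranked_ids_per_query:
        return 0.0
    hits = sum(
        1
        for ranked, relevant in zip(ranked_ids_per_query, relevant_ids_per_query)
        if any(doc_id in relevant for doc_id in ranked[:k])
    )
    return hits / len(ranked_ids_per_query)
```

Here `relevances` lists the relevance label of each returned document in ranked order, so a reranker improves NDCG by moving relevant documents toward the top of that list.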

Frequently Asked Questions

Q: What makes this model unique?

The model's main distinction is its balance of speed and accuracy, achieved through knowledge distillation and the JinaBERT architecture. It also handles sequences of up to 8,192 tokens, whereas traditional rerankers are typically limited to 512 tokens.

Q: What are the recommended use cases?

The model is ideal for search and retrieval systems requiring fast reranking of results, RAG (Retrieval-Augmented Generation) applications, and any scenario where quick but accurate document relevance scoring is needed. It's particularly suitable for production environments where processing speed is crucial but accuracy cannot be significantly compromised.