
Maintained By
infly

INF-Retriever-v1-1.5B

Property             Value
Parameter Count      1.5B
Embedding Dimension  1536
Max Input Length     32,768 tokens
Languages            Chinese & English (also effective in other languages)
Model URL            https://huggingface.co/infly/inf-retriever-v1-1.5b

What is INF-Retriever-v1-1.5B?

INF-Retriever-v1-1.5B is a state-of-the-art lightweight dense retrieval model specifically designed for bilingual information retrieval. Built on the gte-Qwen2-1.5B-instruct architecture, it has been fine-tuned to excel in both Chinese and English retrieval tasks. As of February 2025, it holds the top position on the AIR-Bench leaderboard for models under 7B parameters.

Implementation Details

The model produces 1536-dimensional embeddings and supports a long context window of up to 32,768 tokens. It can be used with either Sentence Transformers or the Hugging Face Transformers library, making it straightforward to integrate into a variety of deployment scenarios.

  • Achieves superior performance in heterogeneous information retrieval across multiple domains
  • Supports efficient processing of both Chinese and English content with state-of-the-art accuracy
  • Demonstrates strong zero-shot capabilities in other languages despite being trained primarily on Chinese and English
  • Optimized for both accuracy and computational efficiency
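The retrieval flow described above (embed the query and documents, then rank by cosine similarity) can be sketched as follows. To keep the example self-contained and runnable without downloading the 1.5B model, a stub stands in for the encoder; in real use you would replace `encode_stub` with `SentenceTransformer('infly/inf-retriever-v1-1.5b').encode(...)` (see the model card for its query-prompt conventions).

```python
import numpy as np

EMBED_DIM = 1536  # embedding dimension of INF-Retriever-v1-1.5B


def encode_stub(texts, dim=EMBED_DIM):
    """Stand-in for the model's encode() call.

    Returns L2-normalized mock embeddings so that a dot product equals
    cosine similarity. Real embeddings come from the model itself.
    """
    rng = np.random.default_rng(abs(hash(tuple(texts))) % (2**32))
    vecs = rng.standard_normal((len(texts), dim))
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)


def retrieve(query, documents, top_k=2):
    """Rank documents by cosine similarity to the query embedding."""
    q = encode_stub([query])        # shape (1, 1536)
    d = encode_stub(documents)      # shape (n, 1536)
    scores = (q @ d.T).ravel()      # dot product of unit vectors = cosine
    order = np.argsort(-scores)[:top_k]
    return [(documents[i], float(scores[i])) for i in order]


docs = [
    "Paris is the capital of France.",
    "机器学习是人工智能的一个分支。",
    "The Eiffel Tower is in Paris.",
]
results = retrieve("Where is the Eiffel Tower?", docs)
for text, score in results:
    print(f"{score:.3f}  {text}")
```

With the real model, the same ranking loop applies unchanged; only the encoder call differs.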

Core Capabilities

  • Bilingual Excellence: Top-tier performance in both Chinese and English retrieval tasks
  • Domain Versatility: Effective across various domains including healthcare, law, finance, and academic content
  • Long-Context Processing: Handles up to 32K tokens, suitable for lengthy documents
  • High-Dimensional Embeddings: 1536D embedding space for rich semantic representation
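The 32K-token window covers most documents whole, but for inputs that exceed it a simple budget-based splitter keeps each chunk within the limit. This is a sketch: whitespace words approximate tokens here, and in practice you would count tokens with the model's own tokenizer.

```python
def chunk_by_budget(text, max_tokens=32768, overlap=256):
    """Split text into overlapping chunks of at most max_tokens units.

    Whitespace-separated words approximate tokens; for exact budgeting,
    count with the model's tokenizer instead. Overlap preserves context
    across chunk boundaries.
    """
    assert overlap < max_tokens, "overlap must be smaller than the budget"
    words = text.split()
    if len(words) <= max_tokens:
        return [text]
    chunks, start = [], 0
    step = max_tokens - overlap
    while start < len(words):
        chunks.append(" ".join(words[start:start + max_tokens]))
        start += step
    return chunks


# Demo with a small budget so the behavior is visible:
chunks = chunk_by_budget(" ".join(str(i) for i in range(1000)),
                         max_tokens=400, overlap=50)
print([len(c.split()) for c in chunks])  # [400, 400, 300]
```

Each chunk is then embedded separately, and chunk-level scores are aggregated (e.g. max-pooled) back to the document level.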

Frequently Asked Questions

Q: What makes this model unique?

Its standout feature is achieving state-of-the-art performance in bilingual retrieval while maintaining a relatively small parameter count of 1.5B, making it both efficient and powerful. It ranks #1 on AIR-Bench for models under 7B parameters.

Q: What are the recommended use cases?

The model excels in cross-lingual information retrieval, document search, semantic similarity matching, and content recommendation systems. It's particularly effective for applications requiring bilingual capability in Chinese and English, though it performs well in other languages too.
