
Maintained By
infly

INF-Retriever-v1-1.5B

Property             Value
Parameter Count      1.5B
Embedding Dimension  1536
Max Input Length     32,768 tokens
Languages            Chinese & English (also effective in other languages)
Model URL            https://huggingface.co/infly/inf-retriever-v1-1.5b

What is INF-Retriever-v1-1.5B?

INF-Retriever-v1-1.5B is a state-of-the-art lightweight dense retrieval model specifically designed for bilingual information retrieval. Built on the gte-Qwen2-1.5B-instruct architecture, it has been fine-tuned to excel in both Chinese and English retrieval tasks. As of February 2025, it holds the top position on the AIR-Bench leaderboard for models under 7B parameters.

Implementation Details

The model produces 1536-dimensional embeddings and supports a long context window of up to 32,768 tokens. It can be used with either Sentence Transformers or the Hugging Face Transformers library, making it straightforward to integrate into a variety of deployment scenarios.

  • Achieves superior performance in heterogeneous information retrieval across multiple domains
  • Supports efficient processing of both Chinese and English content with state-of-the-art accuracy
  • Demonstrates strong zero-shot capabilities in other languages despite being trained primarily on Chinese and English
  • Optimized for both accuracy and computational efficiency
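The retrieval flow described above (embed the query and documents, then rank by cosine similarity) can be sketched as follows. To keep the example self-contained and runnable without downloading the 1.5B model, a stub stands in for the encoder; in real use you would replace `encode_stub` with `SentenceTransformer('infly/inf-retriever-v1-1.5b').encode(...)` (see the model card for its query-prompt conventions).

```python
import numpy as np

EMBED_DIM = 1536  # embedding dimension of INF-Retriever-v1-1.5B


def encode_stub(texts, dim=EMBED_DIM):
    """Stand-in for the model's encode() call.

    Returns L2-normalized mock embeddings so that a dot product equals
    cosine similarity. Real embeddings come from the model itself.
    """
    rng = np.random.default_rng(abs(hash(tuple(texts))) % (2**32))
    vecs = rng.standard_normal((len(texts), dim))
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)


def retrieve(query, documents, top_k=2):
    """Rank documents by cosine similarity to the query embedding."""
    q = encode_stub([query])        # shape (1, 1536)
    d = encode_stub(documents)      # shape (n, 1536)
    scores = (q @ d.T).ravel()      # dot product of unit vectors = cosine
    order = np.argsort(-scores)[:top_k]
    return [(documents[i], float(scores[i])) for i in order]


docs = [
    "Paris is the capital of France.",
    "机器学习是人工智能的一个分支。",
    "The Eiffel Tower is in Paris.",
]
results = retrieve("Where is the Eiffel Tower?", docs)
for text, score in results:
    print(f"{score:.3f}  {text}")
```

With the real model, the same ranking loop applies unchanged; only the encoder call differs.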

Core Capabilities

  • Bilingual Excellence: Top-tier performance in both Chinese and English retrieval tasks
  • Domain Versatility: Effective across various domains including healthcare, law, finance, and academic content
  • Long-Context Processing: Handles up to 32K tokens, suitable for lengthy documents
  • High-Dimensional Embeddings: 1536D embedding space for rich semantic representation
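The 32K-token window covers most documents whole, but for inputs that exceed it a simple budget-based splitter keeps each chunk within the limit. This is a sketch: whitespace words approximate tokens here, and in practice you would count tokens with the model's own tokenizer.

```python
def chunk_by_budget(text, max_tokens=32768, overlap=256):
    """Split text into overlapping chunks of at most max_tokens units.

    Whitespace-separated words approximate tokens; for exact budgeting,
    count with the model's tokenizer instead. Overlap preserves context
    across chunk boundaries.
    """
    assert overlap < max_tokens, "overlap must be smaller than the budget"
    words = text.split()
    if len(words) <= max_tokens:
        return [text]
    chunks, start = [], 0
    step = max_tokens - overlap
    while start < len(words):
        chunks.append(" ".join(words[start:start + max_tokens]))
        start += step
    return chunks


# Demo with a small budget so the behavior is visible:
chunks = chunk_by_budget(" ".join(str(i) for i in range(1000)),
                         max_tokens=400, overlap=50)
print([len(c.split()) for c in chunks])  # [400, 400, 300]
```

Each chunk is then embedded separately, and chunk-level scores are aggregated (e.g. max-pooled) back to the document level.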

Frequently Asked Questions

Q: What makes this model unique?

Its standout feature is achieving state-of-the-art performance in bilingual retrieval while maintaining a relatively small parameter count of 1.5B, making it both efficient and powerful. It ranks #1 on AIR-Bench for models under 7B parameters.

Q: What are the recommended use cases?

The model excels in cross-lingual information retrieval, document search, semantic similarity matching, and content recommendation systems. It's particularly effective for applications requiring bilingual capability in Chinese and English, though it performs well in other languages too.
