GTE-Multilingual-Base
| Property | Value |
|---|---|
| Parameter Count | 305M |
| Embedding Dimension | 768 |
| Max Input Length | 8192 tokens |
| License | Apache 2.0 |
| Paper | mGTE Paper |
What is gte-multilingual-base?
gte-multilingual-base is a state-of-the-art text embedding model designed for multilingual applications. It is the latest model in the GTE (General Text Embedding) family, offering strong performance across many languages while remaining efficient. The model uses an encoder-only transformer architecture, enabling faster inference and lower hardware requirements than decoder-based alternatives.
Implementation Details
The model generates 768-dimensional embeddings and can process texts up to 8192 tokens long. It supports both dense and sparse vector representations, allowing for flexible deployment scenarios and efficient storage. The implementation integrates with popular frameworks such as transformers and sentence-transformers; a minimal usage sketch follows the feature list below.
- Encoder-only architecture for efficient processing
- Support for elastic dense embeddings
- Integration with xformers for acceleration
- Compatible with text-embeddings-inference (TEI)
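As a rough illustration of the transformers path, the sketch below loads the model, pools the first-token (CLS) embedding, and L2-normalizes it. The model id `Alibaba-NLP/gte-multilingual-base` and the `trust_remote_code=True` flag are assumptions based on typical Hugging Face usage for this model family, not details stated in this card.

```python
# Minimal sketch: dense embeddings via Hugging Face transformers.
# Assumes the checkpoint is published as "Alibaba-NLP/gte-multilingual-base"
# and ships custom modeling code (hence trust_remote_code=True).
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model_id = "Alibaba-NLP/gte-multilingual-base"  # assumed model id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)

texts = ["what is the capital of China?", "how to implement quick sort in python?"]
batch = tokenizer(texts, max_length=8192, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**batch)

# First-token (CLS) pooling, then L2-normalize so dot products equal cosine similarity.
embeddings = outputs.last_hidden_state[:, 0]   # shape: (batch, 768)
embeddings = F.normalize(embeddings, p=2, dim=1)

scores = embeddings @ embeddings.T             # pairwise cosine similarities
print(scores)
```

The same embeddings can also be served through sentence-transformers or a TEI endpoint; only the loading code changes, not the pooling and normalization steps.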
Core Capabilities
- Multilingual support for 70+ languages
- State-of-the-art performance in retrieval tasks
- Hybrid dense and sparse vector generation
- Long context handling up to 8192 tokens
- Efficient inference with 10x speed improvement over decoder-based models
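To make the hybrid dense/sparse idea concrete, here is a framework-agnostic sketch of combining a dense cosine score with a sparse token-weight score via a weighted sum. How the sparse weights are produced by the model is outside this sketch (it relies on the model's custom code), and the `alpha` mixing weight is a tunable assumption, not a documented default.

```python
# Hybrid retrieval sketch: combine a dense cosine score with a sparse
# token-weight score. The sparse vectors here are plain {token: weight}
# dicts; producing them from the model is not shown.
from typing import Dict

def sparse_dot(q: Dict[str, float], d: Dict[str, float]) -> float:
    """Dot product over the overlapping tokens of two sparse vectors."""
    return sum(w * d[t] for t, w in q.items() if t in d)

def hybrid_score(dense_cos: float, q_sparse: Dict[str, float],
                 d_sparse: Dict[str, float], alpha: float = 0.7) -> float:
    """Weighted sum of dense and sparse relevance; alpha is an assumed tuning knob."""
    return alpha * dense_cos + (1.0 - alpha) * sparse_dot(q_sparse, d_sparse)

# Toy usage with made-up numbers.
print(hybrid_score(0.82, {"capital": 1.3, "china": 1.1}, {"china": 0.9, "beijing": 1.4}))
```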
Frequently Asked Questions
Q: What makes this model unique?
The model combines multilingual capabilities with efficient architecture, supporting both dense and sparse representations while maintaining high performance across various tasks. Its ability to handle long contexts and support for elastic embeddings sets it apart from similar models.
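Elastic embeddings are typically consumed by truncating the full vector to a smaller dimension and re-normalizing, trading a little accuracy for lower storage and faster search. The sketch below shows that pattern under the assumption that the leading dimensions of the 768-d embedding remain meaningful on their own; the choice of `k = 256` is illustrative.

```python
# Elastic-embedding sketch: keep only the first k dimensions of a 768-d
# embedding and re-normalize. Assumes the model was trained so that the
# leading dimensions carry most of the signal.
import torch
import torch.nn.functional as F

def truncate_embedding(emb: torch.Tensor, k: int = 256) -> torch.Tensor:
    """emb: (batch, 768) L2-normalized embeddings -> (batch, k), re-normalized."""
    return F.normalize(emb[:, :k], p=2, dim=1)

full = F.normalize(torch.randn(4, 768), p=2, dim=1)   # stand-in embeddings
small = truncate_embedding(full, k=256)
print(small.shape)  # torch.Size([4, 256])
```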
Q: What are the recommended use cases?
The model excels in multilingual information retrieval, cross-lingual search, document similarity comparison, and general text representation tasks. It's particularly suitable for applications requiring efficient processing of multilingual content with long context windows.
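For cross-lingual search, a minimal sentence-transformers sketch is shown below: an English query is scored against documents in several languages using cosine similarity. The model id and the `trust_remote_code=True` flag are assumptions, as are the example texts.

```python
# Cross-lingual search sketch via sentence-transformers. Embeddings are
# normalized, so the dot product equals cosine similarity.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Alibaba-NLP/gte-multilingual-base", trust_remote_code=True)

query = "What is the capital of France?"
docs = [
    "Paris est la capitale de la France.",      # French
    "Berlin ist die Hauptstadt Deutschlands.",  # German
    "东京是日本的首都。",                         # Chinese
]

q_emb = model.encode([query], normalize_embeddings=True)
d_emb = model.encode(docs, normalize_embeddings=True)

scores = (q_emb @ d_emb.T)[0]   # cosine similarities
for doc, score in sorted(zip(docs, scores), key=lambda x: -x[1]):
    print(f"{score:.3f}  {doc}")
```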