Granite Embedding 30M English
| Property | Value |
|---|---|
| Developer | IBM Granite Team |
| License | Apache 2.0 |
| Parameters | 30M |
| Embedding Size | 384 |
| Max Sequence Length | 512 tokens |
What is granite-embedding-30m-english?
Granite-embedding-30m-english is a compact yet capable dense bi-encoder embedding model developed by IBM Research. It is designed to generate high-quality text embeddings for enterprise applications while remaining competitive on academic benchmarks. The model produces 384-dimensional embedding vectors and is trained on a combination of open-source datasets and IBM-proprietary data.
Implementation Details
The model is built on a RoBERTa-like transformer architecture with 6 layers and 12 attention heads. It uses GeLU activation functions and a vocabulary of 50,265 tokens. Embeddings are generated via CLS pooling, and the model is optimized through retrieval-oriented pretraining, contrastive finetuning, and knowledge distillation.
- 6 transformer layers with 1536 intermediate size
- 12 attention heads for robust feature extraction
- 384-dimensional embedding output
- Compatible with both SentenceTransformers and Hugging Face Transformers (see the usage sketch below)
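The snippet below is a minimal sketch of loading the model with the SentenceTransformers library and comparing two sentences. The Hugging Face Hub ID `ibm-granite/granite-embedding-30m-english` and the example sentences are assumptions for illustration.

```python
# Minimal SentenceTransformers sketch; the Hub ID below is an assumption.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("ibm-granite/granite-embedding-30m-english")

sentences = [
    "Granite models are designed for enterprise use.",
    "IBM builds embedding models for business applications.",
]

# Encode to 384-dimensional vectors; pooling is handled by the model config.
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 384)

# Cosine similarity between the two sentence embeddings.
print(util.cos_sim(embeddings[0], embeddings[1]))
```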
Core Capabilities
- Text similarity computation with competitive performance on academic benchmarks
- Efficient retrieval and search applications (a retrieval sketch follows this list)
- 49.1% score on MTEB Retrieval benchmark
- 47.0% performance on CoIR code retrieval tasks
- Approximately twice the inference speed of comparable embedding models
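For retrieval, the sketch below uses plain Hugging Face Transformers and applies the CLS pooling described above by hand. The model ID, query, and documents are illustrative assumptions, not part of the model card.

```python
# Retrieval sketch with Hugging Face Transformers and manual CLS pooling.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "ibm-granite/granite-embedding-30m-english"  # assumed Hub ID
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()

def embed(texts):
    # Tokenize with the model's 512-token limit.
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=512, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)
    # CLS pooling: use the hidden state of the first token.
    emb = out.last_hidden_state[:, 0]
    return torch.nn.functional.normalize(emb, dim=1)

query = embed(["how do I reset my enterprise password"])
docs = embed([
    "Password reset instructions for corporate accounts.",
    "Quarterly earnings report for fiscal year 2024.",
])

# Cosine similarity reduces to a dot product on normalized vectors;
# higher scores indicate more relevant documents.
scores = (query @ docs.T).squeeze(0)
print(scores)
```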
Frequently Asked Questions
Q: What makes this model unique?
The model stands out for an efficient architecture that delivers roughly twice the speed of similar models while maintaining competitive performance. It is trained on enterprise-friendly licensed data, making it suitable for commercial applications; notably, the MS MARCO dataset was excluded due to licensing restrictions.
Q: What are the recommended use cases?
The model excels at text similarity tasks, information retrieval, and search applications. It is particularly well suited to enterprise environments that require fast, accurate embedding generation with modest computational requirements.