Granite Embedding 30M English
| Property | Value |
|---|---|
| Developer | IBM Granite Team |
| License | Apache 2.0 |
| Parameters | 30M |
| Embedding Size | 384 |
| Max Sequence Length | 512 tokens |
What is granite-embedding-30m-english?
Granite-embedding-30m-english is a compact yet capable dense bi-encoder embedding model developed by IBM Research. It is designed to generate high-quality text embeddings for enterprise applications while remaining competitive on academic benchmarks. The model produces 384-dimensional embedding vectors and is trained on a combination of open-source datasets and IBM-proprietary data.
Implementation Details
The model is built on a RoBERTa-like transformer architecture with 6 layers and 12 attention heads. It uses GeLU activation functions and a vocabulary of 50,265 tokens. Embeddings are generated via CLS pooling, and the model is optimized through retrieval-oriented pretraining, contrastive finetuning, and knowledge distillation.
- 6 transformer layers with 1536 intermediate size
- 12 attention heads for robust feature extraction
- 384-dimensional embedding output
- Compatible with both SentenceTransformers and Hugging Face Transformers (see the usage sketch below)
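The snippet below is a minimal sketch of loading the model with the SentenceTransformers library and comparing two sentences. The Hugging Face Hub ID `ibm-granite/granite-embedding-30m-english` and the example sentences are assumptions for illustration.

```python
# Minimal SentenceTransformers sketch; the Hub ID below is an assumption.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("ibm-granite/granite-embedding-30m-english")

sentences = [
    "Granite models are designed for enterprise use.",
    "IBM builds embedding models for business applications.",
]

# Encode to 384-dimensional vectors; pooling is handled by the model config.
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 384)

# Cosine similarity between the two sentence embeddings.
print(util.cos_sim(embeddings[0], embeddings[1]))
```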
Core Capabilities
- Text similarity computation with competitive performance on academic benchmarks
- Efficient retrieval and search applications (a retrieval sketch follows this list)
- 49.1% score on MTEB Retrieval benchmark
- 47.0% performance on CoIR code retrieval tasks
- Approximately twice the inference speed of comparable embedding models
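For retrieval, the sketch below uses plain Hugging Face Transformers and applies the CLS pooling described above by hand. The model ID, query, and documents are illustrative assumptions, not part of the model card.

```python
# Retrieval sketch with Hugging Face Transformers and manual CLS pooling.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "ibm-granite/granite-embedding-30m-english"  # assumed Hub ID
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()

def embed(texts):
    # Tokenize with the model's 512-token limit.
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=512, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)
    # CLS pooling: use the hidden state of the first token.
    emb = out.last_hidden_state[:, 0]
    return torch.nn.functional.normalize(emb, dim=1)

query = embed(["how do I reset my enterprise password"])
docs = embed([
    "Password reset instructions for corporate accounts.",
    "Quarterly earnings report for fiscal year 2024.",
])

# Cosine similarity reduces to a dot product on normalized vectors;
# higher scores indicate more relevant documents.
scores = (query @ docs.T).squeeze(0)
print(scores)
```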
Frequently Asked Questions
Q: What makes this model unique?
The model stands out for an efficient architecture that delivers roughly twice the speed of similar models while maintaining competitive performance. It is trained on enterprise-friendly licensed data, making it suitable for commercial applications; notably, the MS MARCO dataset was excluded due to licensing restrictions.
Q: What are the recommended use cases?
The model excels at text similarity tasks, information retrieval, and search applications. It is particularly well suited to enterprise environments that require fast, accurate embedding generation with modest computational requirements.