# Granite-3.0-8B-Instruct-GGUF
| Property | Value |
|---|---|
| Parameter Count | 8.17B |
| License | Apache 2.0 |
| Architecture | Decoder-only dense transformer with GQA and RoPE |
| Context Length | 4096 tokens |
| Base Model | IBM Granite 3.0 |
## What is granite-3.0-8b-instruct-GGUF?
Granite-3.0-8B-Instruct-GGUF is a GGUF-format quantization of IBM's Granite 3.0 8B Instruct language model, optimized for instruction-following and chat applications. The model has 8.17B parameters and was fine-tuned on a combination of open-source instruction datasets and internally collected synthetic data.
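As a minimal sketch, the snippet below loads a quantized variant with llama-cpp-python and runs a single chat turn. The file name is hypothetical; substitute whichever quantization variant (e.g. Q4_K_M, Q8_0) you actually downloaded.

```python
from llama_cpp import Llama

# Load the GGUF file; the path below is a hypothetical example.
llm = Llama(
    model_path="granite-3.0-8b-instruct-Q4_K_M.gguf",
    n_ctx=4096,  # matches the model's 4096-token context length
)

# Run one instruction-following chat turn.
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the benefits of GGUF quantization."},
    ],
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])
```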
## Implementation Details
The model is built on a decoder-only dense transformer architecture with several modern components: grouped-query attention (GQA), RoPE positional embeddings, and SwiGLU activation functions. The key hyperparameters are listed below; a parameter-count sketch follows the list.
- 4096 embedding dimension
- 40 transformer layers
- 32 attention heads with 8 KV heads
- 12800 MLP hidden size
- SwiGLU activation function
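As a sanity check, the arithmetic below reproduces the 8.17B figure from these hyperparameters. The vocabulary size (49,155), head dimension (128), and tied input/output embeddings are assumptions based on the Granite 3.0 family and are not stated in this card; layer norms are omitted as negligible.

```python
# Back-of-the-envelope parameter count from the hyperparameters above.
d_model, n_layers = 4096, 40
n_heads, n_kv_heads, head_dim = 32, 8, 128  # head_dim of 128 is an assumption
d_mlp = 12800
vocab = 49_155  # assumed Granite 3.0 vocabulary size

attn = (
    d_model * n_heads * head_dim           # Q projection
    + 2 * d_model * n_kv_heads * head_dim  # K and V (GQA: 8 KV heads)
    + n_heads * head_dim * d_model         # output projection
)
mlp = 3 * d_model * d_mlp  # SwiGLU: gate, up, and down projections
embeddings = vocab * d_model  # tied embedding/unembedding assumed

total = n_layers * (attn + mlp) + embeddings
print(f"{total / 1e9:.2f}B parameters")  # ~8.17B, matching the table
```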
## Core Capabilities
- Multilingual support for 12 languages including English, German, Spanish, French, and Japanese
- Strong performance in summarization and text classification
- Advanced code-related tasks and function-calling capabilities
- Retrieval Augmented Generation (RAG) support (see the sketch after this list)
- Question answering with strong benchmark results (88.65% on BoolQ)
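As a sketch of the RAG pattern mentioned above (reusing the `llm` object from the loading example), one common approach is to place retrieved passages in the system prompt. Granite's official chat template may handle documents differently, so treat this as an illustration rather than the canonical interface.

```python
# Hypothetical retrieved passages; in practice these come from a retriever.
retrieved = [
    "Granite 3.0 models are released under the Apache 2.0 license.",
    "GGUF is a single-file format for quantized llama.cpp models.",
]
context = "\n\n".join(retrieved)

# Ground the answer in the retrieved documents via the system prompt.
response = llm.create_chat_completion(
    messages=[
        {"role": "system",
         "content": f"Answer using only the following documents:\n\n{context}"},
        {"role": "user", "content": "What license are Granite 3.0 models under?"},
    ],
    max_tokens=128,
)
print(response["choices"][0]["message"]["content"])
```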
## Frequently Asked Questions
**Q: What makes this model unique?**
The model stands out for balanced performance across benchmarks, with strong scores on MMLU (65.82%), HellaSwag (82.61%), and code-related tasks. It's particularly notable for its multilingual capabilities and efficient quantized format.
**Q: What are the recommended use cases?**
The model excels in business applications, general instruction-following tasks, code generation and explanation, and multilingual dialogue scenarios. It's particularly well-suited for RAG applications and complex reasoning tasks.