EuroLLM-9B-Instruct
| Property | Value |
|---|---|
| Parameter Count | 9.154B |
| License | Apache License 2.0 |
| Sequence Length | 4,096 tokens |
| Languages Supported | 35 languages, including all official EU languages |
| Model URL | https://huggingface.co/utter-project/EuroLLM-9B-Instruct |
What is EuroLLM-9B-Instruct?
EuroLLM-9B-Instruct is a multilingual language model developed by a consortium of European research institutions and companies, designed for strong performance on European languages. Trained on 4 trillion tokens across multiple languages, it covers all official EU languages and additionally supports strategically important languages such as Arabic, Chinese, and Japanese.
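As a quick orientation, here is a minimal sketch of loading the model with Hugging Face transformers and running one instruction through its chat template. Only standard AutoModelForCausalLM/AutoTokenizer calls are used; the exact prompt formatting is whatever chat template ships with the checkpoint, so treat the details below as an assumption rather than a documented recipe.

```python
# Minimal sketch: load EuroLLM-9B-Instruct and answer one instruction.
# Assumes a GPU with enough memory for a 9B model in bfloat16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "utter-project/EuroLLM-9B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# The instruct variant is chat-tuned, so format input via the chat template.
messages = [{"role": "user", "content": "What are the official languages of the EU?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```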
Implementation Details
The model uses a dense Transformer architecture with several modern design choices (a shape-level sketch of the attention layout follows the list):
- Grouped-query attention (GQA) with 8 key-value heads, for faster inference and a smaller key-value cache
- Pre-layer normalization with RMSNorm for training stability
- SwiGLU activation function in the feed-forward blocks for improved downstream performance
- Rotary position embeddings (RoPE), which also make later context-length extension straightforward
- 42 layers, a 4,096-dimensional embedding size, and a 12,288-dimensional FFN hidden size
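With 8 key-value heads, several query heads share each K/V projection, which is what shrinks the KV cache during generation. Below is a minimal, self-contained PyTorch sketch of that grouping at the shape level; the 32 query heads (and therefore the 128-dimensional head size) are an assumption made for illustration, since the card above only states the 8 KV heads and the 4,096 embedding size.

```python
import torch

d_model = 4096          # embedding size from the card
n_q_heads = 32          # assumed for illustration; not stated in the card
n_kv_heads = 8          # key-value heads from the card
head_dim = d_model // n_q_heads   # 128 under the assumption above
group = n_q_heads // n_kv_heads   # 4 query heads share each KV head

x = torch.randn(1, 16, d_model)   # (batch, seq_len, d_model)

w_q = torch.nn.Linear(d_model, n_q_heads * head_dim, bias=False)
w_k = torch.nn.Linear(d_model, n_kv_heads * head_dim, bias=False)
w_v = torch.nn.Linear(d_model, n_kv_heads * head_dim, bias=False)

# Project and reshape to (batch, heads, seq_len, head_dim).
q = w_q(x).view(1, 16, n_q_heads, head_dim).transpose(1, 2)
k = w_k(x).view(1, 16, n_kv_heads, head_dim).transpose(1, 2)
v = w_v(x).view(1, 16, n_kv_heads, head_dim).transpose(1, 2)

# Only the 8 KV heads need caching; expand them to match the query heads.
k = k.repeat_interleave(group, dim=1)
v = v.repeat_interleave(group, dim=1)

out = torch.nn.functional.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 32, 16, 128])
```

The point of the sketch is the asymmetry: K and V are projected with 8 heads (a quarter of the parameters and cache of full multi-head attention under these assumptions) and only expanded at attention time.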
Core Capabilities
- Multilingual understanding and generation across 35 languages
- Instruction-tuned on the EuroBlocks dataset
- Strong performance in both multilingual and English-specific benchmarks
- Competitive results against models like Gemma-2-9B and Mistral-7B
- Specialized in machine translation and general instruction-following tasks (see the translation sketch after this list)
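To exercise the translation capability specifically, a translation request can simply be phrased as an instruction. The sketch below uses the transformers text-generation pipeline; it assumes a recent transformers version whose pipeline accepts chat-style message lists, and the prompt wording is illustrative rather than a documented recipe.

```python
# Hedged sketch: machine translation via a plain instruction.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="utter-project/EuroLLM-9B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{
    "role": "user",
    "content": "Translate into Portuguese: 'The committee will meet next Tuesday.'",
}]

out = pipe(messages, max_new_tokens=64)
# With chat-style input, generated_text is the full conversation;
# the last message is the model's reply.
print(out[0]["generated_text"][-1]["content"])
```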
Frequently Asked Questions
Q: What makes this model unique?
EuroLLM-9B-Instruct stands out for its broad coverage of European languages while remaining competitive with models of similar and larger size. Its instruction tuning on EuroBlocks makes it effective for practical multilingual applications in European contexts.
Q: What are the recommended use cases?
The model excels at multilingual tasks, machine translation, and general instruction following, and is particularly suitable for applications that need robust coverage of European languages. Note that it has not been aligned to human preferences, so it may produce problematic outputs.