# EuroLLM-9B
| Property | Value |
|---|---|
| Parameter Count | 9.154B |
| License | Apache License 2.0 |
| Languages | 34 languages, including all EU languages |
| Model URL | HuggingFace |
## What is EuroLLM-9B?
EuroLLM-9B is a state-of-the-art multilingual language model developed through collaboration between leading European institutions. Trained on 4 trillion tokens across 34 languages, it represents a significant advancement in multilingual AI capabilities, particularly focusing on European languages.
## Implementation Details
The model employs a dense Transformer architecture with several modern optimizations, including Grouped Query Attention (GQA) with 8 key-value heads, pre-layer normalization with RMSNorm, and the SwiGLU activation function. It features a 4,096-token context length and was trained on 400 Nvidia H100 GPUs.
- 42 layers with 4,096 embedding size
- 32 attention heads with 8 KV heads for GQA
- 12,288 FFN hidden size
- RoPE positional encodings
- Trained with BF16 precision
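As a sanity check, the listed parameter count can be roughly reproduced from the dimensions above. The vocabulary size (128,000) and untied input/output embeddings are assumptions not stated in this card:

```python
# Back-of-the-envelope parameter count for EuroLLM-9B from the
# architecture figures listed above. Vocabulary size and untied
# embeddings are assumptions, not stated in this card.
d_model = 4096
n_layers = 42
n_heads = 32
n_kv_heads = 8           # GQA: 8 key-value heads
d_ffn = 12288
vocab = 128_000          # assumed vocabulary size

head_dim = d_model // n_heads  # 128

# Attention: full-width Q and O projections, narrower K/V under GQA.
attn = (d_model * d_model                      # Q projection
        + 2 * d_model * n_kv_heads * head_dim  # K and V projections
        + d_model * d_model)                   # O projection

# SwiGLU FFN uses three weight matrices (gate, up, down).
ffn = 3 * d_model * d_ffn

# Two RMSNorm weight vectors per layer.
per_layer = attn + ffn + 2 * d_model

# Untied input embedding and output head, plus a final norm.
total = n_layers * per_layer + 2 * vocab * d_model + d_model
print(f"{total / 1e9:.3f}B parameters")  # ≈ 9.152B, close to the listed 9.154B
```

Under these assumptions the estimate lands within about 0.02% of the stated 9.154B, which suggests the embeddings are indeed untied.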
## Core Capabilities
- Multilingual understanding and generation across 34 languages
- Strong performance in both multilingual and English-specific benchmarks
- Performance comparable to Gemma-2-9B and Mistral-7B
- Specialized in EU language processing
## Frequently Asked Questions
**Q: What makes this model unique?**
EuroLLM-9B stands out for its comprehensive coverage of EU languages and competitive performance against larger models, achieving superior results in multilingual tasks while maintaining strong English language capabilities.
**Q: What are the recommended use cases?**
The model is well suited for multilingual text generation, understanding, and processing tasks, particularly in European-language contexts. Note, however, that it has not been aligned to human preferences, so its raw output should pass through appropriate content filtering before reaching end users.
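Because the model is not preference-aligned, a thin filtering wrapper can gate its raw output before display. The sketch below is purely illustrative: `generate_text` is a placeholder for a real model call, and the blocklist stands in for a proper moderation classifier:

```python
# Illustrative content gate for an unaligned model's output.
# `generate_text` and BLOCKLIST are placeholders, not part of EuroLLM.

BLOCKLIST = {"badword1", "badword2"}  # placeholder terms

def generate_text(prompt: str) -> str:
    # Placeholder: a real deployment would invoke the model here.
    return f"Echo: {prompt}"

def safe_generate(prompt: str) -> str:
    output = generate_text(prompt)
    # Normalize tokens (strip common punctuation, lowercase) before matching.
    tokens = {t.strip(".,!?").lower() for t in output.split()}
    if tokens & BLOCKLIST:
        return "[output withheld by content filter]"
    return output

print(safe_generate("Hello Europe"))  # Echo: Hello Europe
```

In production this filter would typically be replaced by a dedicated moderation model or API applied to both prompts and completions.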