EuroLLM-1.7B-Instruct
Property | Value |
---|---|
Parameter Count | 1.66B parameters |
License | Apache 2.0 |
Languages Supported | 35 languages |
Paper | EuroLLM: Multilingual Language Models for Europe |
Model Type | Instruction-tuned Multilingual LLM |
What is EuroLLM-1.7B-Instruct?
EuroLLM-1.7B-Instruct is a multilingual language model specifically designed for European languages, developed through collaboration between major European institutions. The model represents a significant achievement in creating accessible AI technology that supports 35 languages, including all official EU languages plus several globally important ones like Arabic, Chinese, and Hindi.
Implementation Details
The model utilizes a dense Transformer architecture with several modern optimizations. It features 24 layers, 2,048 embedding size, and employs Grouped Query Attention (GQA) with 8 key-value heads. The architecture includes pre-layer normalization with RMSNorm, SwiGLU activation function, and rotary positional embeddings (RoPE).
- Training performed on 256 Nvidia H100 GPUs
- 4,096 sequence length capability
- BF16 precision for efficient processing
- Trained on 4 trillion tokens across multiple languages
Core Capabilities
- Strong performance in machine translation, outperforming Gemma-2B
- Competitive results on general benchmarks like Arc Challenge and Hellaswag
- Specialized instruction-following abilities
- Comprehensive coverage of European languages plus major global languages
Frequently Asked Questions
Q: What makes this model unique?
EuroLLM-1.7B-Instruct stands out for its exceptional multilingual capabilities while maintaining a relatively small parameter count. It achieves superior performance in machine translation tasks compared to larger models, making it particularly efficient for European language processing.
Q: What are the recommended use cases?
The model excels in machine translation tasks, especially between European languages. It's also suitable for general instruction-following tasks, making it valuable for applications requiring multilingual understanding and generation, particularly in European contexts.