Teuken-7B-instruct-research-v0.4
| Property | Value |
|---|---|
| Parameter Count | 7.45B |
| Model Type | Transformer-based decoder-only |
| Languages | 24 EU languages |
| License | Other (Custom) |
| Paper | Research Paper |
What is Teuken-7B-instruct-research-v0.4?
Teuken-7B-instruct-research-v0.4 is a multilingual language model designed specifically for European languages. Developed by a consortium including Fraunhofer, Forschungszentrum Jülich, TU Dresden, and DFKI, it represents a significant step forward in language AI built for Europe. The model was pre-trained on 4 trillion tokens and instruction-tuned to handle language tasks across all 24 official EU languages.
Implementation Details
The model uses a 32-layer decoder architecture with a hidden size of 4096 and 32 attention heads. It implements grouped-query attention with 2 query groups and the SwiGLU activation function. Training was conducted in bfloat16 precision on the JUWELS Booster supercomputer using NVIDIA A100 GPUs. Further details (summarized in the configuration sketch after the list below):
- Sequence length: 4096 tokens
- Rotary position embeddings
- RMSNorm normalization
- Zero dropout
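As a quick reference, here is a minimal sketch of the hyperparameters above in the style of a Hugging Face configuration. The field names follow common transformers conventions and are illustrative rather than the model's actual config keys; values not stated in this card (such as vocabulary size) are omitted.

```python
# Illustrative summary of the stated Teuken-7B architecture.
# Key names mimic Hugging Face conventions and are not the model's
# real config keys; unstated values (e.g. vocab size) are left out.
teuken_7b_architecture = {
    "num_hidden_layers": 32,          # 32 transformer decoder layers
    "hidden_size": 4096,              # model width
    "num_attention_heads": 32,        # query heads
    "num_key_value_heads": 2,         # grouped-query attention: 2 query groups
    "hidden_act": "swiglu",           # SwiGLU activation in the MLP blocks
    "max_position_embeddings": 4096,  # 4096-token sequence length
    "position_embeddings": "rotary",  # rotary position embeddings (RoPE)
    "normalization": "rmsnorm",       # RMSNorm normalization
    "dropout": 0.0,                   # zero dropout
    "torch_dtype": "bfloat16",        # training precision
}
```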
Core Capabilities
- Multilingual understanding and generation across 24 EU languages
- Strong performance on benchmarks like EU21-ARC, EU21-HeSw, and EU21-TQA
- Specialized for European contexts and values
- Instruction-following capabilities in multiple languages (see the inference sketch below)
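To illustrate instruction-following usage, below is a minimal inference sketch with the Hugging Face transformers library. The repository id, the trust_remote_code flag, and the generation settings are assumptions based on common practice for instruction-tuned checkpoints; consult the official release for the exact loading instructions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id; verify against the official openGPT-X release.
model_name = "openGPT-X/Teuken-7B-instruct-research-v0.4"

# trust_remote_code is assumed because the model ships a custom
# multilingual tokenizer and chat template; review the code before enabling.
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # the precision the model was trained in
    device_map="auto",
    trust_remote_code=True,
)

# Any of the 24 supported EU languages can be used in the prompt.
messages = [{"role": "user", "content": "Summarize the benefits of multilingual language models."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```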
Frequently Asked Questions
Q: What makes this model unique?
The model's primary distinction lies in its comprehensive coverage of all 24 EU languages and its specific optimization for European content. Unlike English-centric models, it provides more balanced performance across European languages and better reflects European cultural values.
Q: What are the recommended use cases?
The model is designed for research applications involving multilingual tasks across the 24 EU languages. It excels at instruction following in those languages but is not recommended for mathematical or coding tasks.