Llama-Krikri-8B-Instruct
Property | Value |
---|---|
Base Model | Llama-3.1-8B |
Context Length | 128k tokens |
Developer | ILSP |
Primary Languages | Greek, English |
Model Hub | Hugging Face |
What is Llama-Krikri-8B-Instruct?
Llama-Krikri-8B-Instruct is an advanced bilingual language model specifically designed to excel in both Greek and English language tasks. Built upon Llama-3.1-8B, it underwent extensive training on a diverse corpus of 91 billion tokens, with a strong focus on Greek language content (62.3%). The model represents a significant advancement in Greek language AI capabilities, offering enhanced instruction-following abilities and specialized domain expertise.
Implementation Details
The model was developed through a sophisticated multi-stage process, including extended pretraining and careful fine-tuning. The training data comprised 56.7B Greek tokens, 21B English tokens, 5.5B parallel data tokens, and 7.8B math/code tokens. The instruction tuning process involved two stages of supervised fine-tuning with over 1.4M instruction-response pairs and alignment training using 92,394 preference triplets.
- Extended vocabulary with specialized Greek tokens
- 128k context length supporting approximately 80,000 Greek words
- Multi-stage training process with careful bilingual balancing
- Comprehensive evaluation across multiple benchmarks
Core Capabilities
- Bilingual instruction following and chat capabilities
- Document translation across multiple European languages
- Domain expertise in legal, financial, medical, and scientific fields
- RAG capabilities with 128k context window
- Enhanced coding and structured data handling
- Chain-of-Thought reasoning and analytical thinking
Frequently Asked Questions
Q: What makes this model unique?
The model's distinctive feature is its strong bilingual capabilities in Greek and English, achieved through careful training balance and specialized Greek language optimization. It significantly outperforms other models in Greek language tasks while maintaining competitive performance in English.
Q: What are the recommended use cases?
The model excels in bilingual applications, including document translation, specialized domain tasks (legal, medical, financial), content generation, and advanced reasoning tasks. It's particularly suitable for applications requiring deep understanding of Greek language and culture while maintaining English language capabilities.