Llama-Krikri-8B-Instruct

ilsp

Greek-English instruction-tuned 8B parameter LLM built on Llama-3.1, optimized for 128k context, strong bilingual capabilities and domain expertise

Property	Value
Base Model	Llama-3.1-8B
Context Length	128k tokens
Developer	ILSP
Primary Languages	Greek, English
Model Hub	Hugging Face

What is Llama-Krikri-8B-Instruct?

Llama-Krikri-8B-Instruct is an advanced bilingual language model specifically designed to excel in both Greek and English language tasks. Built upon Llama-3.1-8B, it underwent extensive training on a diverse corpus of 91 billion tokens, with a strong focus on Greek language content (62.3%). The model represents a significant advancement in Greek language AI capabilities, offering enhanced instruction-following abilities and specialized domain expertise.

Implementation Details

The model was developed through a sophisticated multi-stage process, including extended pretraining and careful fine-tuning. The training data comprised 56.7B Greek tokens, 21B English tokens, 5.5B parallel data tokens, and 7.8B math/code tokens. The instruction tuning process involved two stages of supervised fine-tuning with over 1.4M instruction-response pairs and alignment training using 92,394 preference triplets.

Extended vocabulary with specialized Greek tokens
128k context length supporting approximately 80,000 Greek words
Multi-stage training process with careful bilingual balancing
Comprehensive evaluation across multiple benchmarks

Core Capabilities

Bilingual instruction following and chat capabilities
Document translation across multiple European languages
Domain expertise in legal, financial, medical, and scientific fields
RAG capabilities with 128k context window
Enhanced coding and structured data handling
Chain-of-Thought reasoning and analytical thinking

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its strong bilingual capabilities in Greek and English, achieved through careful training balance and specialized Greek language optimization. It significantly outperforms other models in Greek language tasks while maintaining competitive performance in English.

Q: What are the recommended use cases?

The model excels in bilingual applications, including document translation, specialized domain tasks (legal, medical, financial), content generation, and advanced reasoning tasks. It's particularly suitable for applications requiring deep understanding of Greek language and culture while maintaining English language capabilities.