Llama-Krikri-8B-Instruct

Maintained By
ilsp

Llama-Krikri-8B-Instruct

PropertyValue
Base ModelLlama-3.1-8B
Context Length128k tokens
DeveloperILSP
Primary LanguagesGreek, English
Model HubHugging Face

What is Llama-Krikri-8B-Instruct?

Llama-Krikri-8B-Instruct is an advanced bilingual language model specifically designed to excel in both Greek and English language tasks. Built upon Llama-3.1-8B, it underwent extensive training on a diverse corpus of 91 billion tokens, with a strong focus on Greek language content (62.3%). The model represents a significant advancement in Greek language AI capabilities, offering enhanced instruction-following abilities and specialized domain expertise.

Implementation Details

The model was developed through a sophisticated multi-stage process, including extended pretraining and careful fine-tuning. The training data comprised 56.7B Greek tokens, 21B English tokens, 5.5B parallel data tokens, and 7.8B math/code tokens. The instruction tuning process involved two stages of supervised fine-tuning with over 1.4M instruction-response pairs and alignment training using 92,394 preference triplets.

  • Extended vocabulary with specialized Greek tokens
  • 128k context length supporting approximately 80,000 Greek words
  • Multi-stage training process with careful bilingual balancing
  • Comprehensive evaluation across multiple benchmarks

Core Capabilities

  • Bilingual instruction following and chat capabilities
  • Document translation across multiple European languages
  • Domain expertise in legal, financial, medical, and scientific fields
  • RAG capabilities with 128k context window
  • Enhanced coding and structured data handling
  • Chain-of-Thought reasoning and analytical thinking

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its strong bilingual capabilities in Greek and English, achieved through careful training balance and specialized Greek language optimization. It significantly outperforms other models in Greek language tasks while maintaining competitive performance in English.

Q: What are the recommended use cases?

The model excels in bilingual applications, including document translation, specialized domain tasks (legal, medical, financial), content generation, and advanced reasoning tasks. It's particularly suitable for applications requiring deep understanding of Greek language and culture while maintaining English language capabilities.

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.