BgGPT-Gemma-2-27B-IT-v1.0
| Property | Value |
|---|---|
| Parameter Count | 27.2B |
| Model Type | Causal decoder-only transformer |
| Languages | Bulgarian, English |
| License | Gemma Terms of Use |
| Developer | INSAIT Institute |
What is BgGPT-Gemma-2-27B-IT-v1.0?
BgGPT-Gemma-2-27B-IT-v1.0 is a state-of-the-art bilingual language model developed by the INSAIT Institute, built on Google's Gemma 2 27B architecture. The model was continuously pre-trained on approximately 100 billion tokens, 85 billion of them in Bulgarian, using the Branch-and-Merge strategy presented at EMNLP 2024.
Implementation Details
The training pipeline combines continuous pre-training with instruction fine-tuning. Data sources include Bulgarian web crawls, Wikipedia, specialized Bulgarian datasets, and machine-translated English content.
- Built on Google's Gemma 2 27B architecture
- Uses BF16 tensor type for optimal performance
- Implements Gemma 2 chat template for interactions
- Uses the eager attention implementation recommended for Gemma 2 models (see the loading sketch below)
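A minimal loading sketch with the Hugging Face transformers library, reflecting the BF16 and eager-attention details above; the repository id is an assumption, so check the INSAIT Institute organization on the Hub for the exact name.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hub repository id (assumed; verify against the official model card).
MODEL_ID = "INSAIT-Institute/BgGPT-Gemma-2-27B-IT-v1.0"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,    # BF16 weights, as listed above
    attn_implementation="eager",   # eager attention, as recommended for Gemma 2
    device_map="auto",
)
```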
Core Capabilities
- Exceptional performance on Bulgarian language tasks, outperforming larger models such as Qwen 2.5 72B and Llama 3.1 70B
- Maintains strong English language capabilities inherited from Gemma 2
- Performs strongly on benchmarks including Winogrande, HellaSwag, TriviaQA, and GSM8K
- Demonstrates competitive chat performance against commercial models like Claude Sonnet and GPT-4
Frequently Asked Questions
Q: What makes this model unique?
The model's distinctive feature is its bilingual capability, achieved through the Branch-and-Merge training strategy and extensive continued pre-training on Bulgarian content while preserving English proficiency.
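As a rough illustration only, the merge step in a branch-and-merge style pipeline can be sketched as parameter averaging of independently trained branches. The function name, checkpoint paths, and the simple weighted average below are assumptions for illustration, not the exact procedure from the EMNLP 2024 paper.

```python
import torch

def merge_branches(state_dict_a, state_dict_b, weight=0.5):
    """Merge two branch checkpoints by element-wise parameter averaging.

    Conceptual sketch only; the merging procedure used for BgGPT may differ.
    """
    merged = {}
    for name, param_a in state_dict_a.items():
        param_b = state_dict_b[name]
        merged[name] = weight * param_a + (1.0 - weight) * param_b
    return merged

# Usage sketch: each branch is trained on a different data slice, then the
# merged parameters become the starting point for the next training round.
# branch_a = torch.load("branch_bg.pt")   # hypothetical checkpoint paths
# branch_b = torch.load("branch_en.pt")
# merged = merge_branches(branch_a, branch_b)
```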
Q: What are the recommended use cases?
The model is particularly well-suited for Bulgarian language tasks, bilingual applications, and general text generation in both Bulgarian and English. It performs exceptionally well in educational contexts, knowledge-based tasks, and conversational applications.
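For conversational use, an end-to-end sketch with the Gemma 2 chat template might look as follows; the repository id, the Bulgarian prompt, and the decoding settings are illustrative assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "INSAIT-Institute/BgGPT-Gemma-2-27B-IT-v1.0"  # assumed Hub id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    attn_implementation="eager",
    device_map="auto",
)

# One user turn asking "What is photosynthesis?" in Bulgarian.
messages = [{"role": "user", "content": "Какво е фотосинтеза?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
# Strip the prompt tokens and decode only the newly generated answer.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```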