BgGPT-Gemma-2-27B-IT-v1.0

Maintained By
INSAIT-Institute

BgGPT-Gemma-2-27B-IT-v1.0

PropertyValue
Parameter Count27.2B
Model TypeCausal decoder-only transformer
LanguagesBulgarian, English
LicenseGemma Terms of Use
DeveloperINSAIT Institute

What is BgGPT-Gemma-2-27B-IT-v1.0?

BgGPT-Gemma-2-27B-IT-v1.0 is a state-of-the-art bilingual language model developed by INSAIT Institute, built upon Google's Gemma 2 27B architecture. The model was pre-trained on approximately 100 billion tokens, with 85 billion in Bulgarian, using an innovative Branch-and-Merge strategy presented at EMNLP'24.

Implementation Details

The model implements a sophisticated training approach combining continuous pre-training and instruction fine-tuning. It utilizes various data sources including Bulgarian web crawls, Wikipedia, specialized Bulgarian datasets, and machine-translated English content.

  • Built on Google's Gemma 2 27B architecture
  • Uses BF16 tensor type for optimal performance
  • Implements Gemma 2 chat template for interactions
  • Supports both eager attention implementation

Core Capabilities

  • Exceptional performance in Bulgarian language tasks, outperforming larger models like Qwen 2.5 72B and Llama3.1 70B
  • Maintains strong English language capabilities inherited from Gemma 2
  • Excels in various benchmarks including Winogrande, Hellaswag, TriviaQA, and GSM-8k
  • Demonstrates competitive chat performance against commercial models like Claude Sonnet and GPT-4

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its advanced bilingual capabilities, achieved through the Branch-and-Merge training strategy and extensive pre-training on Bulgarian content while maintaining English proficiency.

Q: What are the recommended use cases?

The model is particularly well-suited for Bulgarian language tasks, bilingual applications, and general text generation in both Bulgarian and English. It performs exceptionally well in educational contexts, knowledge-based tasks, and conversational applications.

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.