Large language models (LLMs) have revolutionized how we interact with technology, but their prowess has been largely confined to English. What about the millions of people who speak other languages? Researchers at INSAIT are tackling this challenge head-on with BgGPT, a project dedicated to extending the capabilities of powerful LLMs to Bulgarian. They're not just translating English LLMs; they're building a model that truly *understands* and *generates* high-quality Bulgarian text while retaining, and even improving, the original English capabilities.

This is a significant hurdle: adapting an existing LLM to a new language often degrades performance in the original language, a phenomenon known as catastrophic forgetting. To combat this, the team uses innovative techniques such as Branch-and-Merge, a continual learning strategy that minimizes performance loss while maximizing gains in the new language. They've also curated a massive dataset of over 100 billion tokens of Bulgarian and English text.

BgGPT isn't just a research project; it's a real-world application. The models power a Bulgarian chat service, making powerful AI accessible to users without specialized hardware. The team has also focused on educational applications, benchmarking BgGPT against state-of-the-art models on exam questions provided by the Bulgarian Ministry of Education. The results are impressive: BgGPT outperforms larger multilingual models like Qwen-2.5 and Llama-3.1 on several key benchmarks, demonstrating its specialized proficiency in Bulgarian.

While the primary focus is Bulgarian, the researchers believe their methods can be adapted to other lower-resource languages, opening the door to wider access to powerful AI tools. This is about more than translation; it's about bridging the digital divide and bringing the transformative power of AI to everyone, regardless of the language they speak.
🍰 Interested in building your own agents?
PromptLayer provides the tools to manage and monitor prompts with your whole team. Get started for free.
Questions & Answers
What is the Branch-and-Merge technique used in BgGPT, and how does it address catastrophic forgetting?
Branch-and-Merge is a continual learning strategy that allows LLMs to learn new languages while preserving existing capabilities. The technique works by creating separate learning pathways ('branches') for new language acquisition while maintaining the original language knowledge base, then carefully merging these pathways to create a unified model. This process involves: 1) Creating a specialized branch for Bulgarian language learning, 2) Training this branch independently to prevent interference with English capabilities, and 3) Strategically merging the branches to maintain performance in both languages. In practice, this allows BgGPT to outperform larger multilingual models while maintaining strong performance in both Bulgarian and English.
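To make the merge step concrete, here is a minimal sketch of weight-space merging between a base checkpoint and a language-adapted branch, assuming two Hugging Face checkpoints with the same architecture. The model names and the 0.5 interpolation weight are illustrative placeholders, not the exact recipe from the BgGPT work:

```python
import torch
from transformers import AutoModelForCausalLM

# Hypothetical checkpoint names; substitute your own base and branch models.
base = AutoModelForCausalLM.from_pretrained("english-base-model")
branch = AutoModelForCausalLM.from_pretrained("bulgarian-branch-model")

alpha = 0.5  # mixing weight between the two branches (illustrative)
merged_state = {}
with torch.no_grad():
    branch_state = branch.state_dict()
    for name, base_param in base.state_dict().items():
        # Linear interpolation in weight space: the merged model keeps a
        # blend of the original (English) and branch (Bulgarian) parameters.
        merged_state[name] = (1 - alpha) * base_param + alpha * branch_state[name]

base.load_state_dict(merged_state)
base.save_pretrained("merged-model")
```

The full procedure described above trains the branch separately before merging, and may branch and merge repeatedly over the course of training; this snippet only shows the parameter-averaging core of a merge step.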
How are language models making AI more accessible to non-English speakers?
Language models are democratizing AI access by breaking down language barriers through specialized training and localization. They enable non-English speakers to interact with AI in their native language, access information, and utilize AI tools without requiring English proficiency. Key benefits include improved educational opportunities, better access to digital services, and more inclusive technological advancement. For example, models like BgGPT allow Bulgarian speakers to use AI chatbots, educational tools, and other applications in their native language, making advanced technology accessible to millions more users.
What are the main challenges in developing AI models for less common languages?
Developing AI models for less common languages faces several key challenges: limited available training data, resource constraints, and the risk of performance degradation in other languages. The benefits of addressing these challenges include broader global AI accessibility, preserved cultural diversity in technology, and improved local economic opportunities. Real-world applications include educational tools, customer service, and content creation in local languages. Success stories like BgGPT demonstrate that these challenges can be overcome through innovative techniques and careful dataset curation.
PromptLayer Features
Testing & Evaluation
BgGPT's evaluation against educational benchmarks and its comparison with other models align with systematic testing needs
Implementation Details
Set up automated testing pipelines using Bulgarian Ministry of Education exam questions, implement A/B testing between model versions, and track performance metrics across both languages; a minimal harness is sketched below
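As a starting point, here is an illustrative harness for such a pipeline. The dataset file, model version tags, and the `generate()` helper are placeholders for whatever inference stack you use, and scoring by substring match is a deliberate simplification:

```python
import json

def generate(model_name: str, prompt: str) -> str:
    """Placeholder: swap in your real inference call (API or local model)."""
    return ""  # dummy output so the harness runs end to end

def accuracy(questions: list[dict], model_name: str) -> float:
    correct = 0
    for q in questions:
        answer = generate(model_name, q["question"])
        # Naive check: does the expected answer appear in the model output?
        if q["expected"].strip().lower() in answer.strip().lower():
            correct += 1
    return correct / len(questions)

# Hypothetical JSON file of exam questions:
# [{"question": "...", "expected": "..."}, ...]
with open("exam_questions.json", encoding="utf-8") as f:
    questions = json.load(f)

# A/B comparison between two model versions (tags are illustrative).
for model in ("bggpt-model-a", "bggpt-model-b"):
    print(f"{model}: accuracy={accuracy(questions, model):.1%}")
```

Running the same fixed question set against every model iteration is what yields the regression signal listed under Key Benefits below.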
Key Benefits
• Standardized evaluation across model iterations
• Automated regression testing for both languages
• Quantifiable performance comparisons