GemmaX2-28-9B-v0.1
| Property | Value |
|---|---|
| Developer | Xiaomi |
| Model Size | 9B parameters |
| Base Model | Gemma2-9B |
| Training Data | 56B tokens (monolingual & parallel) |
| Supported Languages | 28 languages |
| Paper | arXiv:2502.02481 |
What is GemmaX2-28-9B-v0.1?
GemmaX2-28-9B-v0.1 is an advanced multilingual translation model developed through a two-stage process: continual pretraining of Gemma2-9B followed by supervised fine-tuning. The model leverages 56 billion tokens of both monolingual and parallel data across 28 different languages, making it a powerful tool for machine translation tasks.
Implementation Details
The model is implemented with the Hugging Face Transformers library and can be integrated into existing workflows with minimal effort. It builds on the Gemma2-9B architecture and has been optimized for translation through careful supervised fine-tuning on high-quality translation instruction data; a minimal loading and prompting sketch follows the feature list below.
- Built on Gemma2-9B architecture
- Employs a continual pretraining methodology
- Optimized for translation between 28 languages
- Inherits Gemma2's large multilingual tokenizer for efficient handling of diverse scripts
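As a sketch of that integration, the model can be loaded and prompted like any other causal language model in Transformers. The snippet below assumes the Hugging Face repository id `ModelSpace/GemmaX2-28-9B-v0.1` and a simple "Translate this from X to Y" instruction prompt; adjust the repo id and template if your checkpoint differs.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id; replace with the checkpoint you actually use.
model_id = "ModelSpace/GemmaX2-28-9B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Translation is expressed as plain text generation with an instruction-style prompt.
prompt = (
    "Translate this from Chinese to English:\n"
    "Chinese: 我爱机器翻译\n"
    "English:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)

# Strip the prompt tokens so only the generated translation is printed.
translation = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)
print(translation)
```

Greedy decoding is used here by default; beam search or sampling parameters can be passed to `generate` as needed.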
Core Capabilities
- High-quality translation across 28 languages including Arabic, Chinese, English, French, German, and more
- Support for both common and less-resourced languages like Khmer, Lao, and Burmese
- Translation expressed as plain prompted text generation, with no task-specific heads required
- Drop-in integration with the Hugging Face Transformers ecosystem (a multi-pair prompting sketch follows this list)
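Because a single checkpoint covers every supported language pair, it can be convenient to wrap prompt construction in a small helper and batch requests through a Transformers pipeline. The `build_prompt` helper and the example sentences below are illustrative assumptions, not part of the released API.

```python
from transformers import pipeline

# Assumed repository id; the pipeline wraps tokenizer and model loading in one call.
translator = pipeline("text-generation", model="ModelSpace/GemmaX2-28-9B-v0.1")

def build_prompt(src_lang: str, tgt_lang: str, text: str) -> str:
    """Illustrative helper: formats the instruction-style translation prompt."""
    return f"Translate this from {src_lang} to {tgt_lang}:\n{src_lang}: {text}\n{tgt_lang}:"

# A couple of example directions, including a less-resourced target language.
requests = [
    ("German", "English", "Maschinelle Übersetzung ist faszinierend."),
    ("English", "Khmer", "Good morning."),
]

prompts = [build_prompt(src, tgt, text) for src, tgt, text in requests]
results = translator(prompts, max_new_tokens=64, return_full_text=False)

for (src, tgt, _), result in zip(requests, results):
    print(f"{src} -> {tgt}: {result[0]['generated_text'].strip()}")
```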
Frequently Asked Questions
Q: What makes this model unique?
GemmaX2-28-9B-v0.1 stands out for its extensive language coverage and specialized training approach, combining both monolingual and parallel data in the pretraining phase. The two-stage training process ensures high-quality translations while maintaining computational efficiency.
Q: What are the recommended use cases?
The model is designed specifically for machine translation across its 28 supported languages. It is particularly useful for organizations that need reliable translation between multiple Asian, European, and Middle Eastern languages, though it does not support languages outside this set.