GemmaX2-28-9B-v0.1
| Property | Value |
|---|---|
| Developer | Xiaomi |
| Model Size | 9B parameters |
| Base Model | Gemma2-9B |
| Training Data | 56B tokens (monolingual & parallel) |
| Supported Languages | 28 languages |
| Paper | arXiv:2502.02481 |
What is GemmaX2-28-9B-v0.1?
GemmaX2-28-9B-v0.1 is an advanced multilingual translation model developed through a two-stage process: continual pretraining of Gemma2-9B followed by supervised fine-tuning. The model leverages 56 billion tokens of both monolingual and parallel data across 28 different languages, making it a powerful tool for machine translation tasks.
Implementation Details
The model is implemented with the Hugging Face Transformers library and can be integrated into existing workflows with minimal effort. It builds on the Gemma2-9B architecture and has been optimized for translation through careful supervised fine-tuning on high-quality translation instruction data; a minimal loading and prompting sketch follows the feature list below.
- Built on Gemma2-9B architecture
- Employs a continual pretraining methodology
- Optimized for translation between 28 languages
- Inherits Gemma2's large multilingual tokenizer for efficient handling of diverse scripts
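As a sketch of that integration, the model can be loaded and prompted like any other causal language model in Transformers. The snippet below assumes the Hugging Face repository id `ModelSpace/GemmaX2-28-9B-v0.1` and a simple "Translate this from X to Y" instruction prompt; adjust the repo id and template if your checkpoint differs.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id; replace with the checkpoint you actually use.
model_id = "ModelSpace/GemmaX2-28-9B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Translation is expressed as plain text generation with an instruction-style prompt.
prompt = (
    "Translate this from Chinese to English:\n"
    "Chinese: 我爱机器翻译\n"
    "English:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)

# Strip the prompt tokens so only the generated translation is printed.
translation = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)
print(translation)
```

Greedy decoding is used here by default; beam search or sampling parameters can be passed to `generate` as needed.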
Core Capabilities
- High-quality translation across 28 languages including Arabic, Chinese, English, French, German, and more
- Support for both common and less-resourced languages like Khmer, Lao, and Burmese
- Translation expressed as plain prompted text generation, with no task-specific heads required
- Drop-in integration with the Hugging Face Transformers ecosystem (a multi-pair prompting sketch follows this list)
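Because a single checkpoint covers every supported language pair, it can be convenient to wrap prompt construction in a small helper and batch requests through a Transformers pipeline. The `build_prompt` helper and the example sentences below are illustrative assumptions, not part of the released API.

```python
from transformers import pipeline

# Assumed repository id; the pipeline wraps tokenizer and model loading in one call.
translator = pipeline("text-generation", model="ModelSpace/GemmaX2-28-9B-v0.1")

def build_prompt(src_lang: str, tgt_lang: str, text: str) -> str:
    """Illustrative helper: formats the instruction-style translation prompt."""
    return f"Translate this from {src_lang} to {tgt_lang}:\n{src_lang}: {text}\n{tgt_lang}:"

# A couple of example directions, including a less-resourced target language.
requests = [
    ("German", "English", "Maschinelle Übersetzung ist faszinierend."),
    ("English", "Khmer", "Good morning."),
]

prompts = [build_prompt(src, tgt, text) for src, tgt, text in requests]
results = translator(prompts, max_new_tokens=64, return_full_text=False)

for (src, tgt, _), result in zip(requests, results):
    print(f"{src} -> {tgt}: {result[0]['generated_text'].strip()}")
```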
Frequently Asked Questions
Q: What makes this model unique?
GemmaX2-28-9B-v0.1 stands out for its extensive language coverage and specialized training approach, combining both monolingual and parallel data in the pretraining phase. The two-stage training process ensures high-quality translations while maintaining computational efficiency.
Q: What are the recommended use cases?
The model is designed specifically for machine translation across its 28 supported languages. It is particularly useful for organizations that need reliable translation between multiple Asian, European, and Middle Eastern languages, though it does not support languages outside this set.