opus-mt-tc-big-ar-gmq
| Property | Value |
|---|---|
| Parameter Count | 238M |
| License | CC-BY-4.0 |
| Architecture | MarianMT Transformer-Big |
| Languages | Arabic → Danish, Norwegian, Swedish |
| Best BLEU Score | 28.8 (Arabic-Danish) |
What is opus-mt-tc-big-ar-gmq?
This is a neural machine translation model developed by the Helsinki-NLP group for translating Arabic text into North Germanic languages. Built on the MarianMT framework, it covers three target languages in a single model: Danish, Norwegian, and Swedish.
Implementation Details
The model uses a transformer-big architecture trained on the opusTCv20210807 dataset. Because one model serves several target languages, each input text must begin with a target-language token (e.g., >>swe<< for Swedish) that selects the output language.
- Uses SentencePiece tokenization with a 32k vocabulary
- Supports FP16 precision for efficient inference
- Reported BLEU scores: 28.8 (Danish), 20.5 (Norwegian), 27.3 (Swedish)
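The target-language token requirement can be illustrated with a minimal sketch. It assumes the model is published on the Hugging Face Hub as Helsinki-NLP/opus-mt-tc-big-ar-gmq and that the Danish and Norwegian tokens follow the same pattern as >>swe<< (the codes dan and nor are assumptions; check the tokenizer for the tokens it actually supports). The helper names add_target_token and translate are hypothetical.

```python
from typing import List

# Assumed Hub identifier, derived from the model name and the Helsinki-NLP group.
MODEL_ID = "Helsinki-NLP/opus-mt-tc-big-ar-gmq"

# ">>swe<<" appears in the model description; "dan" and "nor" are assumed
# by analogy and may differ from the tokens the tokenizer really accepts.
TARGET_CODES = {"dan", "nor", "swe"}


def add_target_token(text: str, lang: str) -> str:
    """Prefix the source text with a target-language token such as >>swe<<."""
    if lang not in TARGET_CODES:
        raise ValueError(f"unsupported target language code: {lang!r}")
    return f">>{lang}<< {text.strip()}"


def translate(texts: List[str], lang: str) -> List[str]:
    """Translate Arabic texts into one target language (downloads the model)."""
    # Imported lazily so add_target_token works without transformers installed.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(MODEL_ID)
    model = MarianMTModel.from_pretrained(MODEL_ID)
    prefixed = [add_target_token(t, lang) for t in texts]
    batch = tokenizer(prefixed, return_tensors="pt", padding=True)
    generated = model.generate(**batch)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling translate(["مرحبا بالعالم"], "swe") would request a Swedish translation. Omitting the prefix leaves the target language unspecified, so the output language is not guaranteed.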
Core Capabilities
- Multi-target language translation from Arabic
- Handles complex Arabic linguistic structures
- Efficient batch processing for multiple translations
- Integration with HuggingFace Transformers library
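Batch processing can be sketched independently of the model itself. The example below is an illustration, not the library's own batching mechanism: chunks, translate_in_batches, and the translate_batch callable are hypothetical names, and the default batch size of 16 is an arbitrary assumption to tune for the available memory.

```python
from typing import Callable, Iterator, List, Sequence


def chunks(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def translate_in_batches(
    texts: Sequence[str],
    translate_batch: Callable[[List[str]], List[str]],
    batch_size: int = 16,  # arbitrary default, not a documented recommendation
) -> List[str]:
    """Apply a batch translation function chunk by chunk, preserving order."""
    results: List[str] = []
    for batch in chunks(texts, batch_size):
        translated = translate_batch(batch)
        if len(translated) != len(batch):
            raise ValueError("translate_batch must return one output per input")
        results.extend(translated)
    return results
```

In practice translate_batch would wrap a tokenizer call and a generate call from the Transformers library. Keeping it as a parameter lets the batching logic be tested without loading the model.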
Frequently Asked Questions
Q: What makes this model unique?
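A single model covers three North Germanic target languages, with the output language selected by a token at the start of the input. One deployment can therefore serve Arabic-to-Danish, Arabic-to-Norwegian, and Arabic-to-Swedish translation, with reported BLEU scores ranging from 20.5 (Norwegian) to 28.8 (Danish).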
Q: What are the recommended use cases?
This model suits applications that translate Arabic into Danish, Norwegian, or Swedish, such as document translation services, multilingual content management systems, and cross-cultural communication tools.