opus-mt-tc-big-ar-gmq

Maintained By
Helsinki-NLP

Property           Value
Parameter Count    238M
License            CC-BY-4.0
Architecture       MarianMT Transformer-Big
Languages          Arabic → Danish, Norwegian, Swedish
Best BLEU Score    28.8 (Arabic-Danish)

What is opus-mt-tc-big-ar-gmq?

This is a neural machine translation model developed by the Helsinki-NLP group for translating Arabic text into North Germanic languages. Built on the MarianMT framework, it translates into Danish, Norwegian, and Swedish using a single model.

Implementation Details

The model uses a transformer-big architecture trained on the opusTCv20210807 dataset. Because one model serves several target languages, each input must begin with a language token (e.g., >>swe<<) that selects the desired target language.

  • Implements SentencePiece tokenization with 32k vocabulary
  • Supports FP16 precision for efficient inference
  • Reported BLEU scores: 28.8 (Danish), 20.5 (Norwegian), 27.3 (Swedish)
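
The language-token requirement described above can be illustrated with a small helper. This is a minimal sketch, not part of the model's own tooling: only >>swe<< appears in this card, so the >>dan<< and >>nob<< entries are assumptions that follow the same three-letter pattern, and the names TARGET_TOKENS and tag_source are hypothetical.

```python
# Map of human-readable target names to language tokens.
# Only ">>swe<<" is named in this card; ">>dan<<" and ">>nob<<" are
# assumptions that follow the same three-letter code pattern.
TARGET_TOKENS = {
    "swedish": ">>swe<<",
    "danish": ">>dan<<",      # assumed
    "norwegian": ">>nob<<",   # assumed (Bokmål)
}


def tag_source(text: str, target: str) -> str:
    """Prepend the target-language token to one Arabic source sentence."""
    try:
        token = TARGET_TOKENS[target.lower()]
    except KeyError:
        raise ValueError(f"unsupported target language: {target!r}") from None
    # The token goes at the very start of the input, separated by a space.
    return f"{token} {text.strip()}"


tagged = tag_source("  مرحبا بالعالم ", "Swedish")
print(tagged)
```

Keeping the mapping in one place makes it easy to correct a token if the model's documentation lists different codes.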

Core Capabilities

  • Multi-target language translation from Arabic
  • Handles complex Arabic linguistic structures
  • Efficient batch processing for multiple translations
  • Integration with HuggingFace Transformers library
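
Batch translation through the HuggingFace Transformers library might look roughly like the sketch below. It assumes the model is published on the Hub as Helsinki-NLP/opus-mt-tc-big-ar-gmq (inferred from the model name and maintainer, not stated in this card) and that the transformers and torch packages are installed. The helpers chunked and translate, and the default batch size, are hypothetical choices for illustration.

```python
from typing import Iterator, List

# Assumed Hub identifier, built from the maintainer and model name above.
MODEL_ID = "Helsinki-NLP/opus-mt-tc-big-ar-gmq"


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be >= 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def translate(texts: List[str], target_token: str = ">>swe<<",
              batch_size: int = 8) -> List[str]:
    """Translate Arabic sentences in batches (downloads the model on first use)."""
    # Imported lazily so chunked() works without transformers installed.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(MODEL_ID)
    model = MarianMTModel.from_pretrained(MODEL_ID)

    # Every input sentence needs the target-language token at its start.
    tagged = [f"{target_token} {text}" for text in texts]

    results: List[str] = []
    for batch in chunked(tagged, batch_size):
        # Pad to the longest sentence in the batch so it forms one tensor.
        inputs = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)
        outputs = model.generate(**inputs)
        results.extend(tokenizer.batch_decode(outputs, skip_special_tokens=True))
    return results
```

Calling translate(["مرحبا بالعالم"], target_token=">>swe<<") would return a list with one Swedish string. Because the token is attached per sentence, a single batch could in principle mix target languages by tagging each sentence differently.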

Frequently Asked Questions

Q: What makes this model unique?

A single model covers multiple North Germanic target languages, selected per input by the language token, with reported BLEU scores between 20.5 and 28.8. This avoids deploying a separate model for each Nordic target language.

Q: What are the recommended use cases?

This model suits applications that need Arabic-to-Nordic translation, such as document translation services, multilingual content management systems, and cross-cultural communication tools.
