opus-mt-tc-big-en-pt

Maintained By
Helsinki-NLP

Property          Value
Parameter Count   233M
License           CC-BY-4.0
Architecture      Transformer-big (Marian)
BLEU Score        50.4 (flores101-devtest)
Release Date      2022-03-13

What is opus-mt-tc-big-en-pt?

opus-mt-tc-big-en-pt is a neural machine translation model for translating from English to Portuguese. Part of the OPUS-MT project, it was trained with the Marian NMT framework and converted to PyTorch using the Hugging Face Transformers library. The model supports both European Portuguese and Brazilian Portuguese; the variant is selected with a target-language token.

Implementation Details

The model uses a transformer-big architecture with 233M parameters and was trained on the opusTCv20210807+bt dataset. It uses SentencePiece tokenization with a 32k vocabulary and requires a target-language token (>>por<< or >>pob<<) at the start of each input sentence to select the Portuguese variant of the output.

  • FP16 tensor format for efficient inference
  • Target-variant selection through language tokens
  • Built on Marian NMT framework
  • Integrated with Hugging Face Transformers
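
The language-token requirement described above can be sketched as follows. This is a minimal illustration, not the maintainers' reference code: the hub ID in `MODEL_NAME` and the helper names `add_target_token` and `translate` are assumptions introduced here, and the sketch does not assert which token maps to which Portuguese variant.

```python
from typing import List

# Assumed Hugging Face Hub ID (maintainer name + model name from this page).
MODEL_NAME = "Helsinki-NLP/opus-mt-tc-big-en-pt"
# The two target-language tokens named in the text above.
VALID_TOKENS = (">>por<<", ">>pob<<")


def add_target_token(sentences: List[str], token: str) -> List[str]:
    """Prefix each English source sentence with the target-language token."""
    if token not in VALID_TOKENS:
        raise ValueError(f"unknown target token: {token!r}")
    return [f"{token} {s.strip()}" for s in sentences]


def translate(sentences: List[str], token: str) -> List[str]:
    """Translate English sentences; downloads the model on first call."""
    # Imported lazily so add_target_token works without transformers installed.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(MODEL_NAME)
    model = MarianMTModel.from_pretrained(MODEL_NAME)
    # Pad to the longest sentence so the batch forms a single tensor.
    batch = tokenizer(
        add_target_token(sentences, token), return_tensors="pt", padding=True
    )
    generated = model.generate(**batch)
    return [tokenizer.decode(ids, skip_special_tokens=True) for ids in generated]
```

Calling `translate(["How are you today?"], ">>pob<<")` would return a one-element list with the Portuguese output; the same sentence with `>>por<<` may differ in spelling and word choice.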

Core Capabilities

  • High-quality English to Portuguese translation (50.4 BLEU on flores101-devtest)
  • Support for both European and Brazilian Portuguese
  • Efficient inference with FP16 weights
  • Easy integration through Hugging Face pipeline API
  • Robust performance on various test sets (49.6 BLEU on Tatoeba)
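
The pipeline integration mentioned in the list above might look like the sketch below. It assumes the Transformers `pipeline("translation", ...)` interface and the hub ID `Helsinki-NLP/opus-mt-tc-big-en-pt`; the helper names `build_translator` and `translate_with` are hypothetical and exist only for this example.

```python
from typing import Callable

# Assumed Hugging Face Hub ID (maintainer name + model name from this page).
MODEL_NAME = "Helsinki-NLP/opus-mt-tc-big-en-pt"


def build_translator() -> Callable:
    """Create a translation pipeline; downloads the model on first call."""
    # Imported lazily so translate_with can be used with any compatible callable.
    from transformers import pipeline

    return pipeline("translation", model=MODEL_NAME)


def translate_with(translator: Callable, text: str, token: str = ">>por<<") -> str:
    """Prefix the target-language token, run the translator, return the text."""
    outputs = translator(f"{token} {text}")
    # A translation pipeline returns a list of dicts keyed by "translation_text".
    return outputs[0]["translation_text"]
```

A typical call would be `translate_with(build_translator(), "The weather is nice.", ">>pob<<")`. Building the translator once and reusing it avoids reloading the 233M-parameter model for every sentence.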

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its translation quality (50.4 BLEU on flores101-devtest), its support for both European and Brazilian Portuguese, and its integration with Hugging Face Transformers. It is part of OPUS-MT, a larger initiative to make quality translation accessible for many language pairs.

Q: What are the recommended use cases?

The model is suited to English-to-Portuguese translation tasks such as content localization and other applications that need high-quality output. It is particularly useful when the output must be specifically European or Brazilian Portuguese.
