opus-mt-tc-big-el-en
| Property | Value |
| --- | --- |
| Architecture | Transformer-big |
| Source Language | Modern Greek (1453-) |
| Target Language | English |
| Release Date | 2022-02-25 |
| Tokenization | SentencePiece (32k) |
| BLEU Score | 68.8 (Tatoeba test) |
| Paper | OPUS-MT Paper |
What is opus-mt-tc-big-el-en?
opus-mt-tc-big-el-en is a neural machine translation model for translating Modern Greek text to English. Developed by Helsinki-NLP as part of the OPUS-MT project, the model uses the transformer-big architecture and was trained on OPUS data augmented with back-translated text (the opusTCv20210807+bt dataset).
Implementation Details
The model is implemented using the Marian NMT framework and has been converted to PyTorch using the Hugging Face transformers library. It utilizes SentencePiece tokenization with a vocabulary size of 32k for both source and target languages.
- Built on transformer-big architecture for enhanced performance
- Uses SentencePiece tokenization (32k vocabulary)
- Trained on OPUS corpus with back-translation augmentation
- Achieves 68.8 BLEU score on Tatoeba test set
- Shows 33.9 BLEU score on flores101-devtest
Core Capabilities
- High-quality Greek to English translation
- Supports batch translation
- Compatible with Hugging Face transformers pipeline
- Optimized for production deployment
- Excellent performance on general-domain text
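The pipeline compatibility and batch support mentioned above can be sketched as follows. This is a minimal example assuming the same hub id as before; passing a list of sentences in one call lets the pipeline batch them.

```python
from transformers import pipeline

# Translation pipeline with an explicit model id (assumed hub name).
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-tc-big-el-en")

# Batch translation: pass a list of Greek source sentences in one call.
greek_sentences = [
    "Καλημέρα, τι κάνεις;",          # "Good morning, how are you?"
    "Το βιβλίο είναι πάνω στο τραπέζι.",  # "The book is on the table."
]
results = translator(greek_sentences, max_length=128)
for src, res in zip(greek_sentences, results):
    print(src, "->", res["translation_text"])
```

Each result is a dict with a `translation_text` key, so downstream code can collect the English outputs directly.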
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its strong Greek-to-English translation quality, reaching 68.8 BLEU on the Tatoeba test set. It is part of the larger OPUS-MT initiative to make machine translation freely available across many language pairs.
Q: What are the recommended use cases?
The model is ideal for translating Modern Greek text to English in various applications, including content localization, document translation, and automated translation services. It's particularly effective for general-domain text as demonstrated by its strong performance on standardized test sets.