opus-tatoeba-en-tr

Helsinki-NLP

English-to-Turkish translation model trained on OPUS data, achieving 41.5 BLEU on the Tatoeba test set. Uses the transformer-align architecture with SentencePiece tokenization.

Property          Value
Model Type        Transformer-align
Languages         English → Turkish
Training Data     OPUS + Tatoeba
Release Date      April 10, 2021
Best BLEU Score   41.5 (Tatoeba test set)

What is opus-tatoeba-en-tr?

opus-tatoeba-en-tr is a machine translation model developed by Helsinki-NLP for translating from English to Turkish. It is built on the transformer-align architecture and scores 41.5 BLEU and 0.684 chrF on the Tatoeba test set.
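To make the chrF figure easier to interpret, here is a minimal sketch of a sentence-level character n-gram F-score. It is a simplified illustration, not the scoring tool behind the reported 0.684. The function names, the whitespace handling, and the defaults (`max_n=6`, `beta=2.0`) are assumptions chosen for this example.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams; spaces are dropped (a simplifying assumption)."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified chrF: F-beta over n-gram precision and recall averaged across orders."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # skip orders longer than either string
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # beta > 1 weights recall more heavily than precision
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

An identical hypothesis and reference score 1.0, strings with no shared characters score 0.0, and a near match such as "Merhaba dunya" against "Merhaba dünya" falls in between. Because it works on characters rather than whole words, this kind of metric gives partial credit for near-miss word forms, which matters for a morphologically rich language like Turkish.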

Implementation Details

The model uses the transformer-align architecture. Input is normalized and then tokenized with SentencePiece (spm32k,spm32k), meaning a 32k-piece vocabulary on both the source and the target side. It was trained on the OPUS dataset supplemented with back-translated data, as indicated by the '+bt' in the model version.

  • Pre-processing: Normalization + SentencePiece (32k vocabulary)
  • Architecture: Transformer-align
  • Training Data: OPUS corpus + back-translated data
  • Evaluation: Multiple test sets including news and Tatoeba
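The details above can be put to use with a short loading sketch. It assumes the model is published on the Hugging Face Hub under the identifier `Helsinki-NLP/opus-tatoeba-en-tr` and that the `transformers`, `sentencepiece`, and `torch` packages are installed. Neither the identifier nor this loading path is confirmed by the text above, and `translate` is a hypothetical helper name.

```python
# Assumed Hub identifier; adjust it if the model is hosted under another name.
MODEL_ID = "Helsinki-NLP/opus-tatoeba-en-tr"


def translate(texts, model_id=MODEL_ID):
    """Translate a list of English sentences to Turkish (hypothetical helper)."""
    # Imported lazily so that defining the helper does not require the packages.
    from transformers import MarianMTModel, MarianTokenizer

    # The tokenizer applies the SentencePiece segmentation shipped with the model.
    tokenizer = MarianTokenizer.from_pretrained(model_id)
    model = MarianMTModel.from_pretrained(model_id)

    # Pad to a common length so several sentences can be translated in one batch.
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(**batch)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate(["How are you today?"])` downloads the weights on first use and returns a list with one Turkish string. For repeated calls, loading the tokenizer and model once outside the function would avoid reloading them each time.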

Core Capabilities

  • English-to-Turkish translation
  • Strong performance on general-domain text (Tatoeba: 41.5 BLEU)
  • Lower scores on news-domain text (News Test 2017: 22.8 BLEU)
  • Evaluated on more than one text type (everyday sentences and news)

Frequently Asked Questions

Q: What makes this model unique?

The model targets a single direction, English to Turkish, rather than covering many language pairs. It combines the transformer-align architecture with normalization and SentencePiece preprocessing, and it scores 41.5 BLEU on the Tatoeba test set.

Q: What are the recommended use cases?

The model is well-suited for general-purpose English-to-Turkish translation. It performs best on everyday language, as the Tatoeba score shows, and its scores on news content are noticeably lower (22.8 BLEU on News Test 2017), so output for news-style text may need closer review.
