metricx-24-hybrid-xxl-v2p6-bfloat16

Maintained By
google

MetricX-24 Hybrid XXL (bfloat16)

Property               Value
Author                 Google
Model Type             Translation Evaluation
GitHub                 Repository Link
Average Correlation    0.716 (Best in class)

What is metricx-24-hybrid-xxl-v2p6-bfloat16?

MetricX-24 Hybrid XXL is Google's state-of-the-art model for automatic evaluation of translations, submitted to the WMT'24 Metrics Shared Task. This bfloat16 variant offers the same capabilities as the full-precision model with a smaller memory footprint. As a hybrid model, it performs both reference-based and reference-free (quality estimation) evaluation of translations.
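The memory saving from bfloat16 follows from simple arithmetic: a bfloat16 weight takes 2 bytes, a float32 weight takes 4. The sketch below estimates weight storage only (no activations or runtime overhead). The parameter count is an assumption for illustration, not a figure from this page; check the checkpoint itself for the real number.

```python
# Bytes needed to store one parameter in each precision.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Return the approximate size of the model weights in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


# ASSUMPTION: roughly 13 billion parameters, used only for illustration.
ASSUMED_PARAMS = 13_000_000_000

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gib(ASSUMED_PARAMS, dtype):.1f} GiB")
```

Whatever the true parameter count is, the bfloat16 weights take half the space of float32 weights.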

Implementation Details

The model is initialized with mT5 and fine-tuned on a combination of direct assessment and MQM data from WMT'15-'22. Predicted scores are clipped to the range 0 to 25, where a lower score indicates a better translation. The training data also includes additional synthetic examples for handling multi-sentence segments.
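Score clipping can be illustrated with a few lines of Python. This is a minimal sketch of the idea, not the model's own code; the function name `clip_score` is hypothetical, and only the 0 to 25 bounds come from the description above.

```python
def clip_score(raw: float, low: float = 0.0, high: float = 25.0) -> float:
    """Clamp a raw regression output into the [low, high] score range."""
    return max(low, min(high, raw))


# Raw outputs outside the range are pulled back to the nearest bound;
# values already inside the range pass through unchanged.
for raw in (-1.5, 3.2, 30.0):
    print(raw, "->", clip_score(raw))
```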

  • Supports hybrid evaluation modes (reference-based and reference-free)
  • Achieves 0.865 system-level correlation for en-de translations
  • Uses bfloat16 weights for memory-efficient inference
  • Trained on WMT'15-'22 direct assessment and MQM data
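The two evaluation modes differ only in whether a reference translation is supplied. The sketch below shows one way to build the model input for either mode. The helper name `build_input` and the exact field labels (`source:`, `candidate:`, `reference:`) are assumptions for illustration; confirm the expected format against the official repository before relying on it.

```python
from typing import Optional


def build_input(source: str, hypothesis: str, reference: Optional[str] = None) -> str:
    """Build one input string for reference-based or reference-free scoring.

    ASSUMPTION: the field labels below are illustrative, not confirmed here.
    """
    text = f"source: {source} candidate: {hypothesis}"
    if reference:
        # Reference-based mode: append the human reference translation.
        text += f" reference: {reference}"
    return text


# Reference-free (quality estimation) mode: source and hypothesis only.
print(build_input("Guten Morgen.", "Good morning."))

# Reference-based mode: the same pair plus a reference.
print(build_input("Guten Morgen.", "Good morning.", "Good morning!"))
```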

Core Capabilities

  • High-accuracy translation quality assessment
  • Multi-language support including en-de, en-es, and ja-zh pairs
  • Segment-level and system-level evaluation
  • Handles both single-sentence and multi-sentence segments
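Segment-level scores can be rolled up into system-level scores. The sketch below averages the per-segment scores of each translation system. Averaging is a common convention assumed here, not something this page confirms, and the function name and sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def system_scores(segment_scores: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Average (system_name, segment_score) pairs into one score per system."""
    by_system: Dict[str, List[float]] = defaultdict(list)
    for system, score in segment_scores:
        by_system[system].append(score)
    return {system: mean(scores) for system, scores in by_system.items()}


# Made-up segment scores for two hypothetical systems.
scores = [("system_a", 1.0), ("system_a", 3.0), ("system_b", 5.0)]
print(system_scores(scores))
```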

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its hybrid capabilities, allowing both reference-based and reference-free evaluation in a single model. It achieves state-of-the-art correlation with human judgments and includes special handling for multi-sentence translations.

Q: What are the recommended use cases?

The XXL variant is recommended for applications requiring the highest agreement with human judgments of translation quality. It's particularly suitable for professional translation evaluation systems and research applications where accuracy is paramount.
