MetricX-23-QE-XL-v2p0
| Property | Value |
|---|---|
| License | Apache 2.0 |
| Framework | PyTorch |
| Base Architecture | mT5 |
| Paper | MetricX-23: The Google Submission to the WMT 2023 Metrics Shared Task |
What is metricx-23-qe-xl-v2p0?
MetricX-23-QE-XL-v2p0 is a reference-free (quality estimation) model for automatic evaluation of machine translations. It was part of Google's submission to the WMT'23 Metrics Shared Task and is the XL variant of the quality estimation models, offering a balance between prediction quality and computational cost.
Implementation Details
The model is built on the mT5 architecture and fine-tuned on a combination of direct assessment and MQM (Multidimensional Quality Metrics) data. It operates in reference-free mode, requiring only the source text and the translation hypothesis to evaluate translation quality. The model outputs scores in the range [0, 25], where lower scores indicate better translation quality (a minimal scoring sketch follows the list below).
- Trained with a maximum input length of 1024 tokens
- Incorporates synthetic data for robust handling of translation edge cases
- Optimized for both system-level and segment-level evaluation
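Exact usage is defined by Google's metricx code release; the snippet below is only a minimal sketch under several assumptions: the `MT5ForRegression` class and its `predictions` output field are taken from that repository, the checkpoint name `google/metricx-23-qe-xl-v2p0` and the `google/mt5-xl` tokenizer are assumed to be the published ones, and the QE input template is assumed to be `candidate: ... source: ...`. Verify all of these against the official code before relying on them.

```python
# Minimal scoring sketch (assumptions noted above; not the official script).
import torch
from transformers import AutoTokenizer
from metricx23.models import MT5ForRegression  # assumed module path from the metricx repo

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("google/mt5-xl")
model = MT5ForRegression.from_pretrained("google/metricx-23-qe-xl-v2p0").to(device)
model.eval()

source = "Dies ist ein Test."
hypothesis = "This is a test."

# Assumed QE input template: candidate translation followed by the source text.
text = f"candidate: {hypothesis} source: {source}"
inputs = tokenizer(
    text,
    max_length=1024,   # matches the training-time maximum input length
    truncation=True,
    return_tensors="pt",
).to(device)

with torch.no_grad():
    outputs = model(**inputs)
    # 'predictions' is the assumed name of the regression output field;
    # scores fall in [0, 25], with lower values indicating better translations.
    score = outputs.predictions.item()

print(f"MetricX-23 QE score: {score:.3f} (lower = better)")
```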
Core Capabilities
- Reference-free translation quality assessment
- Robust detection of translation issues like under/over-translation
- System-level correlation of 0.684 for English-German translations
- Segment-level Pearson correlation of 0.421 for English-German pairs
- Handles multiple language pairs effectively
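The segment-level and system-level figures above refer to two different ways of correlating metric output with human judgments. The toy illustration below (with made-up scores, not WMT'23 data) shows the difference: segment-level correlation compares scores segment by segment, while system-level correlation first averages each system's segments.

```python
# Toy illustration of segment-level vs. system-level correlation.
# All scores below are hypothetical; lower = better on both scales.
from statistics import mean
from scipy.stats import pearsonr

# Per-segment MetricX-style error scores for four hypothetical MT systems.
metric_scores = {
    "sys_a": [1.2, 3.5, 0.8],
    "sys_b": [2.9, 4.0, 1.5],
    "sys_c": [0.6, 1.1, 0.9],
    "sys_d": [5.2, 6.4, 4.8],
}
# Corresponding (hypothetical) human MQM error scores for the same segments.
human_scores = {
    "sys_a": [1.0, 4.2, 0.5],
    "sys_b": [3.1, 3.9, 2.0],
    "sys_c": [0.4, 1.5, 0.7],
    "sys_d": [6.0, 5.9, 5.1],
}

# Segment level: correlate scores segment by segment across all systems.
seg_metric = [s for sys in metric_scores for s in metric_scores[sys]]
seg_human = [s for sys in human_scores for s in human_scores[sys]]
seg_r, _ = pearsonr(seg_metric, seg_human)

# System level: average each system's segments first, then correlate the means.
sys_metric = [mean(v) for v in metric_scores.values()]
sys_human = [mean(v) for v in human_scores.values()]
sys_r, _ = pearsonr(sys_metric, sys_human)

print(f"segment-level Pearson r = {seg_r:.3f}")
print(f"system-level  Pearson r = {sys_r:.3f}")
```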
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its ability to evaluate translation quality without requiring reference translations, making it particularly valuable for real-world applications where reference translations may not be available. It's also trained with specialized synthetic data to handle common translation issues effectively.
Q: What are the recommended use cases?
The model is ideal for production environments requiring automatic quality estimation of translations, particularly when reference translations are unavailable. It's best suited for scenarios where a balance between computational efficiency and accuracy is needed, as it represents the middle-tier XL version of the MetricX-23 QE family.