MetricX-23-QE-XL-v2p0
| Property | Value |
|---|---|
| License | Apache 2.0 |
| Framework | PyTorch |
| Base Architecture | mT5 |
| Paper | MetricX-23: The Google Submission to the WMT 2023 Metrics Shared Task |
What is metricx-23-qe-xl-v2p0?
MetricX-23-QE-XL-v2p0 is a reference-free (quality estimation) model for automatic evaluation of machine translations. It was part of Google's submission to the WMT'23 Metrics Shared Task and is the XL variant of the quality estimation models, offering a balance between prediction quality and computational cost.
Implementation Details
The model is built on the mT5 architecture and fine-tuned on a combination of direct assessment and MQM (Multidimensional Quality Metrics) data. It operates in reference-free mode, requiring only the source text and the translation hypothesis to evaluate translation quality. The model outputs scores in the range [0, 25], where lower scores indicate better translation quality (a minimal scoring sketch follows the list below).
- Trained with a maximum input length of 1024 tokens
- Incorporates synthetic data for robust handling of translation edge cases
- Optimized for both system-level and segment-level evaluation
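Exact usage is defined by Google's metricx code release; the snippet below is only a minimal sketch under several assumptions: the `MT5ForRegression` class and its `predictions` output field are taken from that repository, the checkpoint name `google/metricx-23-qe-xl-v2p0` and the `google/mt5-xl` tokenizer are assumed to be the published ones, and the QE input template is assumed to be `candidate: ... source: ...`. Verify all of these against the official code before relying on them.

```python
# Minimal scoring sketch (assumptions noted above; not the official script).
import torch
from transformers import AutoTokenizer
from metricx23.models import MT5ForRegression  # assumed module path from the metricx repo

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("google/mt5-xl")
model = MT5ForRegression.from_pretrained("google/metricx-23-qe-xl-v2p0").to(device)
model.eval()

source = "Dies ist ein Test."
hypothesis = "This is a test."

# Assumed QE input template: candidate translation followed by the source text.
text = f"candidate: {hypothesis} source: {source}"
inputs = tokenizer(
    text,
    max_length=1024,   # matches the training-time maximum input length
    truncation=True,
    return_tensors="pt",
).to(device)

with torch.no_grad():
    outputs = model(**inputs)
    # 'predictions' is the assumed name of the regression output field;
    # scores fall in [0, 25], with lower values indicating better translations.
    score = outputs.predictions.item()

print(f"MetricX-23 QE score: {score:.3f} (lower = better)")
```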
Core Capabilities
- Reference-free translation quality assessment
- Robust detection of translation issues like under/over-translation
- System-level correlation of 0.684 for English-German translations
- Segment-level Pearson correlation of 0.421 for English-German pairs
- Handles multiple language pairs effectively
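The segment-level and system-level figures above refer to two different ways of correlating metric output with human judgments. The toy illustration below (with made-up scores, not WMT'23 data) shows the difference: segment-level correlation compares scores segment by segment, while system-level correlation first averages each system's segments.

```python
# Toy illustration of segment-level vs. system-level correlation.
# All scores below are hypothetical; lower = better on both scales.
from statistics import mean
from scipy.stats import pearsonr

# Per-segment MetricX-style error scores for four hypothetical MT systems.
metric_scores = {
    "sys_a": [1.2, 3.5, 0.8],
    "sys_b": [2.9, 4.0, 1.5],
    "sys_c": [0.6, 1.1, 0.9],
    "sys_d": [5.2, 6.4, 4.8],
}
# Corresponding (hypothetical) human MQM error scores for the same segments.
human_scores = {
    "sys_a": [1.0, 4.2, 0.5],
    "sys_b": [3.1, 3.9, 2.0],
    "sys_c": [0.4, 1.5, 0.7],
    "sys_d": [6.0, 5.9, 5.1],
}

# Segment level: correlate scores segment by segment across all systems.
seg_metric = [s for sys in metric_scores for s in metric_scores[sys]]
seg_human = [s for sys in human_scores for s in human_scores[sys]]
seg_r, _ = pearsonr(seg_metric, seg_human)

# System level: average each system's segments first, then correlate the means.
sys_metric = [mean(v) for v in metric_scores.values()]
sys_human = [mean(v) for v in human_scores.values()]
sys_r, _ = pearsonr(sys_metric, sys_human)

print(f"segment-level Pearson r = {seg_r:.3f}")
print(f"system-level  Pearson r = {sys_r:.3f}")
```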
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its ability to evaluate translation quality without requiring reference translations, making it particularly valuable for real-world applications where reference translations may not be available. It's also trained with specialized synthetic data to handle common translation issues effectively.
Q: What are the recommended use cases?
The model is ideal for production environments requiring automatic quality estimation of translations, particularly when reference translations are unavailable. It's best suited for scenarios where a balance between computational efficiency and accuracy is needed, as it represents the middle-tier XL version of the MetricX-23 QE family.