mt0-xxl-mt

Maintained By
bigscience

MT0-XXL-MT Model

Property: Value
Parameter Count: 13.9B
License: Apache 2.0
Paper: Crosslingual Generalization through Multitask Finetuning
Languages Supported: 101 languages
Training Data: mC4 and xP3mt datasets

What is mt0-xxl-mt?

MT0-XXL-MT is a large-scale multilingual text-to-text transformer model, part of the BLOOMZ & mT0 family. It is designed for multilingual instruction following and generation. The model was created by fine-tuning the pretrained mT5-XXL model on the xP3mt dataset, a multitask prompt collection with machine-translated prompts, which gives it strong cross-lingual generalization. Its 101-language coverage comes from the mC4 corpus used to pretrain mT5.
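As a rough illustration of how a model like this is typically used, the sketch below loads the checkpoint through the Hugging Face transformers library and generates text from a prompt. The Hub ID `bigscience/mt0-xxl-mt` is inferred from the maintainer and model name above, and the helper names `load_model` and `generate` are hypothetical. It assumes `torch`, `transformers`, and `accelerate` are installed and that enough accelerator memory is available for a model of this size.

```python
# Hub ID assumed from the maintainer ("bigscience") and model name on this page.
MODEL_ID = "bigscience/mt0-xxl-mt"


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and model (hypothetical helper, not part of any library)."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # matches the precision used for fine-tuning
        device_map="auto",           # requires the `accelerate` package
    )
    return tokenizer, model


def generate(tokenizer, model, prompt: str, max_new_tokens: int = 32) -> str:
    """Encode a prompt, run generation, and decode the first returned sequence."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example usage (downloads the full checkpoint, so it is left commented out):
# tokenizer, model = load_model()
# print(generate(tokenizer, model, "Translate to English: Je t'aime."))
```

Loading is kept separate from generation so the weights are fetched once and reused across prompts.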

Implementation Details

The model uses the mT5-XXL encoder-decoder transformer architecture with 13.9B parameters. It was fine-tuned for 7,000 steps on 1.29 billion tokens in bfloat16 precision on TPUv4-256 hardware. Training was orchestrated with the T5X framework, with JAX handling the neural network operations.

  • Architecture based on mT5-XXL design
  • Trained on TPUv4-256 infrastructure
  • Fine-tuned in bfloat16 precision
  • Uses mT5's SentencePiece tokenizer for multilingual support
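The figures above allow a quick back-of-the-envelope check. The sketch below derives the approximate size of the weights in bfloat16 and the average number of tokens consumed per fine-tuning step. It uses only the numbers stated on this page. The 2-bytes-per-parameter figure is the standard size of a bfloat16 value, and the estimate ignores activations and any other runtime overhead.

```python
PARAMS = 13.9e9           # parameter count stated above
BYTES_PER_BF16 = 2        # a bfloat16 value occupies 16 bits
FINETUNE_TOKENS = 1.29e9  # fine-tuning tokens stated above
FINETUNE_STEPS = 7000     # fine-tuning steps stated above

# Memory needed just to hold the weights, in gigabytes (10^9 bytes).
weights_gb = PARAMS * BYTES_PER_BF16 / 1e9

# Average number of tokens processed per optimizer step.
tokens_per_step = FINETUNE_TOKENS / FINETUNE_STEPS

print(f"bfloat16 weights: ~{weights_gb:.1f} GB")
print(f"tokens per step:  ~{tokens_per_step:,.0f}")
```

This works out to roughly 27.8 GB for the weights alone and about 184,000 tokens per step.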

Core Capabilities

  • Multilingual text generation across 101 languages
  • Zero-shot cross-lingual task generalization
  • Natural language instruction following
  • Translation and multilingual text understanding
  • Sentiment analysis and question answering
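Because every capability listed here is expressed as plain text in and plain text out, each task reduces to a differently worded prompt. The sketch below shows one way to organize that. The template wording, the `TEMPLATES` table, and the `make_prompt` helper are all illustrative assumptions, not prompts taken from the training data.

```python
# Hypothetical prompt templates, one per capability; the wording is illustrative only.
TEMPLATES = {
    "translation": "Translate to {target_language}: {text}",
    "sentiment": "Is the sentiment of this review positive or negative? Review: {text}",
    "question_answering": "Question: {text} Answer:",
}


def make_prompt(task: str, **fields: str) -> str:
    """Fill the template for `task` with the given fields."""
    if task not in TEMPLATES:
        raise ValueError(f"unknown task: {task!r}")
    return TEMPLATES[task].format(**fields)


print(make_prompt("translation", target_language="English", text="Je t'aime."))
print(make_prompt("sentiment", text="Este producto es excelente."))
```

The resulting strings can be passed to the model unchanged, since a text-to-text model needs no task-specific heads or output formats.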

Frequently Asked Questions

Q: What makes this model unique?

The model can understand and generate text across 101 languages while following natural-language instructions, including instructions written in languages other than English, which makes it well suited to multilingual applications. Fine-tuning on the xP3mt dataset enables zero-shot generalization to unseen tasks and languages.

Q: What are the recommended use cases?

The model is best suited to tasks expressed in natural language, including translation, cross-lingual question answering, sentiment analysis, and general text generation. It works best with clear, well-structured prompts that include the relevant context and state the input and output languages explicitly.
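One possible way to keep prompts clear and well structured is to assemble them from explicit parts: optional context, the instruction with its input, and an optional output-language specification. The helper below is a minimal sketch of that idea. The function name `build_prompt`, its parameters, and the choice to end every part with terminal punctuation are assumptions made for illustration, not a documented requirement of the model.

```python
def _terminate(sentence: str) -> str:
    """Strip whitespace and make sure the text ends with terminal punctuation."""
    sentence = sentence.strip()
    if not sentence.endswith((".", "?", "!")):
        sentence += "."
    return sentence


def build_prompt(instruction, text, context=None, answer_language=None):
    """Assemble a prompt from optional context, an instruction with its input,
    and an optional output-language specification (hypothetical helper)."""
    parts = []
    if context:
        parts.append(_terminate(f"Context: {context}"))
    parts.append(_terminate(f"{instruction.strip()} {text.strip()}"))
    if answer_language:
        parts.append(f"Answer in {answer_language}.")
    return " ".join(parts)


print(build_prompt("Translate to English:", "Je t'aime"))
print(build_prompt(
    "Is this review positive or negative?",
    "Muy bueno",
    context="A product review",
    answer_language="Spanish",
))
```

Ending each part with punctuation marks where the input stops, which helps keep the model from continuing the prompt instead of answering it.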
