mt0-xxl

bigscience

A 13.9B-parameter multilingual text-to-text model built on mT5-XXL (pretrained on 101 languages) and fine-tuned on the xP3 dataset to follow instructions across a range of tasks.

Property | Value
Parameter Count | 13.9B
License | Apache 2.0
Paper | Crosslingual Generalization through Multitask Finetuning
Languages Supported | 101 languages
Training Data | xP3 and mC4 datasets

What is mt0-xxl?

mT0-XXL is a large multilingual text-to-text transformer developed by BigScience. It starts from the pretrained mT5-XXL checkpoint, whose mC4 pretraining data covers 101 languages, and is fine-tuned on the xP3 multitask dataset. That fine-tuning lets it follow human instructions zero-shot in dozens of languages across a variety of language tasks.

Implementation Details

The model was fine-tuned on TPUv4-256 hardware in bfloat16 precision for 7,000 steps, covering 1.29 billion tokens. Training was orchestrated with the T5X framework, with JAX handling the neural network operations.

  • Architecture identical to mT5-XXL
  • Fine-tuned on TPUv4-256 hardware
  • bfloat16 precision for efficient computation
  • T5X for orchestration, JAX for neural network operations
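As a back-of-the-envelope check on the figures above, the sketch below derives two quantities the text does not state directly: the approximate size of the weights in bfloat16 and the average number of tokens per fine-tuning step. It assumes 2 bytes per parameter and counts the weights only (no activations or framework overhead); the constant and function names are illustrative.

```python
# Figures quoted on this page.
PARAMETERS = 13.9e9        # parameter count
FINETUNE_TOKENS = 1.29e9   # tokens seen during fine-tuning
FINETUNE_STEPS = 7_000     # fine-tuning steps
BYTES_PER_BF16 = 2         # bfloat16 stores each value in 16 bits


def weight_memory_gb(n_params: float, bytes_per_param: int = BYTES_PER_BF16) -> float:
    """Approximate memory for the weights alone, in GB (10**9 bytes)."""
    return n_params * bytes_per_param / 1e9


def tokens_per_step(total_tokens: float, steps: int) -> float:
    """Average number of tokens processed per fine-tuning step."""
    return total_tokens / steps


if __name__ == "__main__":
    print(f"bf16 weights: ~{weight_memory_gb(PARAMETERS):.1f} GB")
    print(f"tokens per step: ~{tokens_per_step(FINETUNE_TOKENS, FINETUNE_STEPS):,.0f}")
```

Under these assumptions the weights alone come to roughly 27.8 GB, and each step covers about 184,000 tokens on average, which is worth keeping in mind when choosing hardware for inference.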

Core Capabilities

  • Multilingual text generation and translation
  • Cross-lingual task generalization
  • Zero-shot learning across languages
  • Natural language instruction following
  • High performance on various NLP benchmarks
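The capabilities listed above are all reached the same way: a natural-language instruction goes in as text and the answer comes out as text. The following is a minimal sketch of that loop using the Hugging Face `transformers` library. The Hub id `bigscience/mt0-xxl` is inferred from the model name and author shown on this page, and the function name `run_prompt` and the `max_new_tokens` default are illustrative choices, not part of the model's documentation.

```python
CHECKPOINT = "bigscience/mt0-xxl"  # Hub id inferred from the model name and author


def run_prompt(prompt: str, checkpoint: str = CHECKPOINT, max_new_tokens: int = 64) -> str:
    """Send one instruction to the model and return the decoded answer."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

    # Encode the instruction, generate output token ids, and decode them to text.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `run_prompt("Translate to English: Je t'aime.")` covers translation; changing only the instruction text switches the task. Note that the first call downloads the full checkpoint, and the sketch reloads the model on every call for simplicity, so a real application would load it once and reuse it.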

Frequently Asked Questions

Q: What makes this model unique?

mT0-XXL stands out for its zero-shot cross-lingual generalization: it can follow instructions in multiple languages without task-specific training. It has been fine-tuned on a diverse set of tasks and languages, which makes it particularly well suited to multilingual applications.

Q: What are the recommended use cases?

The model is suited to tasks such as translation, text generation, sentiment analysis, and question answering across multiple languages. Prompts work best when they are clear and make it obvious where the input ends, for example by closing with proper punctuation; otherwise the model may try to continue the input instead of answering it.
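The punctuation advice can be applied mechanically before a prompt is sent to the model. The helper below is a small sketch of one way to do that: it trims trailing whitespace and appends a full stop when the prompt does not already end in terminal punctuation. The function name, the default mark, and the set of punctuation characters are assumptions chosen for illustration and are not exhaustive for all languages.

```python
# Characters treated as "the input ends here" (illustrative, not exhaustive).
TERMINAL_PUNCTUATION = (".", "?", "!", ":", "。", "？", "！", "：", "।", "؟")


def close_prompt(prompt: str, default_mark: str = ".") -> str:
    """Return the prompt with a clear ending, adding default_mark if needed."""
    text = prompt.rstrip()
    if not text:
        raise ValueError("prompt is empty")
    # str.endswith accepts a tuple and matches any of its entries.
    if text.endswith(TERMINAL_PUNCTUATION):
        return text
    return text + default_mark
```

For example, `close_prompt("Translate to English: Je t'aime")` returns the same text with a full stop appended, while a prompt that already ends in a question mark is returned unchanged apart from trailing whitespace.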
