mt0-small

Maintained By
bigscience

MT0-Small: Multilingual Text-to-Text Transfer Transformer

Property              Value
Parameter Count       300M
Model Type            Text-to-Text Generation
License               Apache 2.0
Paper                 Crosslingual Generalization through Multitask Finetuning
Languages Supported   101 languages

What is mt0-small?

MT0-Small is a compact multilingual text-to-text transformer model developed by BigScience. It belongs to the BLOOMZ & mT0 family, which is designed for cross-lingual task generalization. The model was fine-tuned on the xP3 dataset and can understand and generate text in 101 languages with a footprint of about 300M parameters.
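As a rough illustration of what a 300M-parameter footprint means in practice, the sketch below estimates the memory needed for the weights alone. It assumes the rounded 300M figure quoted for the model and the standard widths of 4 bytes per float32 value and 2 bytes per bfloat16 value. Activations and runtime overhead are not counted, and the helper name is hypothetical.

```python
# Back-of-the-envelope estimate of weight memory; not a measured figure.
PARAM_COUNT = 300_000_000  # rounded parameter count quoted for mt0-small

# Bytes per parameter for two common dtypes.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weight_memory_mb(param_count: int, dtype: str) -> float:
    """Return the approximate size of the weights in megabytes (10**6 bytes)."""
    return param_count * BYTES_PER_PARAM[dtype] / 1_000_000


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_mb(PARAM_COUNT, dtype):.0f} MB")
```

Under these assumptions the weights take roughly 1.2 GB in float32 and 0.6 GB in bfloat16.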

Implementation Details

The model uses the mT5-small architecture and was fine-tuned on TPUv4-64 hardware in bfloat16 precision. Fine-tuning ran for 25,000 steps over 4.62 billion tokens, using the T5X framework with JAX for neural network operations.

  • Architecture based on the mT5-small design
  • Fine-tuned on TPUv4-64 hardware
  • Uses bfloat16 precision for efficient computation
  • Uses the T5X and JAX frameworks for training
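The step and token counts quoted above imply an average number of tokens processed per fine-tuning step. The short calculation below derives it. The result is only an average implied by those two figures, not a reported batch size, and the function name is hypothetical.

```python
# Figures quoted in the text above.
FINETUNING_TOKENS = 4_620_000_000  # 4.62 billion tokens
FINETUNING_STEPS = 25_000


def average_tokens_per_step(total_tokens: int, steps: int) -> float:
    """Average tokens consumed per optimization step."""
    return total_tokens / steps


avg = average_tokens_per_step(FINETUNING_TOKENS, FINETUNING_STEPS)
print(f"{avg:,.0f} tokens per step on average")
```

This works out to 184,800 tokens per step on average.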

Core Capabilities

  • Multilingual text generation across 101 languages
  • Zero-shot task generalization
  • Natural language instruction following
  • Cross-lingual transfer learning
  • Support for translation, summarization, and question-answering tasks
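A minimal sketch of prompting the model for these tasks, assuming the checkpoint is available on the Hugging Face Hub under the name bigscience/mt0-small and that the transformers library is installed with a PyTorch backend. The helper run_prompt and the example prompts are illustrative and not part of any official API.

```python
# Illustrative prompts, one per task; the exact wording is an assumption.
EXAMPLE_PROMPTS = {
    "translation": "Translate to English: Je t'aime.",
    "question_answering": "Answer in English: ¿Cuál es la capital de Francia?",
}


def run_prompt(
    prompt: str,
    model_name: str = "bigscience/mt0-small",  # assumed Hub identifier
    max_new_tokens: int = 32,
) -> str:
    """Generate a completion for one natural-language prompt."""
    # Imported lazily so the prompts above can be inspected without
    # transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # Encode the prompt, generate output token ids, decode back to text.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `run_prompt(EXAMPLE_PROMPTS["translation"])` downloads the checkpoint on first use. Output quality from a model of this size can vary, so no expected output is shown here.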

Frequently Asked Questions

Q: What makes this model unique?

MT0-Small combines a compact size with coverage of 101 languages, which makes it well suited to resource-constrained applications. Because it was fine-tuned on multitask data, it can follow instructions zero-shot, including for tasks and languages it was not explicitly fine-tuned on.

Q: What are the recommended use cases?

The model is suited to tasks expressed in natural language, including translation, sentiment analysis, and question answering. It works best when given clear, well-structured prompts that state the task explicitly and specify the language of the expected output.
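One way to keep prompts explicit about both the task and the output language is to assemble them with a small helper. The sketch below is only an illustration of that advice. The function build_prompt and its "instruction in language: text" template are assumptions, not a format required by the model.

```python
from typing import Optional


def build_prompt(
    instruction: str,
    text: str,
    target_language: Optional[str] = None,
) -> str:
    """Combine a task instruction, an optional output language, and the input."""
    # Normalize the instruction so the template adds exactly one colon.
    instruction = instruction.strip().rstrip(":")
    if target_language:
        instruction = f"{instruction} in {target_language}"
    return f"{instruction}: {text.strip()}"


print(build_prompt("Translate to English", "Je t'aime."))
print(build_prompt("Answer the question", "¿Cuál es la capital de Francia?", "Spanish"))
```

The first call produces "Translate to English: Je t'aime." and the second produces "Answer the question in Spanish: ¿Cuál es la capital de Francia?".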
