mBART-Large-50
Property | Value |
---|---|
Author | Facebook |
License | MIT |
Paper | View Research Paper |
Downloads | 25,122 |
Languages Supported | 50 |
What is mbart-large-50?
mBART-Large-50 is a multilingual sequence-to-sequence model designed for translation tasks across 50 different languages. Developed by Facebook, it extends the original mBART architecture, which covered 25 languages, with support for 25 additional languages, making it one of the most comprehensive multilingual translation models available.
Implementation Details
The model uses a multilingual denoising pretraining approach in which it processes concatenated data from all supported languages. Two noising schemes are applied: shuffling the order of sentences and span masking, in which 35% of the words in each instance are masked by sampling span lengths from a Poisson distribution (λ = 3.5). The model requires special input formatting, with a language ID token prefixed to both the source and the target text (see the sketch after the list below).
- Supports both PyTorch and TensorFlow implementations
- Utilizes special language ID tokens for text formatting
- Implements sentence shuffling and span masking techniques
- Designed for fine-tuning on translation tasks
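Below is a minimal sketch of this formatting in practice, assuming the Hugging Face transformers library (MBart50TokenizerFast and MBartForConditionalGeneration); the English–Romanian sentence pair is only an illustration.

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50")
tokenizer = MBart50TokenizerFast.from_pretrained(
    "facebook/mbart-large-50", src_lang="en_XX", tgt_lang="ro_RO"
)

src_text = "UN Chief Says There Is No Military Solution in Syria"
tgt_text = "Şeful ONU declară că nu există o soluţie militară în Siria"

# The tokenizer prepends the language ID tokens (en_XX for the source,
# ro_RO for the labels), matching the formatting described above.
batch = tokenizer(src_text, text_target=tgt_text, return_tensors="pt")

# A single fine-tuning step would use the seq2seq loss from this forward pass.
loss = model(**batch).loss
```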
Core Capabilities
- Multilingual translation across 50 languages
- Sequence-to-sequence transformation
- Text generation and understanding
- Cross-lingual transfer learning
- Support for low-resource languages
Frequently Asked Questions
Q: What makes this model unique?
The model's ability to handle 50 different languages with a single set of weights and its multilingual fine-tuning approach make it unique. Instead of training on a single language pair, it can be fine-tuned on multiple translation directions at once, making it highly versatile for multilingual applications.
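As a rough sketch of that idea (assuming the Hugging Face transformers tokenizer and data collator, with made-up example sentences), a mixed-direction batch can be built by switching the language codes per example:

```python
from transformers import MBart50TokenizerFast, DataCollatorForSeq2Seq

tokenizer = MBart50TokenizerFast.from_pretrained("facebook/mbart-large-50")

# Hypothetical mixed-direction examples: each one carries its own source and
# target language codes, so a single fine-tuning run covers several directions.
pairs = [
    ("en_XX", "ro_RO", "Hello, how are you?", "Bună, ce mai faci?"),
    ("fr_XX", "en_XX", "Bonjour le monde.", "Hello world."),
]

features = []
for src_lang, tgt_lang, src_text, tgt_text in pairs:
    tokenizer.src_lang = src_lang  # language ID token prepended to the source
    tokenizer.tgt_lang = tgt_lang  # language ID token prepended to the labels
    features.append(tokenizer(src_text, text_target=tgt_text))

# Pad everything into one batch (labels are padded with -100 and ignored by the loss).
collator = DataCollatorForSeq2Seq(tokenizer, padding=True)
batch = collator(features)
```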
Q: What are the recommended use cases?
The model is primarily designed for machine translation tasks but can be fine-tuned for various multilingual sequence-to-sequence tasks. It's particularly useful for organizations needing to handle translations across multiple language pairs with a single model.
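For example, a minimal inference sketch with one of the already fine-tuned variants (assuming the facebook/mbart-large-50-many-to-many-mmt checkpoint and the Hugging Face transformers API) could look like this:

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

# Assumed checkpoint: the many-to-many fine-tuned variant. The base
# facebook/mbart-large-50 model is intended to be fine-tuned first.
ckpt = "facebook/mbart-large-50-many-to-many-mmt"
model = MBartForConditionalGeneration.from_pretrained(ckpt)
tokenizer = MBart50TokenizerFast.from_pretrained(ckpt)

tokenizer.src_lang = "fr_XX"
encoded = tokenizer("Bonjour le monde.", return_tensors="pt")

# Force the first generated token to be the target-language ID (English here).
generated = model.generate(
    **encoded, forced_bos_token_id=tokenizer.convert_tokens_to_ids("en_XX")
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```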