mBART-Large-50
Property | Value |
---|---|
Author | Facebook |
License | MIT |
Paper | View Research Paper |
Downloads | 25,122 |
Languages Supported | 50 |
What is mbart-large-50?
mBART-Large-50 is a multilingual sequence-to-sequence model designed for translation tasks across 50 different languages. Developed by Facebook, it extends the original mBART architecture, which covered 25 languages, with support for 25 additional languages, making it one of the most comprehensive multilingual translation models available.
Implementation Details
The model uses a multilingual denoising pretraining approach in which it processes concatenated data from all supported languages. Two noising schemes are applied: shuffling the order of sentences and span masking, in which 35% of the words in each instance are masked by sampling span lengths from a Poisson distribution (λ = 3.5). The model requires special input formatting, with a language ID token prefixed to both the source and the target text (see the sketch after the list below).
- Supports both PyTorch and TensorFlow implementations
- Utilizes special language ID tokens for text formatting
- Implements sentence shuffling and span masking techniques
- Designed for fine-tuning on translation tasks
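Below is a minimal sketch of this formatting in practice, assuming the Hugging Face transformers library (MBart50TokenizerFast and MBartForConditionalGeneration); the English–Romanian sentence pair is only an illustration.

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50")
tokenizer = MBart50TokenizerFast.from_pretrained(
    "facebook/mbart-large-50", src_lang="en_XX", tgt_lang="ro_RO"
)

src_text = "UN Chief Says There Is No Military Solution in Syria"
tgt_text = "Şeful ONU declară că nu există o soluţie militară în Siria"

# The tokenizer prepends the language ID tokens (en_XX for the source,
# ro_RO for the labels), matching the formatting described above.
batch = tokenizer(src_text, text_target=tgt_text, return_tensors="pt")

# A single fine-tuning step would use the seq2seq loss from this forward pass.
loss = model(**batch).loss
```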
Core Capabilities
- Multilingual translation across 50 languages
- Sequence-to-sequence transformation
- Text generation and understanding
- Cross-lingual transfer learning
- Support for low-resource languages
Frequently Asked Questions
Q: What makes this model unique?
The model's ability to handle 50 different languages with a single set of weights and its multilingual fine-tuning approach make it unique. Instead of training on a single language pair, it can be fine-tuned on multiple translation directions at once, making it highly versatile for multilingual applications.
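As a rough sketch of that idea (assuming the Hugging Face transformers tokenizer and data collator, with made-up example sentences), a mixed-direction batch can be built by switching the language codes per example:

```python
from transformers import MBart50TokenizerFast, DataCollatorForSeq2Seq

tokenizer = MBart50TokenizerFast.from_pretrained("facebook/mbart-large-50")

# Hypothetical mixed-direction examples: each one carries its own source and
# target language codes, so a single fine-tuning run covers several directions.
pairs = [
    ("en_XX", "ro_RO", "Hello, how are you?", "Bună, ce mai faci?"),
    ("fr_XX", "en_XX", "Bonjour le monde.", "Hello world."),
]

features = []
for src_lang, tgt_lang, src_text, tgt_text in pairs:
    tokenizer.src_lang = src_lang  # language ID token prepended to the source
    tokenizer.tgt_lang = tgt_lang  # language ID token prepended to the labels
    features.append(tokenizer(src_text, text_target=tgt_text))

# Pad everything into one batch (labels are padded with -100 and ignored by the loss).
collator = DataCollatorForSeq2Seq(tokenizer, padding=True)
batch = collator(features)
```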
Q: What are the recommended use cases?
The model is primarily designed for machine translation tasks but can be fine-tuned for various multilingual sequence-to-sequence tasks. It's particularly useful for organizations needing to handle translations across multiple language pairs with a single model.
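For example, a minimal inference sketch with one of the already fine-tuned variants (assuming the facebook/mbart-large-50-many-to-many-mmt checkpoint and the Hugging Face transformers API) could look like this:

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

# Assumed checkpoint: the many-to-many fine-tuned variant. The base
# facebook/mbart-large-50 model is intended to be fine-tuned first.
ckpt = "facebook/mbart-large-50-many-to-many-mmt"
model = MBartForConditionalGeneration.from_pretrained(ckpt)
tokenizer = MBart50TokenizerFast.from_pretrained(ckpt)

tokenizer.src_lang = "fr_XX"
encoded = tokenizer("Bonjour le monde.", return_tensors="pt")

# Force the first generated token to be the target-language ID (English here).
generated = model.generate(
    **encoded, forced_bos_token_id=tokenizer.convert_tokens_to_ids("en_XX")
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```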