NLLB-MoE-54B
| Property | Value |
|---|---|
| License | CC-BY-NC-4.0 |
| Paper | No Language Left Behind |
| Supported Languages | 196 |
| Framework | PyTorch |
What is NLLB-MoE-54B?
NLLB-MoE-54B is a multilingual translation model from Meta AI (formerly Facebook AI), built on a Mixture-of-Experts (MoE) architecture. It supports 196 languages written in a variety of scripts. During training it uses Expert Output Masking (EOM), a regularization technique that drops the full expert contribution for some tokens.
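The idea behind Expert Output Masking can be illustrated with a small, framework-free sketch. It is not the model's actual training code: the function names, the list-of-lists token representation, and the `p_eom` parameter name are assumptions made for illustration, and any rescaling the real implementation may apply is left out.

```python
import random


def expert_output_masking(expert_outputs, p_eom, rng, training=True):
    """Zero the combined expert output for a random subset of tokens.

    expert_outputs: one vector (list of floats) per token, standing in for
    the MoE layer output before it is added to the residual stream.
    p_eom: probability of masking a given token (hypothetical name).
    """
    if not training or p_eom <= 0.0:
        # No masking outside training or when the probability is zero.
        return [list(vec) for vec in expert_outputs]
    masked = []
    for vec in expert_outputs:
        if rng.random() < p_eom:
            # Drop the full expert contribution for this token.
            masked.append([0.0] * len(vec))
        else:
            masked.append(list(vec))
    return masked


def moe_block(residual, expert_outputs, p_eom, rng, training=True):
    """Add the (possibly masked) expert output to the residual stream."""
    masked = expert_output_masking(expert_outputs, p_eom, rng, training)
    return [
        [r + e for r, e in zip(res_vec, exp_vec)]
        for res_vec, exp_vec in zip(residual, masked)
    ]
```

A masked token passes through the block on its residual path only, so the network cannot rely on the experts for every token.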
Implementation Details
The model's checkpoints take approximately 350GB of storage. It runs through the transformers library and selects the target language by forcing the first generated token (the forced BOS, or Beginning of Sequence, token) to be the target language's code. Language codes are BCP-47-style identifiers that pair a language with a script, for example fra_Latn for French in Latin script. The model covers both low- and high-resource languages.
- Uses Expert Output Masking as a regularizer during training
- Supports multiple script systems (Latin, Cyrillic, Arabic, etc.)
- Benefits from the accelerate library when loading the model on machines with limited RAM
- Uses specialized tokenization for multilingual processing
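The loading and forced-BOS steps described above can be sketched roughly as follows. This is a minimal sketch, assuming the checkpoint is available under the `facebook/nllb-moe-54b` identifier and that the transformers, accelerate, and PyTorch packages are installed. The helper names `load_nllb_moe` and `translate` and the `max_new_tokens` default are illustrative choices, not part of the model's API.

```python
def load_nllb_moe(model_name="facebook/nllb-moe-54b"):
    """Load tokenizer and model; downloads roughly 350GB of weights."""
    # Imported lazily so translate() can be used with any compatible objects.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # device_map="auto" relies on accelerate to place weights across the
    # available devices instead of loading everything into CPU RAM at once.
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name, device_map="auto")
    return tokenizer, model


def translate(texts, tokenizer, model, target_lang="fra_Latn",
              source_lang="eng_Latn", max_new_tokens=128):
    """Translate a batch of strings into target_lang."""
    # Tell the tokenizer which language the input is written in.
    tokenizer.src_lang = source_lang
    inputs = tokenizer(texts, return_tensors="pt", padding=True)
    inputs = inputs.to(model.device)
    # The target language is chosen by forcing the first generated token
    # to be the language-code token.
    target_id = tokenizer.convert_tokens_to_ids(target_lang)
    generated = model.generate(
        **inputs,
        forced_bos_token_id=target_id,
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)


# Example usage (commented out because it triggers the full download):
# tokenizer, model = load_nllb_moe()
# print(translate(["Hello, how are you?"], tokenizer, model, target_lang="fra_Latn"))
```

Keeping `translate` independent of how the model was loaded means the same function can be tried first with a smaller NLLB checkpoint before committing to the 54B weights.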
Core Capabilities
- Direct translation between 196 languages
- Handles both high- and low-resource languages
- Support for multiple writing systems and scripts
- Target-language selection through a forced language-code token
- Expert Output Masking during training for regularization
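Because each language code carries its script, the multi-script coverage listed above can be inspected programmatically. The sketch below assumes codes follow the three-letter language plus four-letter script pattern seen in `eng_Latn`. The helper names are hypothetical, and the sample codes are a handful of illustrative entries rather than the model's full list.

```python
def split_lang_code(code):
    """Split a code such as 'rus_Cyrl' into ('rus', 'Cyrl')."""
    lang, sep, script = code.partition("_")
    # Assumed pattern: three-letter language, underscore, four-letter script.
    if not sep or len(lang) != 3 or len(script) != 4:
        raise ValueError(f"unexpected language code format: {code!r}")
    return lang, script


def group_by_script(codes):
    """Group language codes by their script component, preserving order."""
    groups = {}
    for code in codes:
        _, script = split_lang_code(code)
        groups.setdefault(script, []).append(code)
    return groups


sample_codes = ["eng_Latn", "fra_Latn", "rus_Cyrl", "arb_Arab"]
print(group_by_script(sample_codes))
```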
Frequently Asked Questions
Q: What makes this model unique?
The model combines a Mixture-of-Experts architecture with coverage of 196 languages. Its support for low-resource languages and multiple scripts distinguishes it from translation models that cover fewer languages.
Q: What are the recommended use cases?
The model is ideal for large-scale multilingual translation tasks, especially when dealing with less common language pairs. It's particularly useful for organizations requiring translation capabilities across a wide range of languages, though the CC-BY-NC-4.0 license restricts commercial use.