# XGLM-1.7B
| Property | Value |
|---|---|
| Parameters | 1.7 billion |
| Training Data | 500B tokens across 30 languages |
| License | MIT |
| Paper | Few-shot Learning with Multilingual Language Models |
## What is XGLM-1.7B?
XGLM-1.7B is a multilingual autoregressive language model developed by Meta AI (Facebook AI Research), designed to handle diverse linguistic tasks across 30 languages. The model is trained on a corpus deliberately balanced across high- and low-resource languages, totaling 500 billion sub-tokens.
## Implementation Details
The model uses a decoder-only transformer architecture, is implemented in PyTorch, and supports both zero-shot and few-shot use. Although the training distribution is rebalanced to upsample low-resource languages, English still accounts for the largest share at 32.59%, followed by Russian (6.02%) and Chinese (4.83%).
- Supports 30 languages from diverse language families, including Indo-European, Sino-Tibetan, Japonic, and others
- Implements tokenization through the XGLMTokenizer (a loading sketch follows this list)
- Optimized for both high-resource and low-resource languages
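A minimal loading-and-generation sketch using the Hugging Face `transformers` library; `facebook/xglm-1.7B` is the official Hub checkpoint, while the French prompt and generation settings are purely illustrative:

```python
import torch
from transformers import XGLMTokenizer, XGLMForCausalLM

# Load the tokenizer and model weights from the Hugging Face Hub
tokenizer = XGLMTokenizer.from_pretrained("facebook/xglm-1.7B")
model = XGLMForCausalLM.from_pretrained("facebook/xglm-1.7B")
model.eval()

# The same checkpoint completes text in any of its 30 training languages
prompt = "La capitale de la France est"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=20, do_sample=False)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```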
## Core Capabilities
- Zero-shot cross-lingual transfer learning
- Multilingual text generation and completion
- Natural language understanding across multiple languages
- Few-shot learning for various NLP tasks (a prompting sketch follows this list)
- Support for both high-resource and low-resource languages
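As a concrete illustration of in-context few-shot prompting: the sentiment task, prompt format, and example reviews below are illustrative choices, not an official recipe from the XGLM paper:

```python
import torch
from transformers import XGLMTokenizer, XGLMForCausalLM

tokenizer = XGLMTokenizer.from_pretrained("facebook/xglm-1.7B")
model = XGLMForCausalLM.from_pretrained("facebook/xglm-1.7B")
model.eval()

# Few-shot prompt: labeled demonstrations followed by the query to complete
prompt = (
    "Review: I loved this movie. Sentiment: positive\n"
    "Review: The plot was a complete mess. Sentiment: negative\n"
    "Review: An absolute delight from start to finish. Sentiment:"
)

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=3, do_sample=False)

# Decode only the newly generated tokens, not the prompt
new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True).strip())
```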
## Frequently Asked Questions
Q: What makes this model unique?
XGLM-1.7B stands out for its balanced multilingual training corpus and its coverage of 30 languages in a single model. Unlike many models that focus primarily on high-resource languages, XGLM-1.7B's training data includes low-resource languages such as Quechua and Haitian Creole.
Q: What are the recommended use cases?
The model is particularly well suited to multilingual text generation, cross-lingual transfer, and few-shot learning tasks. Because a single checkpoint covers all 30 training languages, it is a practical choice for international applications and for research in multilingual NLP; a zero-shot scoring sketch follows.
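One common way to apply an autoregressive model zero-shot is to score candidate continuations by their log-likelihood and keep the most probable one, similar in spirit to how the XGLM paper evaluates multiple-choice tasks. The French premise and choices below are invented for illustration, and `sequence_logprob` is a hypothetical helper, not a library function:

```python
import torch
import torch.nn.functional as F
from transformers import XGLMTokenizer, XGLMForCausalLM

tokenizer = XGLMTokenizer.from_pretrained("facebook/xglm-1.7B")
model = XGLMForCausalLM.from_pretrained("facebook/xglm-1.7B")
model.eval()

def sequence_logprob(text: str) -> float:
    """Sum of the log-probabilities the model assigns to each token of `text`."""
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(input_ids).logits
    # Shift: the logits at position i predict the token at position i + 1
    log_probs = F.log_softmax(logits[0, :-1], dim=-1)
    targets = input_ids[0, 1:]
    return log_probs.gather(1, targets.unsqueeze(1)).sum().item()

# Zero-shot multiple choice: keep whichever continuation the model finds likelier
premise = "Il pleut dehors, donc je prends"  # French: "It's raining, so I'm taking"
choices = [" un parapluie.", " des lunettes de soleil."]
scores = [sequence_logprob(premise + choice) for choice in choices]
print(choices[scores.index(max(scores))])  # ideally " un parapluie." (an umbrella)
```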