# BLOOM Language Model
| Property | Value |
| --- | --- |
| Parameter Count | 176B |
| License | RAIL License v1.0 |
| Paper | Link |
| Training Data | 366B tokens |
| Languages | 46 natural + 13 programming |
## What is BLOOM?
BLOOM (BigScience Large Open-science Open-access Multilingual Language Model) is a 176B-parameter language model that marks a significant milestone in open-source AI development. Trained through a collaborative effort led by the BigScience workshop, it supports text generation in 46 natural languages and 13 programming languages.
## Implementation Details
BLOOM uses a decoder-only transformer architecture with several notable features, including layer normalization applied to the word embeddings (StableEmbedding) and ALiBi positional encodings. The model comprises 70 layers with 112 attention heads and a hidden dimension of 14,336.
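ALiBi replaces learned position embeddings with a fixed, per-head linear penalty on query-key distance, added to the attention scores. A minimal sketch in PyTorch, assuming a power-of-two head count for the slope formula (BLOOM's 112 heads use an additional interpolation step not shown here, and the function name is illustrative):

```python
import torch

def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    """Per-head linear attention bias in the style of ALiBi.

    Each head h gets a fixed slope m_h; the score for query i attending
    to key j receives m_h * (j - i), penalizing distant past positions.
    """
    # Geometric sequence of slopes: 2^(-8/n), 2^(-16/n), ..., 2^(-8).
    start = 2.0 ** (-8.0 / num_heads)
    slopes = torch.tensor([start ** (h + 1) for h in range(num_heads)])
    # Signed distance j - i between each key and query position.
    pos = torch.arange(seq_len)
    distance = pos[None, :] - pos[:, None]          # shape (seq, seq)
    # Bias of shape (heads, seq, seq), added to attention logits
    # before the causal mask and softmax.
    return slopes[:, None, None] * distance[None, :, :]

# Example: 8 heads, 5 positions; entries grow more negative as the
# key position falls further behind the query position.
print(alibi_bias(8, 5)[0])
```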
- Training Infrastructure: 384 A100 80GB GPUs
- Training Location: Jean Zay Supercomputer, France
- Framework: Megatron-DeepSpeed with PyTorch
- Tokenizer: byte-level BPE with a 250,680-token vocabulary (see the sketch after this list)
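A minimal sketch of the tokenizer in use, assuming the Hugging Face `transformers` library and the public `bigscience/bloom` Hub checkpoint (the example string is illustrative):

```python
from transformers import AutoTokenizer

# Load BLOOM's byte-level BPE tokenizer from the public Hub checkpoint.
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom")

text = "BLOOM was trained on the Jean Zay supercomputer."
ids = tokenizer(text).input_ids
print(tokenizer.vocab_size)                   # expected: 250680
print(tokenizer.convert_ids_to_tokens(ids))   # the byte-level BPE pieces
```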
## Core Capabilities
- Text generation in 46 languages
- Code generation in 13 programming languages
- Zero-shot task completion (see the example after this list)
- Text completion and transformation
- Language understanding and generation
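A minimal zero-shot generation sketch, assuming the Hugging Face `transformers` library; the smaller `bigscience/bloom-560m` sibling checkpoint stands in for the full model here, since 176B-parameter inference requires hundreds of GB of GPU memory:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# bloom-560m is a small BLOOM variant used purely for illustration.
name = "bigscience/bloom-560m"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

# Zero-shot prompting: the task is stated entirely in the prompt,
# with no task-specific fine-tuning.
prompt = "Translate English to French: The weather is nice today.\nFrench:"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```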
## Frequently Asked Questions
### Q: What makes this model unique?
BLOOM is the first large language model of its size (176B parameters) that is openly available for research purposes. Its multilingual capabilities and open-science approach to development set it apart from other large language models.
### Q: What are the recommended use cases?
BLOOM is designed for research purposes, text generation, and as a base model for fine-tuning. It's suitable for tasks like information extraction, question answering, and summarization, though it should not be used for high-stakes decisions or critical applications.