BLOOM-7B1 Language Model
Property | Value |
---|---|
Parameters | 7.07B |
License | BigScience RAIL License v1.0 |
Languages | 46 natural languages and 13 programming languages |
Training Data | 1.5TB of text |
Architecture | Decoder-only Transformer |
What is BLOOM-7B1?
BLOOM-7B1 is a multilingual language model developed by the BigScience research workshop as part of its open-science effort. With 7.07 billion parameters, it aims to broaden access to large language models while covering 46 natural languages, including many low-resource African and Indic languages.
Implementation Details
The model uses a modified Megatron-LM GPT-2 architecture. It has 30 layers with 32 attention heads, a hidden size of 4096, and a sequence length of 2048 tokens, and it uses ALiBi positional encodings and GeLU activation functions (these values can be verified from the published checkpoint configuration, as sketched after the list below).
- Trained on the Jean Zay supercomputer using 384 A100 80GB GPUs
- Uses a byte-level BPE tokenizer with 250,680 vocabulary size
- Implements stable embeddings with layer normalization
- Optimized using cross-entropy loss with mean reduction
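As a quick sanity check, most of these hyperparameters can be read directly from the published checkpoint. The sketch below assumes the Hugging Face transformers library and the bigscience/bloom-7b1 checkpoint name; only the configuration and tokenizer files are downloaded, not the weights.

```python
from transformers import AutoConfig, AutoTokenizer

# Fetch only the configuration and tokenizer files (no model weights).
config = AutoConfig.from_pretrained("bigscience/bloom-7b1")
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-7b1")

print(config.n_layer)          # expected: 30 transformer blocks
print(config.n_head)           # expected: 32 attention heads
print(config.hidden_size)      # expected: 4096-dimensional hidden states
print(tokenizer.vocab_size)    # expected: 250680 byte-level BPE entries
```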
Core Capabilities
- Multilingual text generation across 46 natural languages
- Code generation in 13 programming languages
- Natural language understanding and generation
- Transfer learning base for fine-tuning
- Research and exploration of language model behaviors
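For illustration, a minimal text-generation example follows, again assuming the transformers library (and accelerate for device_map="auto"); the prompt and sampling settings are arbitrary choices, not recommendations.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-7b1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# float16 keeps the roughly 14 GB of weights within a single modern GPU.
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

# A French prompt exercises the multilingual coverage.
prompt = "La capitale de la France est"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```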
Frequently Asked Questions
Q: What makes this model unique?
BLOOM-7B1 stands out for its broad language coverage, including many low-resource languages, and for its open-science development process. It was trained on the Jean Zay supercomputer, whose largely nuclear-powered electricity keeps the training carbon footprint comparatively low, and its weights are freely available for research.
Q: What are the recommended use cases?
The model is best suited for research, text generation tasks, and use as a base model for fine-tuning (a minimal adapter-based setup is sketched below). Its RAIL license permits use subject to behavioral restrictions, and the model should not be relied on for high-stakes decisions or critical applications.
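As one possible fine-tuning setup (illustrative only, not part of the original release), the sketch below attaches LoRA adapters with the peft library. The target module name query_key_value matches BLOOM's fused attention projection in the transformers implementation, but the rank and dropout values are placeholders rather than tuned hyperparameters.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("bigscience/bloom-7b1")

# Placeholder LoRA hyperparameters; tune these for the target task.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["query_key_value"],  # BLOOM's fused QKV projection
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Training can then proceed with any standard causal-language-modeling loop, for example the transformers Trainer, while the frozen base weights stay unchanged.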