BLOOM Intermediate Model
| Property | Value |
|---|---|
| Parameters | 176 billion |
| License | BigScience RAIL License v1.0 |
| Developer | BigScience |
| Languages | 46 natural languages + 13 programming languages |
What is bloom-intermediate?
BLOOM-intermediate collects the intermediate training checkpoints of the 176-billion-parameter BLOOM language model, giving researchers and developers access to the model at various stages of training. The checkpoints, spanning training steps 5000 to 93000, offer a direct view of the model's progression over the course of training.
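As a minimal loading sketch, assuming the checkpoints are published as git revisions on the Hugging Face Hub (the revision name `global_step5000` below is illustrative, not confirmed from the repo):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository and revision names are illustrative; check the Hub repo's
# branches/tags for the actual checkpoint naming scheme.
repo = "bigscience/bloom-intermediate"
revision = "global_step5000"  # hypothetical checkpoint tag

tokenizer = AutoTokenizer.from_pretrained(repo, revision=revision)
# The full 176B model needs hundreds of GB of memory; device_map="auto"
# (via the accelerate library) shards it across available devices.
model = AutoModelForCausalLM.from_pretrained(
    repo,
    revision=revision,
    device_map="auto",
    torch_dtype="auto",
)
```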
Implementation Details
The model uses a decoder-only Transformer architecture with 70 layers and 112 attention heads, and incorporates several advanced techniques, including ALiBi positional encodings and StableEmbedding layer normalization. Training was conducted on the Jean Zay supercomputer using 384 NVIDIA A100 80GB GPUs. Key hyperparameters (a quick parameter-count check follows the list):
- Hidden layer dimension: 14336
- Sequence length: 2048 tokens
- Vocabulary size: 250,680 tokens
- Training infrastructure: Megatron-DeepSpeed with PyTorch
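These figures are enough to sanity-check the headline parameter count. A back-of-the-envelope sketch, ignoring bias and LayerNorm terms (which contribute well under 0.1% of the total):

```python
# Rough parameter count from the hyperparameters above.
hidden = 14336
layers = 70
vocab = 250_680  # tokenizer vocabulary (the training config may pad this)

embedding = vocab * hidden                 # input embeddings (tied with the output head)
per_layer = 4 * hidden**2 + 8 * hidden**2  # attention (QKV + output proj) + 4x-wide MLP
total = embedding + layers * per_layer

print(f"{total / 1e9:.1f}B parameters")    # ~176.2B, matching the stated 176B
```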
Core Capabilities
- Multilingual text generation across 46 natural languages (see the usage sketch after this list)
- Code generation in 13 programming languages
- Base model for fine-tuning on specific tasks
- Research-oriented applications in NLP
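An illustrative generation sketch follows; the prompts and settings are arbitrary, and since running the full 176B model locally is impractical, a smaller BLOOM variant is a sensible stand-in for testing:

```python
from transformers import pipeline

# Illustrative only: the full model needs hundreds of GB of GPU memory,
# so a smaller variant such as "bigscience/bloom-560m" works for local tests.
generator = pipeline(
    "text-generation",
    model="bigscience/bloom-intermediate",
    device_map="auto",
)

# Multilingual generation: the same model handles prompts in different languages.
for prompt in ["La capitale de la France est", "The capital of France is"]:
    out = generator(prompt, max_new_tokens=20, do_sample=False)
    print(out[0]["generated_text"])
```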
Frequently Asked Questions
Q: What makes this model unique?
BLOOM-intermediate is unique in providing access to training checkpoints of one of the largest open-source multilingual models, allowing researchers to study model evolution during training. It's specifically designed for public research and non-commercial applications.
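For instance, a researcher might track how the model's loss on a fixed probe sentence changes across checkpoints. A minimal sketch, assuming the checkpoints are exposed as Hub revisions named like `global_step<N>` (a hypothetical naming scheme) and that each load fits in the available hardware:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "bigscience/bloom-intermediate"
probe = "The quick brown fox jumps over the lazy dog."
steps = [5000, 50000, 93000]  # illustrative subset of the checkpoint range

# The tokenizer is fixed across training, so one load suffices.
tokenizer = AutoTokenizer.from_pretrained(repo)
inputs = tokenizer(probe, return_tensors="pt")

for step in steps:
    # Revision naming is an assumption; check the repo's branches/tags.
    model = AutoModelForCausalLM.from_pretrained(
        repo, revision=f"global_step{step}", device_map="auto", torch_dtype="auto"
    )
    with torch.no_grad():
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    print(f"step {step}: loss {loss.item():.3f}")
    del model  # free memory before loading the next checkpoint
```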
Q: What are the recommended use cases?
The model is recommended for research purposes, including studying model training dynamics, exploring language model behavior, and developing downstream applications in areas like information extraction and question answering. However, it should not be used for high-stakes decisions or critical applications.