TinyLlama v1.1
| Property | Value |
|---|---|
| Parameters | 1.1B |
| Training Tokens | 2T |
| License | Apache 2.0 |
| Paper | arXiv:2401.02385 |
| Architecture | Llama 2 |
What is TinyLlama_v1.1?
TinyLlama_v1.1 is a compact yet capable language model that adopts the Llama 2 architecture and tokenizer while keeping a small 1.1B-parameter footprint. It is trained on the SlimPajama dataset through a three-stage training process, making it suitable for applications with restricted computational resources.
Implementation Details
The model underwent three distinct training phases: basic pretraining on SlimPajama (1.5T tokens), continual pretraining focused on specific domains, and a final cooldown phase in which the batch size was increased from 1.8M to 7.2M tokens. Training was conducted on nodes of 4 A100-40G GPUs each, with model weights sharded across devices.
- Same architecture and tokenizer as Llama 2
- Trained on 2T tokens total
- Optimized for resource-efficient deployment
- Available in three specialized variants (standard, Math&Code, Chinese); see the loading sketch below
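Because the model reuses the Llama 2 architecture and tokenizer, it can be loaded through the generic `transformers` Auto classes. The sketch below is a minimal example; the variant repo ids are assumed from the naming on the Hugging Face Hub and should be verified before use.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub repo ids for the three variants; verify against the Hub.
VARIANTS = {
    "standard": "TinyLlama/TinyLlama_v1.1",
    "math_code": "TinyLlama/TinyLlama_v1.1_math_code",
    "chinese": "TinyLlama/TinyLlama_v1.1_chinese",
}

repo_id = VARIANTS["standard"]

# The Auto* classes resolve to the Llama implementations,
# since TinyLlama shares the Llama 2 architecture and tokenizer.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)
```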
Core Capabilities
- Strong performance on benchmark tasks (HellaSwag: 61.47%, PIQA: 73.56%)
- Efficient text generation and processing
- Plug-and-play compatibility with Llama-based projects (see the example after this list)
- Balanced performance across various reasoning tasks
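To illustrate the plug-and-play point, here is a minimal generation sketch using the `transformers` pipeline API, assuming the standard variant repo id from the previous section; the prompt and sampling settings are illustrative only.

```python
from transformers import pipeline

# Standard text-generation pipeline; no TinyLlama-specific setup is needed.
generator = pipeline("text-generation", model="TinyLlama/TinyLlama_v1.1")

output = generator(
    "TinyLlama is a 1.1B-parameter language model that",
    max_new_tokens=64,
    do_sample=True,
    temperature=0.7,
)
print(output[0]["generated_text"])
```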
Frequently Asked Questions
Q: What makes this model unique?
TinyLlama combines a compact 1.1B-parameter footprint with strong benchmark performance, and its Llama 2 architecture keeps it drop-in compatible with the wider Llama ecosystem, making it well suited to resource-constrained environments.
Q: What are the recommended use cases?
The model is well suited to general text generation tasks, particularly in scenarios where computational resources are limited and a balance between output quality and efficiency matters.
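For such resource-constrained deployments, a minimal low-memory loading sketch is shown below. It assumes the standard variant repo id and that half precision is acceptable for the target application; `device_map="auto"` additionally requires the `accelerate` package.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "TinyLlama/TinyLlama_v1.1"  # assumed repo id

# Half precision keeps the 1.1B-parameter weights at roughly 2.2 GB;
# device_map="auto" places them on an available GPU, falling back to CPU.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

inputs = tokenizer("Efficient deployment example:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```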