TinyLlama v1.1
| Property | Value |
|---|---|
| Parameters | 1.1B |
| Training Tokens | 2T |
| License | Apache 2.0 |
| Paper | arXiv:2401.02385 |
| Architecture | Llama 2 |
What is TinyLlama_v1.1?
TinyLlama_v1.1 is a compact yet capable language model that adopts the Llama 2 architecture and tokenizer while keeping a small 1.1B-parameter footprint. It is trained on the SlimPajama dataset through a three-stage training process, making it suitable for applications with restricted computational resources.
Implementation Details
The model underwent three distinct training phases: basic pretraining on SlimPajama (1.5T tokens), continual pretraining focused on specific domains, and a final cooldown phase in which the batch size was increased from 1.8M to 7.2M tokens. Training was conducted on nodes of 4 A100-40G GPUs each, with model weights sharded across devices.
- Same architecture and tokenizer as Llama 2
- Trained on 2T tokens total
- Optimized for resource-efficient deployment
- Available in three specialized variants (standard, Math&Code, Chinese); see the loading sketch below
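Because the model reuses the Llama 2 architecture and tokenizer, it can be loaded through the generic `transformers` Auto classes. The sketch below is a minimal example; the variant repo ids are assumed from the naming on the Hugging Face Hub and should be verified before use.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub repo ids for the three variants; verify against the Hub.
VARIANTS = {
    "standard": "TinyLlama/TinyLlama_v1.1",
    "math_code": "TinyLlama/TinyLlama_v1.1_math_code",
    "chinese": "TinyLlama/TinyLlama_v1.1_chinese",
}

repo_id = VARIANTS["standard"]

# The Auto* classes resolve to the Llama implementations,
# since TinyLlama shares the Llama 2 architecture and tokenizer.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)
```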
Core Capabilities
- Strong performance on benchmark tasks (HellaSwag: 61.47%, PIQA: 73.56%)
- Efficient text generation and processing
- Plug-and-play compatibility with Llama-based projects (see the example after this list)
- Balanced performance across various reasoning tasks
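To illustrate the plug-and-play point, here is a minimal generation sketch using the `transformers` pipeline API, assuming the standard variant repo id from the previous section; the prompt and sampling settings are illustrative only.

```python
from transformers import pipeline

# Standard text-generation pipeline; no TinyLlama-specific setup is needed.
generator = pipeline("text-generation", model="TinyLlama/TinyLlama_v1.1")

output = generator(
    "TinyLlama is a 1.1B-parameter language model that",
    max_new_tokens=64,
    do_sample=True,
    temperature=0.7,
)
print(output[0]["generated_text"])
```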
Frequently Asked Questions
Q: What makes this model unique?
TinyLlama combines a compact 1.1B-parameter footprint with strong benchmark performance, and its Llama 2 architecture keeps it drop-in compatible with the wider Llama ecosystem, making it well suited to resource-constrained environments.
Q: What are the recommended use cases?
The model is well suited to general text generation tasks, particularly in scenarios where computational resources are limited and a balance between output quality and efficiency matters.
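For such resource-constrained deployments, a minimal low-memory loading sketch is shown below. It assumes the standard variant repo id and that half precision is acceptable for the target application; `device_map="auto"` additionally requires the `accelerate` package.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "TinyLlama/TinyLlama_v1.1"  # assumed repo id

# Half precision keeps the 1.1B-parameter weights at roughly 2.2 GB;
# device_map="auto" places them on an available GPU, falling back to CPU.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

inputs = tokenizer("Efficient deployment example:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```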