Zamba2-1.2B

Maintained By
Zyphra

Parameter Count: 1.22B
Model Type: Hybrid SSM-Transformer
License: Apache 2.0
Training Data: 3T tokens + 100B high-quality tokens
Paper: Zamba Architecture Paper

What is Zamba2-1.2B?

Zamba2-1.2B is a hybrid language model that combines Mamba2 state-space blocks with shared transformer attention layers. This hybrid design delivers strong performance with low computational and memory overhead. The model was pretrained on 3 trillion tokens of text and code, followed by fine-tuning on 100 billion high-quality tokens.

Implementation Details

The model employs a sophisticated architecture combining Mamba2 blocks with shared transformer layers, enhanced by several key innovations:

  • Integration of Mamba2 blocks in a hybrid architecture
  • LoRA projectors for depth-specialized transformer layers
  • Rotary position embeddings in shared attention layers
  • Mistral v0.1 tokenizer implementation
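The shared-layer-with-LoRA idea in the list above can be sketched in a few lines: a single full-size weight matrix is reused at every depth, and each depth contributes only a thin low-rank (LoRA) projector that specializes it. All names and dimensions below are illustrative toy values, not Zamba2's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, rank, num_depths = 64, 4, 3

# One full-size weight matrix, shared by every invocation of the layer.
W_shared = rng.standard_normal((d_model, d_model))

# Per-depth LoRA projectors: each depth stores only two thin matrices.
loras = [(rng.standard_normal((d_model, rank)) * 0.01,
          rng.standard_normal((rank, d_model)) * 0.01)
         for _ in range(num_depths)]

def shared_layer(x, depth):
    """Apply the shared weight plus this depth's low-rank correction."""
    A, B = loras[depth]
    return x @ W_shared + (x @ A) @ B

# Parameter comparison: shared weight + LoRA vs. a separate full matrix per depth.
shared_params = W_shared.size + sum(A.size + B.size for A, B in loras)
separate_params = num_depths * d_model * d_model
```

With these toy sizes the shared scheme stores 5,632 parameters instead of 12,288, and the gap widens as depth grows, which is the motivation for depth-specializing one shared block rather than stacking distinct transformer layers.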

Core Capabilities

  • State-of-the-art performance among models under 2B parameters
  • Extremely low inference latency and rapid generation
  • Significantly smaller memory footprint compared to traditional transformers
  • Efficient on-device deployment capabilities
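The memory-footprint advantage comes largely from the SSM side: a transformer's KV cache grows linearly with sequence length, while a Mamba-style recurrent state is fixed-size. A back-of-the-envelope comparison (hypothetical fp16 sizes, not Zamba2's real configuration):

```python
def kv_cache_bytes(seq_len, n_layers, n_heads, head_dim, bytes_per_val=2):
    # Keys and values: 2 tensors per layer, each seq_len x n_heads x head_dim.
    return 2 * n_layers * seq_len * n_heads * head_dim * bytes_per_val

def ssm_state_bytes(n_layers, d_model, state_dim, bytes_per_val=2):
    # Fixed-size recurrent state per layer, independent of sequence length.
    return n_layers * d_model * state_dim * bytes_per_val

# Illustrative small-model numbers.
kv_4k = kv_cache_bytes(4096, n_layers=24, n_heads=16, head_dim=64)
ssm = ssm_state_bytes(n_layers=24, d_model=1024, state_dim=16)
```

Doubling the context doubles the KV cache but leaves the SSM state unchanged, which is why hybrid models like this one are attractive for memory-constrained, on-device deployment.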

Frequently Asked Questions

Q: What makes this model unique?

The model's hybrid architecture combining Mamba2 blocks with transformer layers, along with its innovative use of LoRA projectors and rotary position embeddings, enables exceptional performance while maintaining efficiency in both computation and memory usage.
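The rotary position embeddings mentioned above rotate each pair of query/key dimensions by an angle proportional to the token's position, so that relative position falls out of the attention dot product. A minimal sketch of standard RoPE (generic, not Zamba2-specific code):

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate consecutive dimension pairs of `vec` by position-dependent angles."""
    out = list(vec)
    for i in range(len(vec) // 2):
        theta = pos / (base ** (2 * i / len(vec)))
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[2 * i], vec[2 * i + 1]
        out[2 * i] = x * c - y * s
        out[2 * i + 1] = x * s + y * c
    return out
```

Because each pair is rotated by a pure rotation, vector norms are preserved, and the dot product between a rotated query at position m and a rotated key at position n depends only on the offset n - m.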

Q: What are the recommended use cases?

Zamba2-1.2B is well suited to general-purpose text generation, particularly in scenarios requiring on-device deployment or where computational resources are limited. Note, however, that this base model is not fine-tuned for instruction following or chat applications.
