# Hymba-1.5B-Instruct
| Property | Value |
|---|---|
| Parameter Count | 1.52B |
| Model Type | Instruction-tuned Language Model |
| License | NVIDIA Open Model License |
| Paper | arXiv:2411.13676 |
| Tensor Type | BF16 |
## What is Hymba-1.5B-Instruct?
Hymba-1.5B-Instruct is a language model developed by NVIDIA around a hybrid architecture. Built on the Hymba-1.5B-Base model, it combines standard attention with State Space Model (SSM) heads running in parallel within the same layer, producing an efficient architecture tuned for instruction-following tasks.
## Implementation Details
The model uses 1600 embedding dimensions, 25 attention heads, and 32 layers in total. Its hybrid design includes 16 SSM states and 3 full-attention layers, with the remaining layers using sliding-window attention. The architecture also employs Grouped-Query Attention (GQA) and Rotary Position Embeddings (RoPE). Key architectural features include:
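To illustrate how full attention, sliding-window attention, and globally visible meta tokens differ in which positions a query may attend to, here is a minimal NumPy sketch. This is not NVIDIA's implementation; the window size and meta-token count below are illustrative values, not Hymba's actual configuration.

```python
import numpy as np

def sliding_window_mask(seq_len, window, num_meta):
    """Build a boolean attention mask (True = query may attend to key).

    Positions 0 .. num_meta-1 stand in for prepended meta tokens: every
    query may attend to them regardless of distance. All other positions
    use causal sliding-window attention of width `window`.
    """
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    for q in range(seq_len):
        mask[q, :num_meta] = True              # meta tokens: always visible
        lo = max(num_meta, q - window + 1)     # start of the local window
        mask[q, lo:q + 1] = True               # causal window of width `window`
    return mask

# Toy sizes for illustration only.
mask = sliding_window_mask(seq_len=8, window=3, num_meta=2)
```

A full-attention layer corresponds to setting `window=seq_len`; the point of the sliding window is that distant (non-meta) keys fall outside the mask, keeping per-token cost constant.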
- Parallel processing through fused attention and SSM heads
- Meta tokens for efficient information storage
- Cross-layer KV sharing for improved memory efficiency
- Global-local attention mechanisms
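The memory benefit of cross-layer KV sharing can be sketched with simple arithmetic: when groups of adjacent layers share one K/V cache, the number of distinct caches shrinks by the group size. The function below is a back-of-the-envelope estimate, and the example numbers (5 KV heads, head dimension 64, group size 2) are hypothetical, not Hymba's published configuration.

```python
def kv_cache_bytes(num_layers, kv_heads, head_dim, seq_len,
                   layers_per_group=1, bytes_per_value=2):
    """Rough KV-cache footprint in bytes.

    With cross-layer KV sharing, each group of `layers_per_group`
    consecutive layers stores a single K/V cache, so the number of
    distinct caches shrinks by that factor. The factor 2 covers K and V;
    bytes_per_value=2 assumes BF16 storage.
    """
    distinct_caches = num_layers // layers_per_group
    return 2 * distinct_caches * kv_heads * head_dim * seq_len * bytes_per_value

# Illustrative numbers only (32 layers as in the card; the rest assumed).
baseline = kv_cache_bytes(32, kv_heads=5, head_dim=64, seq_len=4096)
shared = kv_cache_bytes(32, kv_heads=5, head_dim=64, seq_len=4096,
                        layers_per_group=2)
```

Pairing layers (`layers_per_group=2`) halves the cache in this model; combining sharing with sliding-window attention (which bounds `seq_len` per local layer) compounds the savings.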
## Core Capabilities
- Advanced math reasoning abilities
- Function calling support
- Role-playing capabilities
- Efficient processing with sliding window attention
- Commercial-ready deployment options
## Frequently Asked Questions
Q: What makes this model unique?
The model's hybrid architecture combining attention heads and SSM heads within the same layer sets it apart, offering parallel and complementary processing capabilities that enhance its performance across various tasks.
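The parallel-heads idea can be sketched as a toy: the same input feeds an attention branch and a diagonal linear SSM branch, and their outputs are fused. This is a conceptual sketch only; the shapes, the diagonal recurrence, and the simple averaging fusion are stand-ins, not Hymba's actual head design.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_branch(x, wq, wk, wv):
    """Single-head causal self-attention over a (seq_len, dim) input."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    causal = np.tril(np.ones_like(scores, dtype=bool))
    scores = np.where(causal, scores, -np.inf)     # mask future positions
    return softmax(scores) @ v

def ssm_branch(x, a, wb, wc):
    """Toy diagonal SSM: h_t = a * h_{t-1} + x_t B,  y_t = h_t C."""
    h = np.zeros_like(a)
    ys = []
    for t in range(x.shape[0]):
        h = a * h + x[t] @ wb                      # per-channel linear recurrence
        ys.append(h @ wc)
    return np.stack(ys)

d, n = 8, 4                                        # toy model dim and SSM state size
x = rng.normal(size=(6, d))
wq, wk, wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
a = rng.uniform(0.5, 0.9, size=n)                  # decaying state per channel
wb = rng.normal(size=(d, n)) * 0.1
wc = rng.normal(size=(n, d)) * 0.1

# Both branches see the same input in parallel; naive averaging as fusion.
y = 0.5 * (attention_branch(x, wq, wk, wv) + ssm_branch(x, a, wb, wc))
```

The intuition the sketch conveys: the attention branch gives precise token-to-token recall, while the SSM branch summarizes the history into a fixed-size state, and fusing them lets each compensate for the other's weakness.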
Q: What are the recommended use cases?
Hymba-1.5B-Instruct is particularly well-suited for applications requiring mathematical reasoning, function calling, and interactive conversations. It's designed for commercial use and can be effectively deployed in production environments.