Hymba-1.5B-Instruct

Maintained By
nvidia

Hymba-1.5B-Instruct

PropertyValue
Parameter Count1.52B
Model TypeInstruction-tuned Language Model
LicenseNVIDIA Open Model License
PaperarXiv:2411.13676
Tensor TypeBF16

What is Hymba-1.5B-Instruct?

Hymba-1.5B-Instruct is an advanced language model developed by NVIDIA that represents a significant innovation in hybrid architecture design. Built upon the Hymba-1.5B-Base model, it combines traditional attention mechanisms with State Space Model (SSM) heads to create a unique and efficient architecture optimized for instruction-following tasks.

Implementation Details

The model features a sophisticated architecture with 1600 embedding dimensions, 25 attention heads, and 32 layers in total. Its unique hybrid design includes 16 SSM states and 3 full attention layers, with the remaining layers implementing sliding window attention. The architecture employs Grouped-Query Attention (GQA) and Rotary Position Embeddings (RoPE) for enhanced performance.

  • Parallel processing through fused attention and SSM heads
  • Meta tokens for efficient information storage
  • Cross-layer KV sharing for improved memory efficiency
  • Global-local attention mechanisms

Core Capabilities

  • Advanced math reasoning abilities
  • Function calling support
  • Role-playing capabilities
  • Efficient processing with sliding window attention
  • Commercial-ready deployment options

Frequently Asked Questions

Q: What makes this model unique?

The model's hybrid architecture combining attention heads and SSM heads within the same layer sets it apart, offering parallel and complementary processing capabilities that enhance its performance across various tasks.

Q: What are the recommended use cases?

Hymba-1.5B-Instruct is particularly well-suited for applications requiring mathematical reasoning, function calling, and interactive conversations. It's designed for commercial use and can be effectively deployed in production environments.

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.