AI21-Jamba-Mini-1.5

Maintained by: ai21labs

Property          Value
Parameters        12B active / 52B total
Context Length    256K tokens
Architecture      Hybrid SSM-Transformer (Jamba)
License           Jamba Open Model License
Knowledge Cutoff  March 5, 2024

What is AI21-Jamba-Mini-1.5?

AI21-Jamba-Mini-1.5 is a hybrid SSM-Transformer model in the Jamba 1.5 family. It uses a mixture-of-experts design, so only 12B of its 52B parameters are active for any given token. AI21 reports up to 2.5x faster inference than comparable models while maintaining output quality across a range of tasks.

Implementation Details

The architecture integrates non-Transformer (SSM) components with Transformer layers at scale and supports a 256K-token context length. Deployment options range from full precision across multiple GPUs to quantized weights that fit on a single GPU.

  • Can be served with vLLM or loaded with the Hugging Face transformers library
  • ExpertsInt8 quantization enables running on a single 80GB GPU
  • Optimized for business use cases with function calling and structured output capabilities
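
The single-GPU claim can be sanity-checked with rough arithmetic. The sketch below estimates weight memory only, from the 52B total parameter count, and then shows how a quantized vLLM load might be configured. The repository ID is inferred from the maintainer and model name on this page, the `max_model_len` value is illustrative, and the helper names `weight_memory_gb`, `vllm_kwargs`, and `load_llm` are hypothetical. Treating every weight as int8 is an optimistic simplification, since ExpertsInt8 is understood to quantize only the expert layers.

```python
TOTAL_PARAMS = 52e9   # total parameters, from the property table
GPU_MEMORY_GB = 80    # single-GPU size named in the list above


def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Ignores KV cache, SSM state, activations, and runtime overhead,
    so real usage will be higher.
    """
    return n_params * bytes_per_param / 1e9


bf16_gb = weight_memory_gb(TOTAL_PARAMS, 2)  # 2 bytes per weight
int8_gb = weight_memory_gb(TOTAL_PARAMS, 1)  # optimistic: all weights as int8


def vllm_kwargs(max_model_len: int = 100 * 1024) -> dict:
    """Keyword arguments for vllm.LLM; all values here are assumptions."""
    return {
        "model": "ai21labs/AI21-Jamba-Mini-1.5",  # assumed repository ID
        "quantization": "experts_int8",           # vLLM's name for ExpertsInt8
        "max_model_len": max_model_len,           # illustrative context cap
    }


def load_llm():
    """Build the engine. Requires vLLM and a suitable GPU, so it is not called here."""
    from vllm import LLM  # imported lazily so the arithmetic runs without vLLM
    return LLM(**vllm_kwargs())


print(f"bf16 weights: {bf16_gb:.0f} GB, fits on one GPU: {bf16_gb < GPU_MEMORY_GB}")
print(f"int8 weights: {int8_gb:.0f} GB, fits on one GPU: {int8_gb < GPU_MEMORY_GB}")
```

Under these assumptions the bf16 weights alone exceed 80 GB, while the int8 estimate leaves headroom for the cache and activations. The usable context length on a single GPU still depends on that headroom.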

Core Capabilities

  • Multilingual support for 9 languages including English, Spanish, French, and Arabic
  • Reported benchmark scores include MMLU (69.7%) and GSM-8K (75.8%)
  • Tool use and grounded generation support
  • JSON mode for structured output generation
  • Fine-tuning support through LoRA and QLoRA
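
To make the tool-use and JSON-output capabilities concrete, here is a minimal sketch of the application side: a tool described in the JSON-schema style commonly passed to chat templates, and a dispatcher that parses a JSON tool call and runs the matching function. The `get_weather` tool, the `{"name": ..., "arguments": ...}` call shape, and the sample reply are all hypothetical. The exact format the model emits should be taken from AI21's documentation.

```python
import json

# Hypothetical tool definition in the JSON-schema style used by many chat templates.
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def get_weather(city: str) -> str:
    # Placeholder body; a real tool would query a weather service.
    return f"No live data for {city} in this sketch."


# Registry mapping tool names to Python callables.
TOOLS = {"get_weather": get_weather}


def dispatch_tool_call(raw: str) -> str:
    """Parse a JSON tool call of the assumed form
    {"name": ..., "arguments": {...}} and invoke the named tool."""
    call = json.loads(raw)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])


# Hand-written stand-in for a model reply, not actual model output.
sample_reply = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
result = dispatch_tool_call(sample_reply)
print(result)
```

Validating the tool name before calling anything keeps a malformed or unexpected reply from reaching application code. The tool's return value would then be sent back to the model as a follow-up message.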

Frequently Asked Questions

Q: What makes this model unique?

The model combines the traditional Transformer architecture with State Space Model (SSM) components, a design intended to improve long-context handling and inference speed while maintaining output quality. It is optimized for business applications and supports context lengths of up to 256K tokens.

Q: What are the recommended use cases?

The model excels in business applications requiring structured output, function calling, and long-context understanding. It's particularly well-suited for RAG applications, multilingual tasks, and scenarios requiring tool use or JSON output formatting.
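
For RAG over a long context, one practical concern is how many retrieved passages fit in the window. The sketch below greedily packs passages, assumed to be sorted by relevance, into a token budget. It reads "256K" as 256 * 1024 tokens, reserves an arbitrary amount for the answer, and uses a crude four-characters-per-token heuristic. All of these are assumptions, and a real pipeline would count tokens with the model's own tokenizer.

```python
CONTEXT_WINDOW = 256 * 1024   # "256K tokens", read here as 256 * 1024 (assumption)
RESERVED_FOR_OUTPUT = 4096    # hypothetical space kept free for the answer


def approx_tokens(text: str) -> int:
    """Crude estimate of about 4 characters per token; not the real tokenizer."""
    return max(1, len(text) // 4)


def pack_passages(passages, question, budget=CONTEXT_WINDOW - RESERVED_FOR_OUTPUT):
    """Keep passages in order until the next one would exceed the budget.

    Returns the kept passages and the estimated tokens used,
    including the question itself.
    """
    used = approx_tokens(question)
    kept = []
    for passage in passages:
        cost = approx_tokens(passage)
        if used + cost > budget:
            break
        kept.append(passage)
        used += cost
    return kept, used


# Tiny demonstration with a deliberately small budget.
demo_passages = ["a" * 400, "b" * 400, "c" * 400]
kept, used = pack_passages(demo_passages, "q" * 40, budget=250)
print(len(kept), used)
```

Stopping at the first passage that does not fit preserves relevance order. Skipping oversized passages and continuing would be an equally reasonable policy.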
