DeciLM-6b

Maintained By
Deci

Property          Value
Parameter Count   5.7 Billion
Model Type        Decoder-only Language Model
Architecture      Transformer with Variable GQA
Context Length    4096 tokens
License           Llama 2 Community License
Training Data     SlimPajama dataset

What is DeciLM-6b?

DeciLM-6b is a 5.7B-parameter decoder-only language model developed by Deci AI. Its defining feature is variable Grouped-Query Attention (GQA): the number of key/value heads differs from layer to layer, and the per-layer configuration was selected by AutoNAC, Deci's proprietary Neural Architecture Search technology. The model posts competitive benchmark results (for example, 74.58% on HellaSwag and 77.09% on PIQA) and is reported to reach up to 15x the throughput of Llama 2 7B.

Implementation Details

The architecture has 32 layers, 32 attention heads, and a hidden size of 4096 (128 dimensions per head). It uses Rotary Position Embeddings with Dynamic NTK scaling, together with variable GQA whose grouping is tuned per layer. Reported throughput reaches up to 2,029.6 tokens/sec on an NVIDIA A10 GPU when served with Infery LLM.

  • Variable Grouped-Query Attention for computational efficiency
  • 4096 token context window
  • BF16 precision support
  • Optimized for both research and commercial applications
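
One reason variable GQA helps throughput is that fewer key/value heads mean a smaller KV cache per generated token. The sketch below estimates that cache size from the figures above (32 layers, 32 heads, hidden size 4096, BF16). The per-layer layout in `hypothetical_layout` is an illustrative assumption, not DeciLM-6b's published configuration, and the function name is made up for this example.

```python
# Rough KV-cache estimate for a model with a different number of KV heads per layer.
# Architecture constants come from the model card; the variable layout is hypothetical.

HIDDEN_SIZE = 4096
NUM_HEADS = 32
NUM_LAYERS = 32
BYTES_PER_VALUE = 2                      # BF16 stores each value in 2 bytes
HEAD_DIM = HIDDEN_SIZE // NUM_HEADS      # 128 dimensions per head


def kv_cache_bytes_per_token(kv_heads_per_layer):
    """Return the bytes of K and V cache needed for one token across all layers."""
    if len(kv_heads_per_layer) != NUM_LAYERS:
        raise ValueError("expected one KV-head count per layer")
    for kv_heads in kv_heads_per_layer:
        # In GQA, query heads are split evenly among the KV heads.
        if kv_heads <= 0 or NUM_HEADS % kv_heads != 0:
            raise ValueError("KV-head count must evenly divide the query heads")
    # Factor 2 accounts for storing both keys and values.
    return sum(2 * kv * HEAD_DIM * BYTES_PER_VALUE for kv in kv_heads_per_layer)


# Baseline: standard multi-head attention, one KV head per query head.
mha_layout = [NUM_HEADS] * NUM_LAYERS

# Hypothetical variable-GQA layout (assumption, for illustration only).
hypothetical_layout = [4] * 8 + [2] * 16 + [1] * 8

mha_bytes = kv_cache_bytes_per_token(mha_layout)
gqa_bytes = kv_cache_bytes_per_token(hypothetical_layout)

context_length = 4096
print(f"MHA cache at full context: {mha_bytes * context_length / 2**20:.0f} MiB")
print(f"GQA cache at full context: {gqa_bytes * context_length / 2**20:.0f} MiB")
```

A smaller cache leaves more GPU memory for larger batches, which is typically where the throughput gain comes from; the actual saving depends on the real per-layer head counts.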

Core Capabilities

  • Strong performance on multiple benchmarks (ARC, HellaSwag, PIQA, BoolQ)
  • 74.58% accuracy on HellaSwag
  • 77.09% accuracy on PIQA
  • 71.01% accuracy on BoolQ
  • Up to 15x higher throughput than Llama 2 7B
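
To put the throughput figure quoted under Implementation Details (2,029.6 tokens/sec on A10 with Infery LLM) in practical terms, the short sketch below converts it into a time budget. It assumes the peak figure is sustained, which real workloads rarely achieve, and it treats the number as aggregate throughput rather than per-request latency. The helper names are hypothetical.

```python
# Back-of-the-envelope capacity planning from a quoted peak throughput.
# Assumption: the peak rate is sustained for the whole job.

PEAK_TOKENS_PER_SEC = 2029.6  # figure quoted in the model card


def seconds_for_tokens(num_tokens, tokens_per_sec=PEAK_TOKENS_PER_SEC):
    """Time needed to generate num_tokens at a constant aggregate rate."""
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return num_tokens / tokens_per_sec


def tokens_per_hour(tokens_per_sec=PEAK_TOKENS_PER_SEC):
    """Aggregate tokens generated in one hour at a constant rate."""
    return tokens_per_sec * 3600


print(f"1M tokens: about {seconds_for_tokens(1_000_000) / 60:.1f} minutes")
print(f"Per hour:  about {tokens_per_hour() / 1e6:.2f}M tokens")
```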

Frequently Asked Questions

Q: What makes this model unique?

Its variable Grouped-Query Attention mechanism. Rather than applying one attention grouping to the whole network, AutoNAC selects the grouping for each layer, which raises throughput relative to comparable models while keeping benchmark results strong.

Q: What are the recommended use cases?

The model is well suited to both commercial and research applications on English-language tasks. It can be fine-tuned for specific use cases and potentially adapted to other languages. Its efficiency makes it particularly useful in production environments where compute cost or latency is a constraint.
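
As a starting point for experimentation, here is a minimal sketch of loading the model for text generation with the Hugging Face `transformers` library. The repository ID `Deci/DeciLM-6b` and the need for `trust_remote_code=True` are assumptions not stated on this page; check the model's hosting page before relying on them. The `generate` helper is hypothetical.

```python
# Minimal generation sketch. Requires `torch` and `transformers` to be installed;
# imports are deferred so that defining the helper has no side effects.

MODEL_ID = "Deci/DeciLM-6b"  # assumption: Hugging Face repository ID


def generate(prompt, max_new_tokens=64):
    """Load the model and return a completion for the prompt."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,   # the card lists BF16 precision support
        trust_remote_code=True,       # assumption: custom architecture code in the repo
    )
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Write a haiku about efficient attention.")` downloads the weights on first use. Note that `trust_remote_code=True` executes code from the model repository, so review that code first, and keep the prompt plus generated tokens within the 4096-token context window.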
