# DeciLM-7B-instruct
| Property | Value |
|---|---|
| Parameter Count | 7.04B |
| License | Apache 2.0 |
| Language | English |
| Architecture | Transformer Decoder with Variable GQA |
| Context Length | 8192 tokens |
## What is DeciLM-7B-instruct?
DeciLM-7B-instruct is a language model developed by Deci for short-form instruction following. Built on the DeciLM-7B base model, it was fine-tuned with LoRA on the SlimOrca dataset and reaches strong performance without complex preference-optimization techniques such as RLHF or DPO.
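For a quick start, the sketch below loads the model with Hugging Face `transformers`. It assumes the model is published under the repo id `Deci/DeciLM-7B-instruct` and ships custom modeling code (hence `trust_remote_code=True`); verify both against the current model card before relying on them.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Deci/DeciLM-7B-instruct"  # assumed Hugging Face repo id; verify on the hub

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # BF16 weights, per the spec table above
    device_map="auto",
    trust_remote_code=True,      # DeciLM uses custom modeling code
)

prompt = "How do I brew a good cup of coffee?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```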
## Implementation Details
The model is a transformer decoder with 32 layers and 32 attention heads, in which the Grouped-Query Attention (GQA) configuration varies from layer to layer. Deci's AutoNAC technology selects the GQA setting for each layer, trading accuracy against efficiency (a minimal sketch of variable GQA follows the list below).
- Optimized transformer architecture with variable GQA
- 32 layers and 32 attention heads
- 8192 token context length
- BF16 tensor type support
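For intuition, here is a minimal, self-contained PyTorch sketch of grouped-query attention with a per-layer KV-head count. This is not DeciLM's actual implementation, and the per-layer head counts shown are illustrative placeholders; DeciLM's real per-layer schedule is chosen by AutoNAC.

```python
import torch
import torch.nn.functional as F
from torch import nn

class GroupedQueryAttention(nn.Module):
    """Minimal GQA block: n_kv_heads <= n_heads, and each KV head is shared
    by n_heads // n_kv_heads query heads. 'Variable' GQA means n_kv_heads
    differs from layer to layer."""

    def __init__(self, d_model: int, n_heads: int, n_kv_heads: int):
        super().__init__()
        assert n_heads % n_kv_heads == 0
        self.n_heads, self.n_kv_heads = n_heads, n_kv_heads
        self.head_dim = d_model // n_heads
        self.q_proj = nn.Linear(d_model, n_heads * self.head_dim, bias=False)
        self.k_proj = nn.Linear(d_model, n_kv_heads * self.head_dim, bias=False)
        self.v_proj = nn.Linear(d_model, n_kv_heads * self.head_dim, bias=False)
        self.o_proj = nn.Linear(n_heads * self.head_dim, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.n_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(b, t, self.n_kv_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(b, t, self.n_kv_heads, self.head_dim).transpose(1, 2)
        # Broadcast each KV head across its group of query heads.
        groups = self.n_heads // self.n_kv_heads
        k = k.repeat_interleave(groups, dim=1)
        v = v.repeat_interleave(groups, dim=1)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.o_proj(out.transpose(1, 2).reshape(b, t, -1))

# Variable GQA: a different KV-head count per layer. These values are
# hypothetical; DeciLM's actual schedule is selected by AutoNAC.
kv_heads_per_layer = [4, 8, 2, 8]
layers = [GroupedQueryAttention(4096, 32, kv) for kv in kv_heads_per_layer]
```

Fewer KV heads in a layer shrink that layer's KV cache and memory bandwidth proportionally, which is where the inference speedup comes from.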
## Core Capabilities
- High-efficiency text generation
- Strong instruction-following abilities
- Strong benchmark results (63.19 average across standard Open LLM Leaderboard benchmarks)
- High-throughput inference, up to 4,559 tokens/sec on an NVIDIA A100 with batched, optimized inference
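To sanity-check throughput on your own hardware, a rough single-stream probe is sketched below. Note that headline figures like 4,559 tokens/sec come from batched, optimized inference, so a naive `generate()` loop like this one will measure far lower numbers.

```python
import time
import torch

@torch.inference_mode()
def decode_tokens_per_second(model, tokenizer, prompt: str, max_new_tokens: int = 256) -> float:
    """Rough single-stream decode throughput in new tokens per second."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    start = time.perf_counter()
    out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    elapsed = time.perf_counter() - start
    new_tokens = out.shape[1] - inputs["input_ids"].shape[1]
    return new_tokens / elapsed

# Example, with model and tokenizer loaded as in the quick-start above:
# print(f"{decode_tokens_per_second(model, tokenizer, 'Hello'):.1f} tok/s")
```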
## Frequently Asked Questions
**Q: What makes this model unique?**

A: DeciLM-7B-instruct stands out for its optimized architecture using variable Grouped-Query Attention and AutoNAC technology, achieving strong performance without complex optimization techniques like RLHF. It offers an excellent balance of efficiency and accuracy in the 7B parameter range.
**Q: What are the recommended use cases?**

A: The model is well-suited for commercial and research applications in English, and it particularly excels at short-form instruction following. It is a good fit for applications that need efficient, accurate text generation at moderate computational cost.
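Because the model was tuned on SlimOrca, prompts generally work best in a system/user/assistant template. The sketch below approximates that format; the exact system message is illustrative, so check the model card for the canonical wording.

```python
# SlimOrca-style instruction template; the system message here is
# illustrative, not the guaranteed canonical one from the model card.
PROMPT_TEMPLATE = (
    "### System:\n"
    "You are an AI assistant that follows instructions extremely well. "
    "Help as much as you can.\n"
    "### User:\n"
    "{instruction}\n"
    "### Assistant:\n"
)

prompt = PROMPT_TEMPLATE.format(
    instruction="Summarize the advantages of variable GQA in two sentences."
)
```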