# DeciLM-7B-instruct
| Property | Value |
|---|---|
| Parameter Count | 7.04B |
| License | Apache 2.0 |
| Language | English |
| Architecture | Transformer Decoder with Variable GQA |
| Context Length | 8192 tokens |
## What is DeciLM-7B-instruct?
DeciLM-7B-instruct is a language model developed by Deci for short-form instruction following. Built on the DeciLM-7B base model, it was fine-tuned with LoRA on the SlimOrca dataset and reaches strong performance without complex preference-optimization techniques such as RLHF or DPO.
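For a quick start, the sketch below loads the model with Hugging Face `transformers`. It assumes the model is published under the repo id `Deci/DeciLM-7B-instruct` and ships custom modeling code (hence `trust_remote_code=True`); verify both against the current model card before relying on them.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Deci/DeciLM-7B-instruct"  # assumed Hugging Face repo id; verify on the hub

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # BF16 weights, per the spec table above
    device_map="auto",
    trust_remote_code=True,      # DeciLM uses custom modeling code
)

prompt = "How do I brew a good cup of coffee?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```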
## Implementation Details
The model is a transformer decoder with 32 layers and 32 attention heads, in which the Grouped-Query Attention (GQA) configuration varies from layer to layer. Deci's AutoNAC technology selects the GQA setting for each layer, trading accuracy against efficiency (a minimal sketch of variable GQA follows the list below).
- Optimized transformer architecture with variable GQA
- 32 layers and 32 attention heads
- 8192 token context length
- BF16 tensor type support
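For intuition, here is a minimal, self-contained PyTorch sketch of grouped-query attention with a per-layer KV-head count. This is not DeciLM's actual implementation, and the per-layer head counts shown are illustrative placeholders; DeciLM's real per-layer schedule is chosen by AutoNAC.

```python
import torch
import torch.nn.functional as F
from torch import nn

class GroupedQueryAttention(nn.Module):
    """Minimal GQA block: n_kv_heads <= n_heads, and each KV head is shared
    by n_heads // n_kv_heads query heads. 'Variable' GQA means n_kv_heads
    differs from layer to layer."""

    def __init__(self, d_model: int, n_heads: int, n_kv_heads: int):
        super().__init__()
        assert n_heads % n_kv_heads == 0
        self.n_heads, self.n_kv_heads = n_heads, n_kv_heads
        self.head_dim = d_model // n_heads
        self.q_proj = nn.Linear(d_model, n_heads * self.head_dim, bias=False)
        self.k_proj = nn.Linear(d_model, n_kv_heads * self.head_dim, bias=False)
        self.v_proj = nn.Linear(d_model, n_kv_heads * self.head_dim, bias=False)
        self.o_proj = nn.Linear(n_heads * self.head_dim, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.n_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(b, t, self.n_kv_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(b, t, self.n_kv_heads, self.head_dim).transpose(1, 2)
        # Broadcast each KV head across its group of query heads.
        groups = self.n_heads // self.n_kv_heads
        k = k.repeat_interleave(groups, dim=1)
        v = v.repeat_interleave(groups, dim=1)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.o_proj(out.transpose(1, 2).reshape(b, t, -1))

# Variable GQA: a different KV-head count per layer. These values are
# hypothetical; DeciLM's actual schedule is selected by AutoNAC.
kv_heads_per_layer = [4, 8, 2, 8]
layers = [GroupedQueryAttention(4096, 32, kv) for kv in kv_heads_per_layer]
```

Fewer KV heads in a layer shrink that layer's KV cache and memory bandwidth proportionally, which is where the inference speedup comes from.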
## Core Capabilities
- High-efficiency text generation
- Strong instruction-following abilities
- Strong benchmark results (63.19 average across standard Open LLM Leaderboard benchmarks)
- High-throughput inference, up to 4,559 tokens/sec on an NVIDIA A100 with batched, optimized inference
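To sanity-check throughput on your own hardware, a rough single-stream probe is sketched below. Note that headline figures like 4,559 tokens/sec come from batched, optimized inference, so a naive `generate()` loop like this one will measure far lower numbers.

```python
import time
import torch

@torch.inference_mode()
def decode_tokens_per_second(model, tokenizer, prompt: str, max_new_tokens: int = 256) -> float:
    """Rough single-stream decode throughput in new tokens per second."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    start = time.perf_counter()
    out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    elapsed = time.perf_counter() - start
    new_tokens = out.shape[1] - inputs["input_ids"].shape[1]
    return new_tokens / elapsed

# Example, with model and tokenizer loaded as in the quick-start above:
# print(f"{decode_tokens_per_second(model, tokenizer, 'Hello'):.1f} tok/s")
```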
## Frequently Asked Questions
**Q: What makes this model unique?**

A: DeciLM-7B-instruct stands out for its optimized architecture using variable Grouped-Query Attention and AutoNAC technology, achieving strong performance without complex optimization techniques like RLHF. It offers an excellent balance of efficiency and accuracy in the 7B parameter range.
**Q: What are the recommended use cases?**

A: The model is well-suited for commercial and research applications in English, and it particularly excels at short-form instruction following. It is a good fit for applications that need efficient, accurate text generation at moderate computational cost.
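Because the model was tuned on SlimOrca, prompts generally work best in a system/user/assistant template. The sketch below approximates that format; the exact system message is illustrative, so check the model card for the canonical wording.

```python
# SlimOrca-style instruction template; the system message here is
# illustrative, not the guaranteed canonical one from the model card.
PROMPT_TEMPLATE = (
    "### System:\n"
    "You are an AI assistant that follows instructions extremely well. "
    "Help as much as you can.\n"
    "### User:\n"
    "{instruction}\n"
    "### Assistant:\n"
)

prompt = PROMPT_TEMPLATE.format(
    instruction="Summarize the advantages of variable GQA in two sentences."
)
```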