# Dorna-Llama3-8B-Instruct
| Property | Value |
|---|---|
| Parameter Count | 8 Billion |
| Model Type | Decoder-only Language Model |
| Architecture | Based on Meta Llama 3 |
| Developer | PartAI |
| Model Hub | Hugging Face |
## What is Dorna-Llama3-8B-Instruct?
Dorna-Llama3-8B-Instruct is a Persian-focused language model built on Meta's Llama 3 architecture. It is part of the Dorna family of decoder-only models and is fine-tuned specifically for Persian language understanding and generation. In reported evaluations it outperforms other Persian language models and achieves competitive results against GPT-3.5-turbo-1106.
## Implementation Details
The model is implemented using the Transformers library and supports efficient inference with bfloat16 precision and automatic device mapping. It can be easily integrated into applications using the standard Hugging Face pipeline, with support for chat-based interactions through a structured message format.
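A minimal usage sketch of the pipeline integration described above. The Hub id, system prompt, and sampling settings are assumptions for illustration, not values confirmed by this card; the heavy imports are kept inside the function so the sketch can be read and tested without downloading the model.

```python
MODEL_ID = "PartAI/Dorna-Llama3-8B-Instruct"  # assumed Hugging Face Hub id


def build_messages(user_prompt: str) -> list:
    """Structured chat messages, consumed via the model's chat template."""
    return [
        # The system prompt here is a hypothetical example.
        {"role": "system", "content": "You are a helpful Persian assistant."},
        {"role": "user", "content": user_prompt},
    ]


def generate(user_prompt: str, max_new_tokens: int = 256) -> str:
    """Run chat-style generation with bfloat16 and automatic device mapping."""
    import torch  # heavy imports kept local to this function
    from transformers import pipeline

    pipe = pipeline(
        "text-generation",
        model=MODEL_ID,
        model_kwargs={"torch_dtype": torch.bfloat16},  # bfloat16 precision
        device_map="auto",  # automatic device mapping
    )
    outputs = pipe(
        build_messages(user_prompt),
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.6,  # illustrative sampling settings
        top_p=0.9,
    )
    # The pipeline returns the full message list; the last entry is the reply.
    return outputs[0]["generated_text"][-1]["content"]
```

Calling `generate("سلام! خودت را معرفی کن.")` would download the model weights and produce a Persian response; adjust `max_new_tokens` and the sampling parameters to taste.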
- Supports chat template formatting for conversation-style interactions
- Implements temperature and top-p sampling for controlled generation
- Optimized for both Persian and multilingual responses
- Efficient inference with bfloat16 precision
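As a generic illustration of the controlled-generation settings listed above, temperature scaling and top-p (nucleus) sampling can be sketched in plain Python. This shows the sampling technique itself on a toy logits vector, not the model's internals, and the function name and default values are hypothetical.

```python
import math
import random


def sample_top_p(logits, temperature=0.6, top_p=0.9, rng=random.Random(0)):
    """Temperature-scaled nucleus (top-p) sampling over a logits vector."""
    # 1. Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = [x / temperature for x in logits]
    # 2. Softmax to probabilities (max-shifted for numerical stability).
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # 3. Keep the smallest set of tokens whose cumulative mass >= top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # 4. Renormalise over the kept tokens and draw one index.
    mass = sum(probs[i] for i in kept)
    r = rng.random() * mass
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

With a strongly peaked logits vector such as `[10.0, 0.0, 0.0]`, the nucleus collapses to the top token and index 0 is always returned; flatter logits keep more tokens in play.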
## Core Capabilities
- Boolean question handling (both simple and complex)
- Code generation, with a 60% win rate against Llama 3
- Long-form response generation
- Mathematical problem solving
- News-based question answering
- Text paraphrasing
- General knowledge response generation
- Text summarization
## Frequently Asked Questions
Q: What makes this model unique?
The model shows strong performance on Persian language tasks, outperforming existing Persian models such as PersianMind (with a 55.77% win rate) and achieving competitive results against GPT-3.5-turbo-1106. It particularly excels at code generation, long-form responses, and complex knowledge tasks.
Q: What are the recommended use cases?
The model is well-suited for a wide range of applications including Persian language text generation, question answering, code generation, and summarization tasks. It performs particularly well in scenarios requiring detailed knowledge and complex reasoning, making it suitable for both academic and commercial applications.