# Dorna-Llama3-8B-Instruct
| Property | Value |
|---|---|
| Parameter Count | 8 Billion |
| Model Type | Decoder-only Language Model |
| Architecture | Based on Meta Llama 3 |
| Developer | PartAI |
| Model Hub | Hugging Face |
## What is Dorna-Llama3-8B-Instruct?
Dorna-Llama3-8B-Instruct is a Persian-focused language model built on Meta's Llama 3 architecture. It is part of the Dorna family of decoder-only models and is fine-tuned specifically for Persian language understanding and generation. In reported evaluations it outperforms other Persian language models and achieves competitive results against GPT-3.5-turbo-1106.
## Implementation Details
The model is implemented using the Transformers library and supports efficient inference with bfloat16 precision and automatic device mapping. It can be easily integrated into applications using the standard Hugging Face pipeline, with support for chat-based interactions through a structured message format.
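A minimal usage sketch of the pipeline integration described above. The Hub id, system prompt, and sampling settings are assumptions for illustration, not values confirmed by this card; the heavy imports are kept inside the function so the sketch can be read and tested without downloading the model.

```python
MODEL_ID = "PartAI/Dorna-Llama3-8B-Instruct"  # assumed Hugging Face Hub id


def build_messages(user_prompt: str) -> list:
    """Structured chat messages, consumed via the model's chat template."""
    return [
        # The system prompt here is a hypothetical example.
        {"role": "system", "content": "You are a helpful Persian assistant."},
        {"role": "user", "content": user_prompt},
    ]


def generate(user_prompt: str, max_new_tokens: int = 256) -> str:
    """Run chat-style generation with bfloat16 and automatic device mapping."""
    import torch  # heavy imports kept local to this function
    from transformers import pipeline

    pipe = pipeline(
        "text-generation",
        model=MODEL_ID,
        model_kwargs={"torch_dtype": torch.bfloat16},  # bfloat16 precision
        device_map="auto",  # automatic device mapping
    )
    outputs = pipe(
        build_messages(user_prompt),
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.6,  # illustrative sampling settings
        top_p=0.9,
    )
    # The pipeline returns the full message list; the last entry is the reply.
    return outputs[0]["generated_text"][-1]["content"]
```

Calling `generate("سلام! خودت را معرفی کن.")` would download the model weights and produce a Persian response; adjust `max_new_tokens` and the sampling parameters to taste.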
- Supports chat template formatting for conversation-style interactions
- Implements temperature and top-p sampling for controlled generation
- Optimized for both Persian and multilingual responses
- Efficient inference with bfloat16 precision
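As a generic illustration of the controlled-generation settings listed above, temperature scaling and top-p (nucleus) sampling can be sketched in plain Python. This shows the sampling technique itself on a toy logits vector, not the model's internals, and the function name and default values are hypothetical.

```python
import math
import random


def sample_top_p(logits, temperature=0.6, top_p=0.9, rng=random.Random(0)):
    """Temperature-scaled nucleus (top-p) sampling over a logits vector."""
    # 1. Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = [x / temperature for x in logits]
    # 2. Softmax to probabilities (max-shifted for numerical stability).
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # 3. Keep the smallest set of tokens whose cumulative mass >= top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # 4. Renormalise over the kept tokens and draw one index.
    mass = sum(probs[i] for i in kept)
    r = rng.random() * mass
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

With a strongly peaked logits vector such as `[10.0, 0.0, 0.0]`, the nucleus collapses to the top token and index 0 is always returned; flatter logits keep more tokens in play.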
## Core Capabilities
- Boolean question handling (both simple and complex)
- Code generation, with a 60% win rate against Llama 3
- Long-form response generation
- Mathematical problem solving
- News-based question answering
- Text paraphrasing
- General knowledge response generation
- Text summarization
## Frequently Asked Questions
Q: What makes this model unique?
The model shows strong performance on Persian language tasks, outperforming existing Persian models such as PersianMind (with a 55.77% win rate) and achieving competitive results against GPT-3.5-turbo-1106. It particularly excels at code generation, long-form responses, and complex knowledge tasks.
Q: What are the recommended use cases?
The model is well-suited for a wide range of applications including Persian language text generation, question answering, code generation, and summarization tasks. It performs particularly well in scenarios requiring detailed knowledge and complex reasoning, making it suitable for both academic and commercial applications.