SambaLingo-Arabic-Chat
Property | Value |
---|---|
Parameter Count | 6.95B |
Model Type | Language Model (Chat) |
Base Model | Llama-2-7B |
License | Llama 2 License |
Paper | SambaLingo: Teaching Large Language Models New Languages |
What is SambaLingo-Arabic-Chat?
SambaLingo-Arabic-Chat is an advanced bilingual language model developed by SambaNova Systems, specifically designed to handle both Arabic and English conversations. Built upon Llama-2-7B, it has been extensively trained on 63 billion tokens from the Arabic portion of the Cultura-X dataset and further refined using direct preference optimization techniques.
Implementation Details
The model implements a two-stage training approach: Supervised Fine-Tuning (SFT) using the ultrachat_200k dataset and Direct Preference Optimization (DPO) using ultrafeedback and cai-conversation-harmless datasets. The model's vocabulary has been expanded from 32,000 to 57,000 tokens to better accommodate Arabic language features.
- Extended vocabulary optimization for Arabic language support
- BF16 tensor type for efficient processing
- Implements chat template system for structured interactions
- Supports both Arabic and English inputs with seamless processing
Core Capabilities
- Bilingual conversation handling in Arabic and English
- Human-aligned responses with preference optimization
- Context-aware text generation
- Customizable inference parameters for different use cases
- Integrated safety measures and ethical considerations
Frequently Asked Questions
Q: What makes this model unique?
The model's distinctive feature is its specialized optimization for Arabic-English bilingual conversations, combined with extensive training on Arabic cultural context and human preference alignment through DPO techniques.
Q: What are the recommended use cases?
The model is best suited for conversational AI applications requiring Arabic-English bilingual capabilities, cultural understanding, and aligned responses. However, it should not be used for mission-critical applications or safety-critical decisions.