ALLaM-7B-Instruct-preview

ALLaM-7B-Instruct-preview is a bilingual Arabic-English LLM with 7B parameters, trained on 5.2T tokens and optimized for Arabic language tasks while maintaining strong English capabilities.

  • Parameter Count: 7 Billion
  • Context Length: 4096 tokens
  • Training Tokens: 5.2T (4T English + 1.2T mixed Arabic/English)
  • Developer: National Center for Artificial Intelligence at SDAIA
  • Model Type: Autoregressive Transformer
  • Languages: Arabic, English

What is ALLaM-7B-Instruct-preview?

ALLaM-7B-Instruct-preview is a bilingual language model developed by the Saudi Data and AI Authority (SDAIA), designed to advance Arabic language technology while maintaining strong English capabilities. The model was trained in two stages: initial pretraining on English, followed by continued training on mixed Arabic/English content.

Implementation Details

The model was trained with NVIDIA's Megatron-LM framework using bf16 mixed precision, achieving approximately 42% MFU (model FLOPs utilization). It is designed to function without a predefined system prompt, though it supports custom system prompts in both Arabic and English.

  • Trained on 4T English tokens followed by 1.2T mixed Arabic/English tokens
  • Instruction-tuned with 7M instructions and 260K preference pairs
  • Supports 4096 token context length
  • Built using state-of-the-art autoregressive transformer architecture

Core Capabilities

  • Strong performance on Arabic language tasks, outperforming many existing models on Arabic benchmarks
  • Strong bilingual capabilities in both Arabic and English
  • Flexible system prompt support for customized interactions
  • Competitive performance on various evaluation metrics including MMLU, MT-bench, and Arabic-specific benchmarks

Frequently Asked Questions

Q: What makes this model unique?

ALLaM-7B-Instruct-preview stands out for its specialized focus on Arabic language processing while maintaining strong English capabilities, achieved through its innovative two-stage training process. It demonstrates superior performance on Arabic benchmarks while remaining competitive in English tasks.

Q: What are the recommended use cases?

The model is ideal for research and development in Arabic Language Technology, bilingual applications, and as a component in larger AI systems. It's particularly well-suited for tasks requiring strong understanding of both Arabic and English contexts, though developers should implement appropriate safety measures for production use.
