AraModernBert-Base-V1.0
| Property | Value |
|---|---|
| Parameters | ~149M |
| Context Length | 8,192 tokens |
| Architecture | ModernBERT |
| Vocabulary Size | 50,280 tokens |
| Model Type | Transformer (ModernBert) |
| Developer | NAMAA-Space |
What is AraModernBert-Base-V1.0?
AraModernBert-Base-V1.0 is an Arabic language model built on the ModernBERT architecture. It was trained on 100 GB of Arabic text, uses a custom tokenizer with a 50,280-token vocabulary, and initializes its embedding layer with the Trans-tokenization technique.
Implementation Details
The model has 22 transformer layers with a hidden size of 768. Attention alternates between two modes: every third layer uses global attention, and the remaining layers use local attention with a 128-token window. Rotary Positional Embeddings (RoPE) use a different theta value for each mode: 160000.0 for global attention and 10000.0 for local attention.
- 22 transformer layers with 768 hidden dimensions
- 12 attention heads
- 8,192 token context window
- Alternating attention mechanism
- Specialized Arabic vocabulary
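The layer layout described above can be sketched in a few lines of Python. This is an illustration, not the model's actual code: the `AttentionLayout` class and its method names are hypothetical, and two details are assumptions not stated in this card, namely that the global layers are those at indices 0, 3, 6, ... and that the 128-token local window is centred on the query token (64 tokens on each side).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionLayout:
    """Hypothetical helper mirroring the figures quoted in this card."""

    num_layers: int = 22                 # transformer layers
    global_every: int = 3                # global attention every 3 layers
    local_window: int = 128              # local attention window, in tokens
    global_rope_theta: float = 160000.0  # RoPE theta for global layers
    local_rope_theta: float = 10000.0    # RoPE theta for local layers

    def is_global(self, layer: int) -> bool:
        # Assumption: counting starts at layer 0, so 0, 3, 6, ... are global.
        return layer % self.global_every == 0

    def rope_theta(self, layer: int) -> float:
        # Each attention mode gets its own RoPE base.
        if self.is_global(layer):
            return self.global_rope_theta
        return self.local_rope_theta

    def can_attend(self, layer: int, query: int, key: int) -> bool:
        # Global layers see every position in the context.
        if self.is_global(layer):
            return True
        # Assumption: the window is centred on the query token.
        return abs(query - key) <= self.local_window // 2


layout = AttentionLayout()
global_layers = [i for i in range(layout.num_layers) if layout.is_global(i)]
print(global_layers)  # [0, 3, 6, 9, 12, 15, 18, 21]
```

Under these assumptions, 8 of the 22 layers attend over the full context and the other 14 attend only within the local window, which is what keeps long inputs affordable.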
Core Capabilities
- Text Classification (94.32% accuracy)
- Named Entity Recognition (90.39% accuracy)
- Semantic Textual Similarity (STS17: 0.831, STS22: 0.617)
- Information Retrieval
- Retrieval-Augmented Generation (RAG)
- Document Similarity Analysis
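Similarity, retrieval, and RAG all reduce to comparing embedding vectors. The sketch below shows one common recipe: mean-pool the token vectors of each text, then rank documents by cosine similarity to the query. The pooling strategy is an assumption (this card does not say how sentence embeddings are derived, and a base encoder typically needs fine-tuning for this use), the function names are hypothetical, and the 3-dimensional vectors are toy values rather than model output.

```python
import math
from typing import List, Sequence, Tuple


def mean_pool(token_vectors: Sequence[Sequence[float]],
              attention_mask: Sequence[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    kept = [v for v, m in zip(token_vectors, attention_mask) if m]
    if not kept:
        raise ValueError("attention mask excludes every token")
    dim = len(kept[0])
    return [sum(v[d] for v in kept) / len(kept) for d in range(dim)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query_vec: Sequence[float],
         doc_vecs: Sequence[Sequence[float]]) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, most similar first."""
    scores = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy data: the third token is padding and is ignored by the mask.
query = mean_pool([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [9.0, 9.0, 9.0]], [1, 1, 0])
docs = [[0.0, 0.0, 1.0], [1.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
ranking = rank(query, docs)
print([index for index, _ in ranking])  # [1, 2, 0]
```

In a real pipeline the token vectors would come from the encoder's last hidden state and the mask from the tokenizer; the ranking step stays the same.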
Frequently Asked Questions
Q: What makes this model unique?
AraModernBert pairs the ModernBERT architecture with an Arabic-specific tokenizer, Trans-tokenization for embedding initialization, and training on 100 GB of Arabic text. Its alternating attention mechanism and 8,192-token context window make it well suited to long-form Arabic text.
Q: What are the recommended use cases?
The model performs well on text classification, named entity recognition, and semantic similarity analysis. It is best suited to Modern Standard Arabic; performance may vary on dialectal Arabic.
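As a starting point for trying the model, here is a minimal sketch using the Hugging Face `transformers` fill-mask pipeline. Several things are assumptions rather than facts from this card: that the checkpoint is published on the Hub under the ID built from the developer and model name, that it ships a masked-language-modeling head, and that the installed `transformers` release includes ModernBERT support. The helper names are hypothetical.

```python
# Assumed Hub ID, built from the developer and model name in this card.
MODEL_ID = "NAMAA-Space/AraModernBert-Base-V1.0"


def load_fill_mask(model_id: str = MODEL_ID):
    """Build a fill-mask pipeline (downloads weights; needs network access)."""
    # Imported lazily so the helper below can be used without transformers.
    from transformers import pipeline

    return pipeline("fill-mask", model=model_id)


def top_predictions(fill_mask, text: str, k: int = 3):
    """Return (token, score) pairs for a text with exactly one mask token."""
    return [(r["token_str"], r["score"]) for r in fill_mask(text, top_k=k)]
```

A typical call would be `fm = load_fill_mask()` followed by `top_predictions(fm, f"عاصمة مصر هي {fm.tokenizer.mask_token}.")`. Reading the mask token from the tokenizer avoids hard-coding it. For classification or NER, the same checkpoint would need task-specific fine-tuning first.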