whisper-base-ar-quran
| Property | Value |
|---|---|
| License | Apache 2.0 |
| Base Model | OpenAI Whisper Base |
| Final WER | 5.75% |
| Downloads | 3,650 |
What is whisper-base-ar-quran?
whisper-base-ar-quran is an automatic speech recognition (ASR) model fine-tuned from OpenAI's Whisper base model for Quranic Arabic. Developed by tarteel-ai, it reaches a final Word Error Rate (WER) of 5.75% on its evaluation set, down from an initial 13.39% during training.
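To make the WER figures concrete, here is a minimal sketch of how a word error rate is commonly computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` is illustrative, and the sketch assumes plain whitespace tokenization. How the reported 5.75% was normalized (for example, whether diacritics were kept) is not stated here, so this is not necessarily the exact evaluation procedure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()  # assumption: whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One of four reference words is missing from the hypothesis.
example = word_error_rate("بسم الله الرحمن الرحيم", "بسم الله الرحيم")
```

In this example a single deletion against four reference words gives a WER of 0.25, or 25%.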
Implementation Details
The model was trained in a distributed setup across 8 GPUs with a total batch size of 128. Training used the Adam optimizer (β1=0.9, β2=0.999, ε=1e-08, which are the standard Adam defaults) and a linear learning rate scheduler with 500 warmup steps. The run spanned 5,000 steps with mixed-precision training via Native AMP.
- Learning rate: 0.0001
- Training batch size: 16 per GPU (128 total)
- Evaluation batch size: 8 per GPU (64 total)
- Training steps: 5,000
- Framework: PyTorch 1.13.0
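The hyperparameters above can be checked with a little arithmetic. The sketch below recomputes the total batch sizes and models the learning rate at a given step. It assumes no gradient accumulation and the common form of a linear schedule (ramp from 0 to the peak rate over the warmup steps, then linear decay to 0 at the final step). The source only says "linear scheduler with 500 warmup steps", so the decay-to-zero endpoint is an assumption, and the function names are illustrative.

```python
NUM_GPUS = 8
PER_GPU_TRAIN_BATCH = 16
PER_GPU_EVAL_BATCH = 8
PEAK_LR = 1e-4
WARMUP_STEPS = 500
TOTAL_STEPS = 5000


def total_batch_size(per_gpu: int, num_gpus: int, grad_accum: int = 1) -> int:
    """Total batch size per step (grad_accum=1 is an assumption)."""
    return per_gpu * num_gpus * grad_accum


def linear_lr(step: int, peak: float = PEAK_LR,
              warmup: int = WARMUP_STEPS, total: int = TOTAL_STEPS) -> float:
    """Learning rate at `step` under linear warmup then linear decay."""
    if step < warmup:
        # Warmup phase: scale linearly from 0 up to the peak rate.
        return peak * step / warmup
    # Decay phase: scale linearly from the peak down to 0 at `total`.
    return peak * max(0.0, (total - step) / (total - warmup))


train_total = total_batch_size(PER_GPU_TRAIN_BATCH, NUM_GPUS)  # 16 * 8
eval_total = total_batch_size(PER_GPU_EVAL_BATCH, NUM_GPUS)    # 8 * 8
```

Under these assumptions the rate peaks at 0.0001 at step 500 and is back to half of that by step 2,750, the midpoint of the decay phase.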
Core Capabilities
- Specialized in Quranic Arabic speech recognition
- Achieves 5.75% WER on the evaluation set
- Supports TensorBoard integration for monitoring training
- Can be served through inference endpoints for deployment
Frequently Asked Questions
Q: What makes this model unique?
The model is specialized for Quranic Arabic, and its WER fell from 13.39% to 5.75% over fine-tuning, a relative reduction of about 57%. That combination of a narrow domain and a low error rate is what makes it valuable for processing Quranic audio.
Q: What are the recommended use cases?
This model is best suited to transcribing Quranic recitations. It may also be useful for other speech in classical or Quranic Arabic, such as religious lectures, but the reported 5.75% WER applies to its evaluation set, and accuracy on other kinds of Arabic speech is not reported here.
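For the transcription use case, a minimal sketch using the Hugging Face `transformers` speech recognition pipeline might look like the following. The Hub identifier `tarteel-ai/whisper-base-ar-quran` is an assumption pieced together from the developer and model names above, and the helper `transcribe` and the file name in the usage comment are hypothetical. Decoding audio from a file path typically also requires ffmpeg to be installed.

```python
# Assumed Hub identifier, built from the developer and model names.
MODEL_ID = "tarteel-ai/whisper-base-ar-quran"


def transcribe(audio_path: str) -> str:
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so that defining this helper does not require
    # transformers to be installed.
    from transformers import pipeline

    # Downloads the model on first use, then runs it on the file.
    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    result = asr(audio_path)
    return result["text"]


# Usage (hypothetical file name):
# text = transcribe("recitation.wav")
```

Building the pipeline once and reusing it across files avoids reloading the model on every call, so a batch script would likely hoist the `pipeline(...)` line out of the function.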