whisper-base-ar-quran

Property	Value
License	Apache 2.0
Base Model	OpenAI Whisper Base
Final WER	5.75%
Downloads	3,650

What is whisper-base-ar-quran?

whisper-base-ar-quran is an automatic speech recognition (ASR) model fine-tuned from OpenAI's Whisper base model for Quranic Arabic. Developed by tarteel-ai, it reaches a final Word Error Rate (WER) of 5.75% on its evaluation set, down from an initial 13.39% during training.
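For readers unfamiliar with the metric, WER is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal pure-Python illustration, not the evaluation code used for this model; the function names are hypothetical, and any text normalization (for example, handling of diacritics) applied before scoring is not stated in the source, so none is applied here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(initial: float, final: float) -> float:
    """Fraction by which an error rate dropped relative to its starting value."""
    return (initial - final) / initial


# One of four reference words is missing from the hypothesis.
print(word_error_rate("بسم الله الرحمن الرحيم", "بسم الله الرحيم"))  # 0.25

# Going from 13.39% to 5.75% WER is a relative reduction of about 57%.
print(f"{relative_reduction(13.39, 5.75):.1%}")  # 57.1%
```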

Implementation Details

The model was trained in a distributed setup across 8 GPUs, giving a total batch size of 128. Training used the Adam optimizer (β1=0.9, β2=0.999, ε=1e-08) with a linear learning rate scheduler and 500 warmup steps. It ran for 5,000 steps with mixed-precision training (Native AMP). The key settings were:

Learning rate: 0.0001
Training batch size: 16 per GPU (128 total)
Evaluation batch size: 8 per GPU (64 total)
Training steps: 5,000
Framework: PyTorch 1.13.0
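To make the schedule concrete, the sketch below computes the learning rate at a given step and the total batch sizes implied by the per-GPU values. It assumes the linear scheduler ramps from 0 to the peak rate over the warmup steps and then decays linearly to 0 at the final step, which is how such schedulers are commonly implemented; the source does not spell out the end value. The constant and function names are illustrative.

```python
BASE_LR = 1e-4        # peak learning rate
WARMUP_STEPS = 500    # steps of linear warmup
TOTAL_STEPS = 5_000   # total optimizer steps
NUM_GPUS = 8          # devices in the distributed setup


def linear_lr(step: int) -> float:
    """Learning rate at `step`, assuming linear warmup then linear decay to 0."""
    if step < WARMUP_STEPS:
        # Warmup phase: ramp from 0 up to BASE_LR.
        return BASE_LR * step / WARMUP_STEPS
    # Decay phase: fall from BASE_LR at the end of warmup to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


def effective_batch_size(per_gpu: int, num_gpus: int = NUM_GPUS) -> int:
    """Total batch size across all devices (no gradient accumulation assumed)."""
    return per_gpu * num_gpus


for step in (0, 250, 500, 2750, 5000):
    print(step, linear_lr(step))

print(effective_batch_size(16))  # 128 (training)
print(effective_batch_size(8))   # 64 (evaluation)
```

Under these assumptions the rate peaks at 0.0001 at step 500 and is back to half of that by step 2,750, the midpoint of the decay phase.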

Core Capabilities

Specialized in Quranic Arabic speech recognition
Achieves 5.75% WER on the evaluation set
Supports TensorBoard integration for monitoring
Provides inference endpoints for practical deployment
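A minimal sketch of local transcription with the Hugging Face transformers library follows. It assumes the checkpoint is published on the Hub under the id tarteel-ai/whisper-base-ar-quran (inferred from the developer and model names above, not confirmed by this page), that transformers and PyTorch are installed, and that ffmpeg is available for decoding audio files. The transcribe helper and the file name in the usage comment are hypothetical.

```python
# Assumed Hub id, built from the developer and model names given above.
MODEL_ID = "tarteel-ai/whisper-base-ar-quran"


def transcribe(audio_path: str) -> str:
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=MODEL_ID,
        chunk_length_s=30,  # Whisper works on 30-second windows; longer audio is chunked
    )
    result = asr(audio_path)
    return result["text"]


# Usage (downloads the model on first call):
#     text = transcribe("recitation.wav")
#     print(text)
```

Building the pipeline inside the function keeps the example short; an application that transcribes many files would create the pipeline once and reuse it.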

Frequently Asked Questions

Q: What makes this model unique?

The model is specialized for Quranic Arabic, and fine-tuning brought its WER down from 13.39% to 5.75% on the evaluation set, which makes it well suited to processing religious audio content. It was fine-tuned for 5,000 steps on 8 GPUs with mixed-precision training.

Q: What are the recommended use cases?

This model is intended for transcribing Quranic recitations. It can also be applied to religious lectures in Arabic and other Arabic speech recognition tasks, particularly those involving classical or Quranic Arabic. It suits applications that need accurate transcription of religious content.