hubert-large-arabic-egyptian
Property | Value |
---|---|
Parameter Count | 315M |
License | cc-by-nc-4.0 |
WER (Test) | 25.9% |
What is hubert-large-arabic-egyptian?
This is a state-of-the-art automatic speech recognition (ASR) model designed specifically for Egyptian Arabic. It is based on the HuBERT architecture and has been fine-tuned on both the MGB-3 dataset and the Egyptian Arabic Conversational Speech Corpus. The model represents a significant advance in Egyptian Arabic speech recognition, achieving a Word Error Rate (WER) of 25.9% on the test set.
Implementation Details
The model builds on the Arabic HuBERT-Large architecture and combines CTC and attention-based decoding. It expects 16 kHz audio input and uses a transformer-based architecture with a PyTorch backend; a minimal loading sketch follows the list below.
- 315M trainable parameters
- Uses F32 tensor type
- Implements CTC and Attention mechanisms
- Trained on MGB-3 and Egyptian Arabic Conversational Speech datasets
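Since the card does not name a toolkit, the following is a minimal inference sketch assuming a SpeechBrain-style CTC/Attention recipe hosted on the Hugging Face Hub; the model ID and audio path are placeholders, not official values.

```python
# Hypothetical inference sketch (the SpeechBrain toolkit and the model ID are assumptions).
from speechbrain.pretrained import EncoderDecoderASR

asr_model = EncoderDecoderASR.from_hparams(
    source="<org>/hubert-large-arabic-egyptian",   # placeholder repository ID
    savedir="pretrained_models/hubert-large-arabic-egyptian",
)

# transcribe_file takes a path to an audio file; the underlying model
# operates on 16 kHz waveforms.
transcript = asr_model.transcribe_file("egyptian_arabic_sample.wav")
print(transcript)
```

If the checkpoint is instead exported as a plain CTC model, it could also be loaded through the Hugging Face transformers ASR pipeline; the CTC/Attention description above is what suggests a SpeechBrain-style interface here.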
Core Capabilities
- State-of-the-art Egyptian Arabic speech recognition
- Handles conversational speech effectively
- 23.5% WER on validation set
- 25.9% WER on test set
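The WER figures above are plain word error rates. For reference, here is a minimal sketch of computing such a score with the jiwer package (an assumption, since the card does not state which scoring tool was used):

```python
# WER computation sketch; jiwer is an assumption, not the card's stated tooling.
import jiwer

references = ["reference transcript one", "reference transcript two"]
hypotheses = ["reference transcript won", "reference transcript two"]

# WER = (substitutions + deletions + insertions) / number of reference words
error_rate = jiwer.wer(references, hypotheses)
print(f"WER: {error_rate:.1%}")
```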
Frequently Asked Questions
Q: What makes this model unique?
A: This model is optimized specifically for Egyptian Arabic, achieving state-of-the-art performance through fine-tuning on both MGB-3 and the Egyptian Arabic Conversational Speech Corpus. It is particularly notable for handling the nuances of the Egyptian dialect.
Q: What are the recommended use cases?
A: The model is ideal for transcribing Egyptian Arabic speech in applications such as conversational AI, content transcription, and speech analytics. It works best with 16 kHz audio input, so recordings at other sample rates should be resampled first (see the sketch below), and it is particularly suited to real-world Egyptian Arabic speech recognition tasks.
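A minimal resampling sketch, assuming torchaudio is available (any resampling tool would do); the file names are placeholders:

```python
# Resample an arbitrary-rate recording to the 16 kHz input the model expects.
# torchaudio is an assumption; librosa or sox would work equally well.
import torchaudio

waveform, sample_rate = torchaudio.load("recording_44k.wav")   # placeholder path
if sample_rate != 16000:
    resampler = torchaudio.transforms.Resample(orig_freq=sample_rate, new_freq=16000)
    waveform = resampler(waveform)

torchaudio.save("recording_16k.wav", waveform, 16000)
```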