Hausa_xlsr
Property | Value |
---|---|
Base Model | facebook/wav2vec2-xls-r-300m |
Author | Akashpb13 |
Framework | PyTorch 1.10.0 |
Evaluation WER | 32.99% |
What is Hausa_xlsr?
Hausa_xlsr is an automatic speech recognition (ASR) model for the Hausa language, fine-tuned from Facebook's wav2vec2-xls-r-300m. It achieves a Word Error Rate (WER) of 32.99% on its evaluation set.
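For readers unfamiliar with the metric, WER is the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the evaluation script used for this model; the function name `word_error_rate` and the sample strings are illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Illustrative strings: one substituted word out of four reference words.
print(word_error_rate("one two three four", "one too three four"))  # 0.25
```

Under this definition, a WER of 32.99% means roughly one word-level error (substitution, deletion, or insertion) for every three reference words.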
Implementation Details
The model was trained on data from Common Voice 7.0, drawing on the train.tsv, dev.tsv, invalidated.tsv, reported.tsv, and other.tsv files. Training ran for 50 epochs with native AMP mixed precision and a cosine learning rate scheduler with warmup, using the following hyperparameters:
- Learning rate: 0.000096
- Batch size: 16 (training and evaluation)
- Gradient accumulation steps: 2
- Warmup steps: 500
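To make the schedule concrete, here is a rough sketch of a linear warmup followed by a half-cosine decay, which is the common shape of a "cosine with warmup" scheduler. The peak learning rate, warmup steps, batch size, and gradient accumulation steps come from the list above. `TOTAL_STEPS` is a hypothetical value because the source does not state the total number of optimizer steps, and the exact scheduler implementation used in training is an assumption.

```python
import math

PEAK_LR = 0.000096     # from the hyperparameter list
WARMUP_STEPS = 500     # from the hyperparameter list
BATCH_SIZE = 16        # from the hyperparameter list
GRAD_ACCUM_STEPS = 2   # from the hyperparameter list
TOTAL_STEPS = 10_000   # hypothetical: not stated in the source

# Samples contributing to each optimizer update, assuming a single device.
EFFECTIVE_BATCH_SIZE = BATCH_SIZE * GRAD_ACCUM_STEPS


def lr_at(step, peak_lr=PEAK_LR, warmup_steps=WARMUP_STEPS, total_steps=TOTAL_STEPS):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)  # clamp so steps past the end stay at zero
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 250, 500, 5_250, 10_000):
    print(step, lr_at(step))
```

With these settings the rate climbs linearly to 0.000096 at step 500 and then decays smoothly toward zero. Each optimizer update sees 16 × 2 = 32 samples under the single-device assumption.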
Core Capabilities
- Hausa speech recognition with 32.99% WER on the evaluation set
- Trained on multiple Common Voice 7.0 files (train, dev, invalidated, reported, other)
- Built with PyTorch 1.10.0
- Trained with native AMP mixed precision
Frequently Asked Questions
Q: What makes this model unique?
The model targets Hausa speech recognition specifically. Its training data was filtered so that only entries with more upvotes than downvotes were kept. Over the course of training, WER fell from an initial 100% to a final 32.99%.
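The vote-based filter described above might look like the sketch below. It assumes the Common Voice TSV layout with `up_votes` and `down_votes` columns. The helper name `keep_upvoted` and the sample rows are made up for illustration, and this is not the author's preprocessing code.

```python
import csv
import io


def keep_upvoted(rows):
    """Keep rows whose up_votes count is strictly greater than down_votes."""
    return [r for r in rows if int(r["up_votes"]) > int(r["down_votes"])]


# Made-up sample in tab-separated form; real files have more columns.
SAMPLE_TSV = (
    "path\tsentence\tup_votes\tdown_votes\n"
    "clip_a.mp3\tplaceholder sentence a\t2\t0\n"
    "clip_b.mp3\tplaceholder sentence b\t1\t1\n"
    "clip_c.mp3\tplaceholder sentence c\t0\t2\n"
    "clip_d.mp3\tplaceholder sentence d\t3\t1\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE_TSV), delimiter="\t"))
kept = keep_upvoted(rows)
print([r["path"] for r in kept])  # ['clip_a.mp3', 'clip_d.mp3']
```

Ties are dropped because the condition is strict: a clip with equal upvotes and downvotes does not have more upvotes than downvotes.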
Q: What are the recommended use cases?
This model is suited to Hausa speech recognition tasks, particularly applications that transcribe Hausa audio. It is likely to perform best on audio whose quality resembles that of Common Voice recordings.
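A transcription call could be sketched as follows. Several things here are assumptions rather than facts from this page: that the checkpoint is published on the Hugging Face Hub under the id `Akashpb13/Hausa_xlsr` (formed from the author and model name in the table), that the `transformers` library is installed, and that an audio decoding backend such as ffmpeg is available for reading files. The helper `transcribe` and the file name are hypothetical.

```python
# Assumed Hub id, built from the author and model name; verify before use.
MODEL_ID = "Akashpb13/Hausa_xlsr"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file with the Transformers ASR pipeline."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)
    return result["text"]


# Example call (hypothetical file name; downloads the model on first use):
# print(transcribe("hausa_clip.wav"))
```

The pipeline is rebuilt on every call here for simplicity. For batch transcription it would be cheaper to create it once and reuse it.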