wav2vec2-xls-r-300m-wolof-lm
| Property | Value |
|---|---|
| Base Model | facebook/wav2vec2-xls-r-300m |
| Training Data | 16.8 hours (10,000 audio files) |
| Best WER | 21.26% |
| Author | abdouaziiz |
| Model Hub | Hugging Face |
What is wav2vec2-xls-r-300m-wolof-lm?
This is an automatic speech recognition (ASR) model for Wolof, a language spoken mainly in Senegal and neighboring countries for which few speech resources exist. It is a fine-tuned version of facebook/wav2vec2-xls-r-300m (XLS-R 300M) combined with a custom language model, trained on the ALFFA_PUBLIC dataset.
Implementation Details
The model was trained with a learning rate of 1e-4, the Adam optimizer, and a linear learning rate schedule. Training ran for 10 epochs with an evaluation every 1500 steps, over which the Word Error Rate (WER) improved consistently from 54.39% to 21.26%.
- Training batch size: 3 with total batch size of 64
- Evaluation batch size: 8 with total batch size of 64
- Warmup steps: 1000
- Training dataset: 10,000 audio files
- Test dataset: 3,339 audio files
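The learning rate settings above can be illustrated with a minimal sketch of a linear schedule with warmup: the rate rises linearly from 0 to 1e-4 over the first 1000 steps, then decays linearly to 0. The function name and the `total_steps` value are assumptions made for illustration; the source does not state the total number of optimizer steps or confirm the exact scheduler implementation.

```python
def linear_schedule_lr(step, base_lr=1e-4, warmup_steps=1000, total_steps=30_000):
    """Learning rate at a given optimizer step for linear warmup + linear decay.

    base_lr and warmup_steps come from the settings listed above.
    total_steps is a hypothetical placeholder, not a value from the source.
    """
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay phase: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Inspect the schedule at a few points (values depend on the assumed total_steps).
for s in (0, 500, 1000, 15_500, 30_000):
    print(s, linear_schedule_lr(s))
```

In the Transformers library, `get_linear_schedule_with_warmup` produces a schedule of this shape for a PyTorch optimizer.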
Core Capabilities
- Automatic Speech Recognition (ASR) for the Wolof language
- Integration with Hugging Face's Transformers library
- Customizable preprocessing and inference pipeline
- Support for 16 kHz audio input
- Language model enhancement for improved accuracy
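The capabilities listed above might be exercised roughly as follows. This is a sketch under stated assumptions, not the author's documented usage: the repository ID is inferred from the author and model name given on this page, `transformers` and `torch` are assumed to be installed, decoding a file path assumes `ffmpeg` is available, and language model decoding typically also needs `pyctcdecode` and `kenlm`. The helper names `wav_sample_rate` and `transcribe` are hypothetical.

```python
import wave

# Assumed repository ID, inferred from the author and model name on this page.
MODEL_ID = "abdouaziiz/wav2vec2-xls-r-300m-wolof-lm"
EXPECTED_RATE = 16_000  # the model expects 16 kHz audio


def wav_sample_rate(path):
    """Return the sample rate (Hz) of an uncompressed WAV file."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def transcribe(path, model_id=MODEL_ID):
    """Transcribe a 16 kHz WAV file with the Transformers ASR pipeline."""
    rate = wav_sample_rate(path)
    if rate != EXPECTED_RATE:
        # Check up front rather than rely on any resampling the pipeline may do.
        raise ValueError(f"expected {EXPECTED_RATE} Hz audio, got {rate} Hz")

    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    return asr(path)["text"]


# Example call (downloads the model on first use; "sample.wav" is a placeholder):
# print(transcribe("sample.wav"))
```

Checking the sample rate before loading the model keeps a mismatched recording from failing only after a large download.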
Frequently Asked Questions
Q: What makes this model unique?
Few speech recognition models exist for Wolof, a traditionally under-resourced language, and this model offers a practical option for it. Its best reported WER of 21.26% makes it usable for Wolof speech processing tasks.
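For readers unfamiliar with the metric, WER is the word-level edit distance between a reference transcript and the model output (substitutions, deletions, and insertions), divided by the number of reference words. The sketch below is a generic illustration of that definition; it is not the evaluation script used for this model, and the function name and placeholder tokens are assumptions.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens: one substitution and one deletion over four reference words.
print(word_error_rate("a b c d", "a x c"))
```

A WER of 21.26% therefore corresponds to roughly one word-level error for every five reference words.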
Q: What are the recommended use cases?
The model is suited to transcribing Wolof speech in applications such as speech-to-text services, language documentation, accessibility tools, and academic research on West African languages. Its output can be improved further through spell checking and additional language model integration.