wav2vec2-xls-r-300m-wolof-lm

Maintained By
abdouaziiz

Property: Value
Base Model: facebook/wav2vec2-xls-r-300m
Training Data: 16.8 hours (10,000 audio files)
Best WER: 21.26%
Author: abdouaziiz
Model Hub: Hugging Face

What is wav2vec2-xls-r-300m-wolof-lm?

This is an automatic speech recognition model for Wolof, a language spoken mainly in Senegal and neighboring countries. It is a fine-tuned version of facebook/wav2vec2-xls-r-300m, combined with a language model and trained on the ALFFA_PUBLIC dataset. Wolof has comparatively few speech resources, which is the gap this model targets.

Implementation Details

The model was trained with a learning rate of 1e-4, the Adam optimizer, and a linear learning rate schedule. Training ran for 10 epochs with an evaluation every 1500 steps, over which the Word Error Rate (WER) fell consistently from 54.39% to 21.26%.

  • Training batch size: 3 with total batch size of 64
  • Evaluation batch size: 8 with total batch size of 64
  • Warmup steps: 1000
  • Training dataset: 10,000 audio files
  • Test dataset: 3,339 audio files
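
The schedule described above (linear warmup over 1,000 steps to a peak of 1e-4, then linear decay) can be sketched in plain Python. The total step count is not stated here, so `TOTAL_STEPS` is a hypothetical placeholder, and the decay-to-zero shape follows the common linear-with-warmup convention rather than anything confirmed for this training run.

```python
PEAK_LR = 1e-4        # learning rate reported for this model
WARMUP_STEPS = 1000   # warmup steps reported for this model
TOTAL_STEPS = 15_000  # hypothetical: the real step count is not given


def linear_lr(step: int,
              peak_lr: float = PEAK_LR,
              warmup_steps: int = WARMUP_STEPS,
              total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to the peak during warmup.
        return peak_lr * (step / warmup_steps)
    # Decay linearly from the peak to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return peak_lr * (remaining / (total_steps - warmup_steps))


if __name__ == "__main__":
    # Print the rate at each evaluation point (every 1500 steps).
    for step in range(0, TOTAL_STEPS + 1, 1500):
        print(step, linear_lr(step))
```

With these values the rate reaches 1e-4 at step 1000 and is back at half the peak by step 8000; a different total step count would shift the decay accordingly.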

Core Capabilities

  • Automatic Speech Recognition (ASR) for the Wolof language
  • Integration with Hugging Face's Transformers library
  • Customizable preprocessing and inference pipeline
  • Support for 16 kHz audio input
  • Language model enhancement for improved accuracy
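
A minimal inference sketch, assuming the checkpoint is published under the repository id `abdouaziiz/wav2vec2-xls-r-300m-wolof-lm` (inferred from the author and model name, not confirmed here) and that `transformers`, `torch`, and `numpy` are installed. The helper `load_wav_16k_mono` is a hypothetical name introduced for illustration; it only accepts 16-bit mono WAV files already sampled at 16 kHz and does no resampling. Language-model decoding through the pipeline typically also needs `pyctcdecode` and `kenlm`; whether it is used here depends on what the repository ships.

```python
import array
import sys
import wave

TARGET_SR = 16_000  # the model expects 16 kHz input
MODEL_ID = "abdouaziiz/wav2vec2-xls-r-300m-wolof-lm"  # assumed repository id


def load_wav_16k_mono(path: str) -> list[float]:
    """Read a 16-bit PCM mono WAV at 16 kHz and return samples in [-1.0, 1.0)."""
    with wave.open(path, "rb") as wf:
        if wf.getframerate() != TARGET_SR:
            raise ValueError(f"expected {TARGET_SR} Hz, got {wf.getframerate()} Hz")
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM")
        frames = wf.readframes(wf.getnframes())
    pcm = array.array("h")
    pcm.frombytes(frames)
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    return [sample / 32768.0 for sample in pcm]


def transcribe(path: str) -> str:
    """Transcribe one file with the Transformers ASR pipeline."""
    # Imported lazily so the audio helper works without the heavy dependencies.
    import numpy as np
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    samples = np.asarray(load_wav_16k_mono(path), dtype=np.float32)
    return asr({"raw": samples, "sampling_rate": TARGET_SR})["text"]
```

Calling `transcribe("clip.wav")` (a placeholder path) downloads the checkpoint on first use. Audio recorded at another rate would need to be resampled to 16 kHz beforehand.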

Frequently Asked Questions

Q: What makes this model unique?

The model targets Wolof, a language for which little speech recognition technology is available. Its best reported WER of 21.26% makes it a practical starting point for Wolof speech processing tasks.
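
For readers unfamiliar with the metric, WER is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The sketch below is a generic implementation, not the evaluation script used for this model, and the sample strings in it are placeholders rather than real Wolof data. It also works out the relative reduction implied by the reported drop from 54.39% to 21.26%.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(start: float, end: float) -> float:
    """Fraction by which an error rate dropped, relative to its starting value."""
    return (start - end) / start


if __name__ == "__main__":
    # Placeholder tokens: one substitution and one deletion over four words.
    print(wer("a b c d", "a x c"))
    # Reported WER values for this model, in percent.
    print(relative_reduction(54.39, 21.26))
```

By this calculation the reported improvement is a relative reduction of roughly 61%. Note that WER can exceed 100% when the hypothesis contains many inserted words.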

Q: What are the recommended use cases?

The model is suited to transcribing Wolof speech in applications such as speech-to-text services, language documentation efforts, accessibility tools, and academic research on West African languages. Its output can be improved further through spell checking and additional language model integration.
