wav2vec2-xls-r-300m-wolof-lm

abdouaziiz

Speech recognition model for the Wolof language, fine-tuned from XLS-R 300M and paired with a language model. It achieves a 21.26% word error rate (WER) on the evaluation data.

Property        Value
Base Model      facebook/wav2vec2-xls-r-300m
Training Data   16.8 hours (10,000 audio files)
Best WER        21.26%
Author          abdouaziiz
Model Hub       Hugging Face

What is wav2vec2-xls-r-300m-wolof-lm?

This is a speech recognition model for Wolof, a language spoken primarily in Senegal and neighboring countries. It targets a low-resource language setting and was built by fine-tuning the XLS-R 300M model and adding a custom language model trained on the ALFFA_PUBLIC dataset.

Implementation Details

The model was trained with a learning rate of 1e-4, the Adam optimizer, and a linear learning rate schedule. Training ran for 10 epochs with evaluation every 1500 steps, and the Word Error Rate (WER) improved consistently from 54.39% to 21.26%.

  • Training batch size: 3 with total batch size of 64
  • Evaluation batch size: 8 with total batch size of 64
  • Warmup steps: 1000
  • Training dataset: 10,000 audio files
  • Test dataset: 3,339 audio files
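As a rough illustration of the linear schedule with warmup described above, the sketch below computes the learning rate at a given step. The peak rate (1e-4) and warmup length (1000 steps) come from the card; the total step count is not stated there, so `HYPOTHETICAL_TOTAL_STEPS` is an assumed value, and the function name `linear_schedule_lr` is illustrative rather than taken from the author's training script.

```python
# Assumption: the card does not give the total number of optimizer steps,
# so this value is purely illustrative.
HYPOTHETICAL_TOTAL_STEPS = 10_000


def linear_schedule_lr(step, total_steps, peak_lr=1e-4, warmup_steps=1000):
    """Learning rate at `step`: linear warmup to peak_lr, then linear decay to zero."""
    if total_steps <= warmup_steps:
        raise ValueError("total_steps must exceed warmup_steps")
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay phase: fall linearly from peak_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 500, 1000, 5500, HYPOTHETICAL_TOTAL_STEPS):
        lr = linear_schedule_lr(step, HYPOTHETICAL_TOTAL_STEPS)
        print(f"step {step:>6}: lr = {lr:.2e}")
```

Under these assumptions the rate peaks at step 1000 and reaches zero at the final step; a different total step count only changes the slope of the decay.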

Core Capabilities

  • Automatic Speech Recognition (ASR) for the Wolof language
  • Integration with Hugging Face's Transformers library
  • Customizable preprocessing and inference pipeline
  • Support for 16 kHz audio input
  • Language model enhancement for improved accuracy
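The capabilities above might be exercised roughly as follows. This is a minimal sketch, not the author's documented usage: the Hub ID is assumed to be the author name joined with the model name, `is_16khz_wav` and `transcribe` are hypothetical helper names, and the Transformers `pipeline` call requires the `transformers` package (plus an audio decoder such as ffmpeg) to be installed. Whether the language model is applied during decoding depends on the repository contents and on optional packages such as `pyctcdecode`.

```python
import wave

# Assumption: Hub ID formed from the author and model name shown on this page.
MODEL_ID = "abdouaziiz/wav2vec2-xls-r-300m-wolof-lm"
EXPECTED_RATE = 16_000  # the card lists 16 kHz audio input


def is_16khz_wav(path):
    """Return True if the WAV file at `path` has a 16 kHz sample rate."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate() == EXPECTED_RATE


def transcribe(path):
    """Transcribe a 16 kHz WAV file with the Transformers ASR pipeline."""
    if not is_16khz_wav(path):
        raise ValueError(f"{path} is not sampled at {EXPECTED_RATE} Hz")
    # Imported lazily so the sample-rate check works without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    return asr(path)["text"]
```

Calling `transcribe("recording.wav")` (a hypothetical file name) would download the model on first use and return the transcription as a string.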

Frequently Asked Questions

Q: What makes this model unique?

This model addresses the scarcity of speech recognition technology for Wolof, a traditionally under-resourced language. With a WER of 21.26%, it offers a practical starting point for Wolof speech processing tasks.
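For readers unfamiliar with the metric, WER is the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The sketch below is a generic implementation of that definition, not the evaluation script used for this model; `word_error_rate` is an illustrative name and the example strings are placeholder tokens rather than real Wolof data.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Placeholder tokens: one substitution and one deletion over four words.
    print(word_error_rate("a b c d", "a x c"))
```

A WER of 21.26% therefore means roughly one word-level error for every five reference words.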

Q: What are the recommended use cases?

The model is suited to transcribing Wolof speech in applications such as speech-to-text services, language documentation efforts, accessibility tools, and academic research on West African languages. Its output can be further improved through spell checking and additional language model integration.
