asr-wav2vec2-dvoice-wolof

Maintained By
speechbrain

ASR Wav2vec2 DVoice Wolof

Property     Value
License      Apache 2.0
Framework    SpeechBrain + PyTorch
Test WER     16.05%
Test CER     4.83%

What is asr-wav2vec2-dvoice-wolof?

This is an automatic speech recognition (ASR) model for the Wolof language, built on the wav2vec 2.0 architecture and fine-tuned on the DVoice Wolof dataset. It relies on transfer learning: Facebook's multilingual wav2vec2-large-xlsr-53 checkpoint is adapted to Wolof speech recognition with additional Wolof-specific training.

Implementation Details

The model uses a two-block architecture: a unigram tokenizer that converts words into subword units, and an acoustic model based on wav2vec 2.0 with CTC decoding. It is built on the SpeechBrain framework and expects 16 kHz single-channel audio input.

  • Pretrained wav2vec 2.0 backbone (facebook/wav2vec2-large-xlsr-53)
  • CTC decoding with greedy search
  • Automatic audio normalization capabilities
  • Support for GPU inference
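
To make "CTC decoding with greedy search" concrete, here is a minimal pure-Python sketch of the general technique: take the highest-scoring token in each frame, merge consecutive repeats, then drop the blank symbol. The blank index (0), the toy score matrix, and the function name are assumptions chosen for illustration. This is not SpeechBrain's own decoder, which operates on tensors and passes the resulting subword IDs to the unigram tokenizer.

```python
from itertools import groupby
from typing import List, Sequence

BLANK_ID = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]], blank_id: int = BLANK_ID
) -> List[int]:
    """Pick the best token per frame, merge consecutive repeats, drop blanks."""
    # 1. Arg-max over the vocabulary for every frame.
    best_path = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores
    ]
    # 2. Collapse runs of identical tokens, then 3. remove blank tokens.
    return [token for token, _ in groupby(best_path) if token != blank_id]


# Toy example: 6 frames over a 3-symbol vocabulary (index 0 is the blank).
scores = [
    [0.10, 0.80, 0.10],  # best token: 1
    [0.10, 0.70, 0.20],  # best token: 1 (repeat, merged with the previous frame)
    [0.90, 0.05, 0.05],  # best token: blank
    [0.20, 0.70, 0.10],  # best token: 1 (kept, because a blank separates it)
    [0.10, 0.20, 0.70],  # best token: 2
    [0.60, 0.20, 0.20],  # best token: blank
]
decoded = ctc_greedy_decode(scores)
print(decoded)  # [1, 1, 2]
```

The blank between the two runs of token 1 is what lets CTC emit the same token twice in a row. Without it, the repeats would collapse into a single token.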

Core Capabilities

  • Achieves a 16.05% Word Error Rate (WER) on the test set
  • Achieves a 4.83% Character Error Rate (CER) on the test set
  • Handles 16 kHz audio input with automatic resampling
  • Supports real-time transcription of Wolof speech
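
For readers unfamiliar with the two metrics above, the sketch below shows how WER and CER are commonly defined: the Levenshtein edit distance between reference and hypothesis, divided by the reference length, counted in words for WER and in characters for CER. The function names and the placeholder strings are assumptions for illustration. This is not the evaluation script used to produce the reported numbers.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, ref_item in enumerate(ref, start=1):
        curr = [i]  # turning i reference items into an empty hypothesis
        for j, hyp_item in enumerate(hyp, start=1):
            cost = 0 if ref_item == hyp_item else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion
                    curr[j - 1] + 1,     # insertion
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate for a single utterance."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate for a single utterance (spaces count as characters)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Placeholder tokens, not real Wolof: one substitution and one deleted word.
print(wer("a b c d", "a x c"))  # 0.5
print(cer("a b c d", "a x c"))
```

Corpus-level scores are usually computed by summing edits and reference lengths over all utterances before dividing, rather than by averaging per-utterance rates.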

Frequently Asked Questions

Q: What makes this model unique?

This model is designed specifically for Wolof, a low-resource African language, and is part of the DVoice initiative to improve voice technology accessibility for African languages. It reaches a 16.05% WER and a 4.83% CER on the test set while requiring minimal preprocessing of input audio.
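
The preprocessing in question amounts to getting audio into the 16 kHz single-channel form the model expects. The sketch below illustrates those two steps in plain Python: averaging channels to mono, then resampling by linear interpolation. The function names and sample values are hypothetical. It is only an illustration of the idea, not the normalization SpeechBrain performs internally, and linear interpolation without a low-pass filter would alias on real recordings.

```python
from typing import List, Sequence

TARGET_RATE = 16_000  # sample rate expected by the model, in Hz


def to_mono(frames: Sequence[Sequence[float]]) -> List[float]:
    """Downmix by averaging the channels of each frame ([ch0, ch1, ...])."""
    return [sum(frame) / len(frame) for frame in frames]


def resample_linear(
    samples: Sequence[float], src_rate: int, dst_rate: int = TARGET_RATE
) -> List[float]:
    """Resample a mono signal by linear interpolation (illustrative only)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate  # fractional index into the source signal
        left = int(pos)
        right = min(left + 1, len(samples) - 1)  # clamp at the last sample
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


# Four stereo frames recorded at 32 kHz become two mono samples at 16 kHz.
stereo = [(0.0, 1.0), (1.0, 3.0), (2.0, 4.0), (3.0, 5.0)]
mono = to_mono(stereo)
print(mono)                           # [0.5, 2.0, 3.0, 4.0]
print(resample_linear(mono, 32_000))  # [0.5, 3.0]
```

In practice a band-limited resampler from an audio library is the safer choice. The point here is only to show what "mono, 16 kHz" means for the input.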

Q: What are the recommended use cases?

The model is ideal for Wolof speech transcription tasks, particularly in applications requiring automatic subtitling, voice command systems, or speech-to-text services for Wolof speakers. It's especially valuable for organizations working with Wolof-speaking communities or developing language technology solutions for West Africa.
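
For a transcription task like those above, a minimal loading sketch could look as follows. It assumes the model is published on the Hugging Face Hub as speechbrain/asr-wav2vec2-dvoice-wolof (inferred from the model name and maintainer, not stated on this page) and that the speechbrain and transformers packages are installed. The helper names, the cache directory, and the audio path in the usage comment are hypothetical.

```python
from typing import Optional

# Assumed Hub identifier, inferred from the model name and maintainer.
MODEL_SOURCE = "speechbrain/asr-wav2vec2-dvoice-wolof"


def load_wolof_asr(device: str = "cpu", savedir: Optional[str] = None):
    """Download (on first use) and load the pretrained model via SpeechBrain."""
    try:
        # Import path in SpeechBrain 1.0 and later.
        from speechbrain.inference.ASR import EncoderASR
    except ImportError:
        # Import path in older SpeechBrain releases.
        from speechbrain.pretrained import EncoderASR
    return EncoderASR.from_hparams(
        source=MODEL_SOURCE,
        savedir=savedir or "pretrained_models/asr-wav2vec2-dvoice-wolof",
        run_opts={"device": device},  # for example "cuda" for GPU inference
    )


def transcribe_wolof(audio_path: str, device: str = "cpu") -> str:
    """Transcribe one audio file; the model is reloaded on every call."""
    model = load_wolof_asr(device=device)
    return model.transcribe_file(audio_path)


# Usage (hypothetical path; needs network access on the first run):
#   text = transcribe_wolof("recording_wolof.wav", device="cuda")
#   print(text)
```

For batch work such as subtitling, it would make more sense to call load_wolof_asr once and reuse the returned object across files.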
