wav2vec2-large-xls-r-300m-ha-cv8

Maintained By
anuragshas

Property          Value
License           Apache 2.0
Training Dataset  Common Voice 8 (Hausa)
Best WER          36.295% (with LM)
Framework         PyTorch 1.10.0

What is wav2vec2-large-xls-r-300m-ha-cv8?

This model is an automatic speech recognition (ASR) system for the Hausa language, fine-tuned from Facebook's wav2vec2-xls-r-300m model on Common Voice 8. It achieves a Word Error Rate (WER) of 36.295% when decoding with a language model.

Implementation Details

The model was trained for 100 epochs with the Adam optimizer and a cosine learning rate scheduler with restarts and warmup steps. Training used a batch size of 32 with 2 gradient accumulation steps, giving an effective batch size of 64 on a single device.

  • Learning rate: 0.0001 with a cosine-with-restarts schedule
  • Warmup steps: 1000
  • Evaluation metrics: WER (36.295%) and CER (11.073%)
  • Training framework: Transformers 4.16.1 with PyTorch
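
The learning rate behavior described above can be sketched in plain Python. This is a minimal illustration, not the author's training script: it assumes linear warmup followed by cosine decay with hard restarts, modeled on the `get_cosine_with_hard_restarts_schedule_with_warmup` schedule in Hugging Face Transformers. The values of `TOTAL_STEPS` and `NUM_CYCLES` are hypothetical, since the page does not state them.

```python
import math

BASE_LR = 1e-4         # from the model card
WARMUP_STEPS = 1000    # from the model card
PER_DEVICE_BATCH = 32  # from the model card
GRAD_ACCUM_STEPS = 2   # from the model card
TOTAL_STEPS = 20_000   # hypothetical: not stated on the page
NUM_CYCLES = 1         # hypothetical: not stated on the page


def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Number of examples contributing to one optimizer update."""
    return per_device_batch * grad_accum_steps * num_devices


def lr_at_step(step, base_lr=BASE_LR, warmup_steps=WARMUP_STEPS,
               total_steps=TOTAL_STEPS, num_cycles=NUM_CYCLES):
    """Learning rate at a given optimizer step (assumed schedule shape)."""
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    # Position inside the current cosine cycle; it resets to 0 at each restart.
    cycle_pos = (num_cycles * progress) % 1.0
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * cycle_pos)))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(PER_DEVICE_BATCH, GRAD_ACCUM_STEPS))
    for s in (0, 500, 1000, 10_000, 19_999):
        print(s, lr_at_step(s))
```

With more than one cycle, the rate jumps back to the base value at each restart instead of decaying only once.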

Core Capabilities

  • Automatic speech recognition for the Hausa language
  • Supports inference with or without a language model
  • Expects 16kHz input, so 48kHz audio (such as Common Voice recordings) must be resampled to 16kHz first
  • CTC-based architecture that supports batched inference
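
To illustrate what CTC-based decoding without a language model involves, here is a minimal greedy decoding sketch: collapse consecutive repeated frame predictions, drop the blank token, then map the word delimiter to a space. The toy vocabulary and frame sequence are hypothetical and are not the model's actual vocabulary. Using `<pad>` as the blank and `|` as the word delimiter follows the usual wav2vec2 tokenizer convention, which is assumed here.

```python
def ctc_greedy_decode(frame_ids, id_to_token, blank_id=0, word_delimiter="|"):
    """Turn per-frame argmax token ids into text using the CTC collapse rule."""
    collapsed = []
    prev = None
    for idx in frame_ids:
        # Keep a token only when it differs from the previous frame.
        if idx != prev:
            collapsed.append(idx)
        prev = idx
    # Remove blanks after collapsing, so a blank between two identical
    # tokens preserves a genuine double letter.
    chars = [id_to_token[i] for i in collapsed if i != blank_id]
    return "".join(chars).replace(word_delimiter, " ").strip()


# Hypothetical toy vocabulary: id 0 is the CTC blank, "|" separates words.
toy_vocab = {0: "<pad>", 1: "|", 2: "s", 3: "a", 4: "n", 5: "u"}

# Hypothetical per-frame predictions (argmax over the model's logits).
toy_frames = [2, 2, 0, 3, 3, 4, 0, 4, 5, 5, 1, 1]

if __name__ == "__main__":
    print(ctc_greedy_decode(toy_frames, toy_vocab))
```

Language-model-enhanced decoding replaces the per-frame argmax with a beam search that also scores candidate word sequences, which is where the WER improvement reported on this page comes from.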

Frequently Asked Questions

Q: What makes this model unique?

This model targets Hausa speech recognition and builds on the multilingual XLS-R architecture, making it one of relatively few ASR models available for African languages. Language model integration improves its accuracy substantially, reducing WER from 47.821% to 36.295%, a relative reduction of about 24%.
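
As a rough illustration of the figures quoted above, the sketch below computes WER as word-level edit distance divided by the number of reference words, and derives the relative improvement from the two reported scores. The function names and the example strings are hypothetical. The evaluation script actually used for this model is not shown on this page.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Word-level Levenshtein distance, keeping one row of the table at a time.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before, after):
    """Fraction by which an error rate dropped relative to its starting value."""
    return (before - after) / before


WER_WITHOUT_LM = 47.821  # percent, from the model card
WER_WITH_LM = 36.295     # percent, from the model card

if __name__ == "__main__":
    # Placeholder strings: one substitution and one deletion over four words.
    print(word_error_rate("a b c d", "a x c"))
    print(f"{relative_reduction(WER_WITHOUT_LM, WER_WITH_LM):.1%}")
```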

Q: What are the recommended use cases?

The model is suited to automated transcription of Hausa audio, both in academic research and in practical speech-processing applications for Hausa-speaking communities.
