Hausa_xlsr

Maintained By
Akashpb13

Property          Value
Base Model        facebook/wav2vec2-xls-r-300m
Author            Akashpb13
Framework         PyTorch 1.10.0
Evaluation WER    32.99%

What is Hausa_xlsr?

Hausa_xlsr is a speech recognition model for the Hausa language, fine-tuned from Facebook's wav2vec2-xls-r-300m. It achieves a Word Error Rate (WER) of 32.99% on its evaluation set.
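For readers unfamiliar with the metric, here is a minimal sketch of how WER is conventionally computed: the word-level edit distance (substitutions, insertions, deletions) divided by the number of reference words. The function name and the placeholder strings are illustrative assumptions, and the source does not state how the author aggregated the score across the evaluation set.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens, not real Hausa: one substitution plus one deletion
# against a four-word reference.
print(word_error_rate("a b c d", "a x c"))
```

On this scale, a WER of 32.99% corresponds to roughly one word-level error for every three reference words.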

Implementation Details

The model was trained on Common Voice 7.0 data drawn from the train.tsv, dev.tsv, invalidated.tsv, reported.tsv, and other.tsv files. Training ran for 50 epochs with native AMP mixed precision and a cosine learning rate scheduler with warmup steps.

  • Learning rate: 0.000096
  • Batch size: 16 (training and evaluation)
  • Gradient accumulation steps: 2
  • Warmup steps: 500
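The hyperparameters above can be sketched in plain Python to show how they interact. This is an illustration under stated assumptions, not the author's training code: the warmup is assumed to be linear, training is assumed to run on a single device, and `total_steps` is a hypothetical value because the source does not give the number of optimizer steps.

```python
import math

# Values taken from the list above.
LEARNING_RATE = 0.000096
BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 500


def effective_batch_size(per_step: int = BATCH_SIZE,
                         accum: int = GRAD_ACCUM_STEPS) -> int:
    """Samples contributing to each optimizer update (single device assumed)."""
    return per_step * accum


def lr_at_step(step: int, total_steps: int,
               peak_lr: float = LEARNING_RATE,
               warmup_steps: int = WARMUP_STEPS) -> float:
    """Cosine decay with warmup; the linear warmup shape is an assumption."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_steps = 5000  # hypothetical; not stated in the source
print(effective_batch_size())
for step in (0, 250, 500, 2750, 5000):
    print(step, lr_at_step(step, total_steps))
```

With gradient accumulation of 2, each optimizer update sees 16 × 2 = 32 samples. The learning rate climbs to its peak of 0.000096 at step 500 and then decays toward zero along a half cosine.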

Core Capabilities

  • Hausa speech recognition with 32.99% WER
  • Trained on Common Voice 7.0 Hausa data
  • Built with PyTorch 1.10.0
  • Trained with native AMP mixed precision

Frequently Asked Questions

Q: What makes this model unique?

The model specializes in Hausa speech recognition. Its training data was filtered so that only entries with more upvotes than downvotes were used. Over the course of training, WER fell from an initial 100% to a final 32.99%.
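The vote-based filter described here could look roughly like the sketch below. It assumes the Common Voice TSV layout with `up_votes` and `down_votes` columns; the function name and the sample rows are hypothetical placeholders, and the author's actual preprocessing script is not shown in the source.

```python
import csv
import io


def keep_upvoted(tsv_text: str) -> list:
    """Return rows whose up_votes count is strictly greater than down_votes."""
    reader = csv.DictReader(
        io.StringIO(tsv_text),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # sentences may contain literal quote marks
    )
    return [
        row for row in reader
        if int(row["up_votes"]) > int(row["down_votes"])
    ]


# Hypothetical sample data; real files carry more columns and real sentences.
SAMPLE_TSV = (
    "path\tsentence\tup_votes\tdown_votes\n"
    "clip_1.mp3\tfirst placeholder sentence\t2\t0\n"
    "clip_2.mp3\tsecond placeholder sentence\t1\t1\n"
    "clip_3.mp3\tthird placeholder sentence\t0\t2\n"
)

kept = keep_upvoted(SAMPLE_TSV)
print([row["path"] for row in kept])
```

Note that a strict comparison drops clips with tied votes, including clips that have received no votes at all.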

Q: What are the recommended use cases?

This model is suited to transcribing Hausa audio. It is likely to perform best when the audio quality resembles that of the Common Voice recordings it was trained on.
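A transcription call might be sketched as follows, using the `pipeline` helper from the Hugging Face `transformers` library. The hub identifier `Akashpb13/Hausa_xlsr` is an assumption inferred from the author and model names on this page, and `transcribe` is a hypothetical helper name. Running it requires `transformers`, PyTorch, network access to download the weights, and ffmpeg for decoding audio files.

```python
# Assumed hub identifier, inferred from the author and model names above.
MODEL_ID = "Akashpb13/Hausa_xlsr"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file with an automatic-speech-recognition pipeline."""
    # Imported here so the module can be loaded without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)  # returns a dict with a "text" field
    return result["text"]
```

Calling `transcribe("clip.wav")` with a path to a local Hausa recording (the file name is a placeholder) would return the predicted text. For many files, it would be cheaper to build the pipeline once and reuse it rather than reloading the model on each call.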
