Hausa_xlsr

Property	Value
Base Model	facebook/wav2vec2-xls-r-300m
Author	Akashpb13
Framework	PyTorch 1.10.0
Evaluation WER	32.99%

What is Hausa_xlsr?

Hausa_xlsr is a speech recognition model for the Hausa language, fine-tuned from Facebook's wav2vec2-xls-r-300m. It achieves a Word Error Rate (WER) of 32.99% on its evaluation set.
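
For readers unfamiliar with the metric, WER is the word-level edit distance between the reference transcript and the model output (substitutions + deletions + insertions), divided by the number of reference words. The following is a minimal sketch of that definition, not the evaluation script used for this model; the function name and the placeholder transcripts are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens, not real Hausa transcripts.
example_wer = word_error_rate("a b c", "a x c")  # one substitution in three words
```

Because insertions count as errors, WER can exceed 100% when the output is much longer than the reference.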

Implementation Details

The model was trained on Hausa data from Common Voice 7.0, drawing on the train.tsv, dev.tsv, invalidated.tsv, reported.tsv, and other.tsv files. Training ran for 50 epochs with native AMP mixed precision and a cosine learning rate scheduler with warmup. The main hyperparameters were:

Learning rate: 0.000096
Batch size: 16 (training and evaluation)
Gradient accumulation steps: 2
Warmup steps: 500
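
To make these settings concrete, the sketch below computes the effective batch size and the learning rate at a given step. It rests on several assumptions not confirmed by the source: that the batch size of 16 is per device, that the scheduler uses the common "linear warmup, then half-cosine decay to zero" shape, and that `total_steps` is supplied by the caller (the total step count is not stated here, so the value in the usage line is hypothetical).

```python
import math

# Values taken from the hyperparameter list above.
LEARNING_RATE = 0.000096
BATCH_SIZE = 16
GRADIENT_ACCUMULATION_STEPS = 2
WARMUP_STEPS = 500


def effective_batch_size(num_devices: int = 1) -> int:
    """Samples contributing to one optimizer update.

    Assumes BATCH_SIZE is per device; num_devices is an assumption,
    since the hardware setup is not stated.
    """
    return BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * num_devices


def cosine_lr_with_warmup(
    step: int,
    total_steps: int,
    peak_lr: float = LEARNING_RATE,
    warmup_steps: int = WARMUP_STEPS,
) -> float:
    """Linear warmup to peak_lr, then half-cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Hypothetical total step count, chosen only for illustration.
lr_at_peak = cosine_lr_with_warmup(step=500, total_steps=10_000)
```

With one device, each optimizer update would see 16 x 2 = 32 samples under these assumptions.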

Core Capabilities

Hausa speech recognition with 32.99% WER on the evaluation set
Trained on several Common Voice 7.0 files (train, dev, invalidated, reported, other)
Built with PyTorch 1.10.0
Trained with native AMP mixed precision

Frequently Asked Questions

Q: What makes this model unique?

The model targets Hausa speech recognition and was trained on a filtered dataset: only entries with more upvotes than downvotes were kept. Over the course of training, the WER fell from an initial 100% to a final 32.99%.
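
The vote filter can be sketched as follows. This is an illustration rather than the author's preprocessing code: it assumes the `up_votes` and `down_votes` column names found in Common Voice TSV files, and the sample rows (clip names and sentences) are made up.

```python
import csv
import io


def keep_net_upvoted(rows):
    """Keep only rows with strictly more upvotes than downvotes."""
    return [
        row for row in rows
        if int(row["up_votes"]) > int(row["down_votes"])
    ]


# Made-up sample in the tab-separated layout; sentences are placeholders.
SAMPLE_TSV = (
    "path\tsentence\tup_votes\tdown_votes\n"
    "clip_a.mp3\tplaceholder one\t2\t0\n"
    "clip_b.mp3\tplaceholder two\t1\t1\n"
    "clip_c.mp3\tplaceholder three\t0\t2\n"
    "clip_d.mp3\tplaceholder four\t3\t1\n"
)

reader = csv.DictReader(
    io.StringIO(SAMPLE_TSV), delimiter="\t", quoting=csv.QUOTE_NONE
)
filtered = keep_net_upvoted(list(reader))
kept_paths = [row["path"] for row in filtered]
```

Ties (such as `clip_b.mp3` above) are dropped, since the criterion is strictly more upvotes than downvotes.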

Q: What are the recommended use cases?

This model is intended for Hausa speech recognition, particularly applications that transcribe Hausa audio. It is best suited to audio whose quality resembles Common Voice recordings. At a WER of 32.99%, expect roughly one word error for every three reference words.
