wav2vec2-xls-r-1b-portuguese

Maintained By
jonatasgrosman

Property            Value
License             Apache 2.0
Author              jonatasgrosman
Downloads           359,756
Base Architecture   XLS-R Wav2Vec2

What is wav2vec2-xls-r-1b-portuguese?

This is an automatic speech recognition (ASR) model fine-tuned for Portuguese. It is built on Facebook's wav2vec2-xls-r-1b pretrained model and was fine-tuned on several datasets: Common Voice 8.0, CORAA, Multilingual TEDx, and Multilingual LibriSpeech. The model reports a Word Error Rate (WER) of 8.7%, which drops to 6.04% when decoding is combined with a language model.

Implementation Details

The model expects audio sampled at 16 kHz and uses the XLS-R architecture for acoustic modeling. It was fine-tuned for Portuguese speech recognition with the HuggingSound tool.
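The following is a minimal sketch of transcription with HuggingSound, not a definitive implementation. It assumes the `huggingsound` package is installed and that the model is published on the Hugging Face Hub under the ID `jonatasgrosman/wav2vec2-xls-r-1b-portuguese` (inferred from the author and model name above). The helpers `wav_sample_rate`, `needs_resampling`, and `transcribe` are hypothetical names introduced for illustration. The sample-rate check reads only the WAV header, and it is mainly useful in custom pipelines that do not resample audio on their own.

```python
import wave

# Assumed Hugging Face Hub ID, inferred from the author and model name.
MODEL_ID = "jonatasgrosman/wav2vec2-xls-r-1b-portuguese"
EXPECTED_RATE = 16_000  # the model expects 16 kHz input


def wav_sample_rate(path):
    """Return the sample rate (Hz) stored in a WAV file header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def needs_resampling(path, expected=EXPECTED_RATE):
    """Return True if the WAV file is not already at the expected rate."""
    return wav_sample_rate(path) != expected


def transcribe(paths):
    """Transcribe audio files with HuggingSound.

    The import is done lazily because loading the model downloads
    a large checkpoint on first use.
    """
    from huggingsound import SpeechRecognitionModel

    model = SpeechRecognitionModel(MODEL_ID)
    results = model.transcribe(paths)
    # Each result is expected to be a dict with a "transcription" key.
    return [result["transcription"] for result in results]
```

A call such as `transcribe(["/path/to/file.wav"])` would then return one string per input file; the path here is a placeholder.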

  • Supports both standard inference and language-model-enhanced transcription
  • Reaches a 2.55% Character Error Rate (CER) on test data
  • Reaches an 18.8% WER on the more challenging Robust Speech Event test data
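To make the WER and CER figures above concrete, here is a small sketch of how these metrics are commonly defined: the edit distance between reference and hypothesis, divided by the reference length, counted in words for WER and in characters for CER. The function names are illustrative, and the published scores may rely on different text normalization (casing, punctuation), so this sketch is not meant to reproduce them. References are assumed to be non-empty.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edits divided by reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


reference = "o gato subiu no telhado"
hypothesis = "o gato subiu no telhada"
print(wer(reference, hypothesis))  # one wrong word out of five
print(cer(reference, hypothesis))  # one wrong character out of 23
```

A single wrong character costs a whole word in WER but only one character in CER, which is why CER values are typically much lower than WER values for the same output.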

Core Capabilities

  • High-accuracy Portuguese speech recognition
  • Batch processing of audio files
  • Support for various audio formats
  • Easy integration with both HuggingSound and custom inference scripts
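Batch processing can be sketched independently of the inference backend. The example below is an illustration under stated assumptions: `chunked` and `transcribe_in_batches` are hypothetical helpers, and `transcribe_fn` stands for any callable that takes a list of audio paths and returns one transcription string per path (for example, a thin wrapper around HuggingSound or a custom inference script).

```python
def chunked(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def transcribe_in_batches(paths, transcribe_fn, batch_size=8):
    """Run `transcribe_fn` on each batch and map every path to its text."""
    results = {}
    for batch in chunked(paths, batch_size):
        texts = transcribe_fn(batch)
        if len(texts) != len(batch):
            raise ValueError("transcribe_fn must return one text per path")
        results.update(zip(batch, texts))
    return results
```

Keeping batches small bounds memory use per call, which matters for a model of this size; the default of 8 is an arbitrary starting point rather than a recommendation from the source.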

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its training on diverse Portuguese speech datasets and its low reported error rates, which make it robust for real-world applications. Optional language model decoding adds flexibility for different use cases.

Q: What are the recommended use cases?

The model is ideal for Portuguese speech transcription tasks, particularly in scenarios requiring high accuracy. It's suitable for applications like automated transcription services, subtitle generation, and voice command systems for Portuguese speakers.
