wav2vec2-large-xlsr-catala

Maintained By
softcatala

Property        Value
License         Apache 2.0
Downloads       40,954
Primary Task    Automatic Speech Recognition
Language        Catalan

What is wav2vec2-large-xlsr-catala?

wav2vec2-large-xlsr-catala is a speech recognition model fine-tuned for the Catalan language. It is based on Facebook's wav2vec2-large-xlsr-53 model and was fine-tuned on both the Common Voice and ParlamentParla datasets to provide speech-to-text for Catalan.

Implementation Details

The model reports the Word Error Rates (WER) listed below across several test sets. It uses the wav2vec2 architecture and requires audio input sampled at 16 kHz.

  • 6.92% WER on the combined test split
  • 12.99% WER on the Google Crowdsourced Corpus
  • 13.23% WER on the "La llegenda de Sant Jordi" audiobook
  • Implemented in PyTorch
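
For readers unfamiliar with the metric, WER is commonly defined as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch of that definition. The function name `word_error_rate` is illustrative, and the text normalization and corpus-level aggregation behind the figures above are not described here, so this sketch should not be expected to reproduce them exactly.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One word ("a") is missing from a four-word reference.
print(word_error_rate("bon dia a tothom", "bon dia tothom"))  # 0.25
```

Evaluation scripts usually sum the errors over a whole test set before dividing, rather than averaging per-utterance scores as one might with this helper.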

Core Capabilities

  • Direct speech-to-text transcription without a language model
  • Expects 16 kHz audio input
  • Robust performance across various Catalan speech contexts
  • Suitable for both formal and informal speech recognition tasks
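
Transcription without a language model is typically done with greedy CTC decoding: pick the most likely token for each audio frame, collapse consecutive repeats, and drop the blank token. The sketch below illustrates that idea in plain Python. The toy vocabulary, the use of index 0 as the blank, and `|` as the word delimiter are assumptions chosen for illustration (they mirror common wav2vec2 tokenizer conventions), not values taken from this model.

```python
def ctc_greedy_decode(logits, vocab, blank_id=0, word_delimiter="|"):
    """Greedy CTC decoding over a list of per-frame score lists."""
    # Most likely token id for each frame.
    best_ids = [max(range(len(frame)), key=frame.__getitem__) for frame in logits]

    chars = []
    previous = None
    for token_id in best_ids:
        # Keep a token only if it starts a new run and is not the blank.
        if token_id != previous and token_id != blank_id:
            chars.append(vocab[token_id])
        previous = token_id
    return "".join(chars).replace(word_delimiter, " ").strip()


# Hypothetical five-token vocabulary and one-hot "logits" for eight frames.
toy_vocab = ["<pad>", "|", "b", "o", "n"]
toy_frame_ids = [2, 2, 0, 3, 3, 4, 0, 1]
toy_logits = [
    [1.0 if k == i else 0.0 for k in range(len(toy_vocab))]
    for i in toy_frame_ids
]
print(ctc_greedy_decode(toy_logits, toy_vocab))  # bon
```

A blank between two identical tokens is what allows doubled letters to survive the collapse step, which is why the blank is removed only after repeats are merged.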

Frequently Asked Questions

Q: What makes this model unique?

This model focuses specifically on Catalan. It combines two datasets, Common Voice and ParlamentParla, and reports a WER of 6.92% on the combined test split.

Q: What are the recommended use cases?

The model is suited to Catalan speech transcription tasks, including parliamentary speech processing, audiobook transcription, and general-purpose speech recognition where Catalan language support is required. Input audio must be sampled at 16 kHz, so recordings at other sample rates need to be resampled first.
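
To make the 16 kHz requirement concrete, here is a minimal sketch of resampling a mono signal by linear interpolation. The function name `resample_linear` is hypothetical, and linear interpolation applies no anti-aliasing filter, so for real recordings a dedicated audio library resampler is likely the better choice. The sketch only shows what changing the sample rate involves.

```python
def resample_linear(samples, source_rate, target_rate=16_000):
    """Resample a mono signal (list of floats) by linear interpolation."""
    if source_rate <= 0 or target_rate <= 0:
        raise ValueError("sample rates must be positive")
    if not samples:
        return []
    if source_rate == target_rate:
        return list(samples)

    n_out = max(1, round(len(samples) * target_rate / source_rate))
    step = source_rate / target_rate  # source samples advanced per output sample
    last = len(samples) - 1

    out = []
    for i in range(n_out):
        # Position in the source signal, clamped to the final sample.
        pos = min(i * step, last)
        left = int(pos)
        right = min(left + 1, last)
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


# One second of silence at 44.1 kHz becomes 16,000 samples.
print(len(resample_linear([0.0] * 44_100, 44_100)))  # 16000
```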
