wav2vec2-large-xlsr-catala

softcatala

A Catalan speech recognition model fine-tuned from wav2vec2-large-xlsr-53, with a reported word error rate (WER) of 6.92% on its combined test split. It expects 16 kHz audio input.

Property       Value
License        Apache 2.0
Downloads      40,954
Primary Task   Automatic Speech Recognition
Language       Catalan

What is wav2vec2-large-xlsr-catala?

wav2vec2-large-xlsr-catala is a speech recognition model for the Catalan language. It was fine-tuned from Facebook's wav2vec2-large-xlsr-53 checkpoint on the Common Voice and ParlamentParla datasets to provide speech-to-text for Catalan speakers.

Implementation Details

The model is built on the wav2vec2 architecture and requires audio sampled at 16 kHz. Its reported Word Error Rate (WER) varies with the test set, from under 7% on the combined test split to about 13% on the other two corpora listed:

  • 6.92% WER on the combined test split
  • 12.99% WER on the Google Crowdsourced Corpus
  • 13.23% WER on the "La llegenda de Sant Jordi" audiobook
  • Implemented in PyTorch
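
For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions, insertions, and deletions) between a reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal, standard-library illustration of that definition; the function name and the Catalan example sentences are hypothetical, and how the figures above were computed (including any text normalization) is not described here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one of four reference words was dropped.
    print(word_error_rate("bon dia a tothom", "bon dia tothom"))
```

A WER of 6.92% therefore corresponds to roughly seven word-level errors per hundred reference words.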

Core Capabilities

  • Direct speech-to-text transcription without language model
  • Expects audio sampled at 16 kHz
  • Evaluated on crowdsourced and audiobook speech in addition to its combined test split
  • Suitable for both formal and informal speech recognition tasks
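
Transcription without a language model usually means greedy decoding of the network's per-frame predictions. The sketch below assumes the model uses a CTC output head, as wav2vec2 fine-tunes for speech recognition commonly do; this page does not confirm it. It shows the collapse step only: merge consecutive repeated ids, drop the blank id, and turn the word delimiter into a space. The vocabulary, ids, and function name are hypothetical.

```python
def ctc_greedy_decode(frame_ids, id_to_token, blank_id=0, word_delimiter="|"):
    """Collapse per-frame argmax ids into text using the CTC rule."""
    chars = []
    previous = None
    for token_id in frame_ids:
        # Emit a token only when it differs from the previous frame
        # and is not the blank symbol.
        if token_id != previous and token_id != blank_id:
            chars.append(id_to_token[token_id])
        previous = token_id
    return "".join(chars).replace(word_delimiter, " ").strip()


if __name__ == "__main__":
    # Hypothetical vocabulary; id 0 plays the role of the CTC blank.
    vocab = {0: "<pad>", 1: "|", 2: "b", 3: "o", 4: "n", 5: "d", 6: "i", 7: "a"}
    frames = [2, 2, 0, 3, 3, 4, 0, 1, 1, 5, 6, 0, 7, 7]
    print(ctc_greedy_decode(frames, vocab))
```

Because no language model rescoring is involved, the output depends only on the acoustic model's most likely token per frame, which keeps decoding fast but leaves spelling errors uncorrected.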

Frequently Asked Questions

Q: What makes this model unique?

The model targets Catalan specifically. It was fine-tuned on two Catalan corpora, Common Voice and ParlamentParla, and reports a WER of 6.92% on the combined test split.

Q: What are the recommended use cases?

The model suits Catalan speech transcription tasks, including parliamentary speech processing, audiobook transcription, and general-purpose speech recognition applications that need Catalan support. Input audio must be sampled at 16 kHz, so recordings at other rates should be resampled first.
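
Since the model expects 16 kHz input, audio recorded at another rate (for example 44.1 or 48 kHz) has to be converted before transcription. The sketch below illustrates the idea with plain linear interpolation on a mono list of samples. It is a hypothetical helper that applies no anti-aliasing filter; a production pipeline would normally rely on a dedicated audio library's band-limited resampler instead.

```python
def resample_linear(samples, src_rate, dst_rate=16_000):
    """Resample a mono signal by linear interpolation (no anti-aliasing)."""
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sample rates must be positive")
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)

    # Output length that covers the same duration at the new rate.
    n_out = int(round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # source samples advanced per output sample
    last = len(samples) - 1

    out = []
    for i in range(n_out):
        pos = i * step
        left = min(int(pos), last)
        right = min(left + 1, last)
        frac = pos - left
        # Weighted average of the two neighbouring source samples.
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


if __name__ == "__main__":
    one_second_at_48k = [float(i) for i in range(48_000)]
    resampled = resample_linear(one_second_at_48k, 48_000)
    print(len(resampled))  # one second of audio at 16 kHz
```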
