wav2vec2-xls-r-300m-sk-cv8

Maintained By
comodoro


Property | Value
Base Model | facebook/wav2vec2-xls-r-300m
Task | Speech Recognition (Slovak)
Performance | WER: 49.57%, CER: 13.33%
Author | comodoro
Model Link | Hugging Face
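
For readers unfamiliar with the two metrics in the table, the following is a minimal sketch of how word error rate (WER) and character error rate (CER) are commonly computed: the edit distance between reference and hypothesis, divided by the reference length. The function names and the sample sentences are illustrative assumptions, and the evaluation behind the reported figures may apply text normalization that is not shown here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from the reference
                            curr[j - 1] + 1,      # insertion into the reference
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Illustrative sentences only (not taken from the evaluation set):
# one wrong word out of five, caused by a single wrong character.
reference = "dobrý deň ako sa máte"
hypothesis = "dobrý den ako sa máte"
print(f"WER: {wer(reference, hypothesis):.2%}")
print(f"CER: {cer(reference, hypothesis):.2%}")
```

A single wrong character turns the whole word into an error, which is one reason a CER can sit well below the WER measured on the same data.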

What is wav2vec2-xls-r-300m-sk-cv8?

This is a speech recognition model for the Slovak language, fine-tuned from Facebook's pretrained wav2vec2-xls-r-300m checkpoint on the Common Voice 8.0 dataset. It provides speech-to-text for Slovak language applications, with a reported WER of 49.57% and CER of 13.33%.

Implementation Details

The model uses the wav2vec2 architecture with a CTC (Connectionist Temporal Classification) output layer. It operates on 16 kHz audio input and was trained with native AMP mixed-precision training.
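
To illustrate what CTC-based decoding means in practice, here is a minimal sketch of greedy CTC decoding: take the most likely token per audio frame, collapse consecutive repeats, then drop the blank token. The toy vocabulary, the frame sequence, and the choice of id 0 as the blank are assumptions for illustration, not the model's real vocabulary.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, id_to_char, blank_id=0):
    """Collapse repeated frame predictions, then remove blank tokens."""
    # Step 1: merge runs of identical ids (one character may span many frames).
    collapsed = [token for token, _ in groupby(frame_ids)]
    # Step 2: drop blanks, which only mark "no new character" between frames.
    return "".join(id_to_char[t] for t in collapsed if t != blank_id)


# Hypothetical vocabulary and per-frame argmax ids.
toy_vocab = {0: "<blank>", 1: "a", 2: "h", 3: "o", 4: "j"}
frames = [0, 1, 1, 0, 2, 2, 2, 3, 0, 0, 4, 4]
print(ctc_greedy_decode(frames, toy_vocab))  # ahoj
```

The blank token is what lets CTC emit a doubled letter: the frames `[1, 0, 1]` decode to "aa", whereas `[1, 1]` decode to a single "a".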

  • Learning Rate: 7e-4 with linear scheduler and 500 warmup steps
  • Batch Size: 32 (640 effective with gradient accumulation)
  • Training Duration: 50 epochs
  • Optimizer: Adam (β1=0.9, β2=0.999, ε=1e-08)
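
The hyperparameters above can be sketched numerically. The snippet below derives the number of gradient accumulation steps implied by the two batch sizes (assuming a single training device) and models a linear schedule with warmup: the learning rate rises linearly to 7e-4 over 500 steps, then decays linearly to zero. The total step count is not stated in the source, so `EXAMPLE_TOTAL_STEPS` is a made-up value, and the exact scheduler implementation used in training is an assumption.

```python
PEAK_LR = 7e-4        # from the list above
WARMUP_STEPS = 500    # from the list above
BATCH_SIZE = 32       # from the list above
TOTAL_BATCH = 640     # from the list above

# Assumes one device, so the whole factor comes from gradient accumulation.
ACCUMULATION_STEPS = TOTAL_BATCH // BATCH_SIZE

# Hypothetical value: the real number of optimizer steps is not given.
EXAMPLE_TOTAL_STEPS = 2000


def linear_lr(step, total_steps, peak_lr=PEAK_LR, warmup_steps=WARMUP_STEPS):
    """Linear warmup to peak_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


print("accumulation steps:", ACCUMULATION_STEPS)
for step in (0, 250, 500, 1250, 2000):
    print(step, linear_lr(step, EXAMPLE_TOTAL_STEPS))
```

With accumulation, gradients from 20 batches of 32 samples are summed before each optimizer update, which gives the 640-sample effective batch without needing the memory for it at once.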

Core Capabilities

  • Direct speech-to-text transcription without requiring a language model
  • Handles 16 kHz audio input (audio at other sample rates is resampled to 16 kHz before inference)
  • Batch processing support with attention masking
  • Optimized for Slovak language recognition
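
The capabilities above can be combined into one inference routine. This is a minimal sketch, assuming the `transformers`, `torch`, and `torchaudio` packages are installed and that the model is published on the Hugging Face Hub under the id formed from the author and model name. That id, the helper name `transcribe`, and the file names in the usage comment are assumptions.

```python
# Assumed Hub id, built from the author and model name given in this page.
MODEL_ID = "comodoro/wav2vec2-xls-r-300m-sk-cv8"
TARGET_SAMPLE_RATE = 16_000  # the model expects 16 kHz input


def transcribe(paths, model_id=MODEL_ID):
    """Transcribe a batch of audio files with greedy CTC decoding (no language model)."""
    # Heavy dependencies are imported lazily, only when transcription is requested.
    import torch
    import torchaudio
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    clips = []
    for path in paths:
        waveform, sample_rate = torchaudio.load(path)  # shape: (channels, frames)
        waveform = waveform.mean(dim=0)                # mix down to mono
        if sample_rate != TARGET_SAMPLE_RATE:
            waveform = torchaudio.functional.resample(
                waveform, sample_rate, TARGET_SAMPLE_RATE
            )
        clips.append(waveform.numpy())

    # padding=True pads clips to a common length; where the processor returns an
    # attention mask, it is passed to the model so padded frames are ignored.
    inputs = processor(
        clips, sampling_rate=TARGET_SAMPLE_RATE, return_tensors="pt", padding=True
    )
    with torch.no_grad():
        logits = model(**inputs).logits

    predicted_ids = torch.argmax(logits, dim=-1)  # most likely token per frame
    return processor.batch_decode(predicted_ids)  # CTC collapse + detokenize


# Usage (hypothetical file names):
# texts = transcribe(["clip_a.wav", "clip_b.wav"])
```

Loading the processor and model inside the function keeps the sketch short; an application that transcribes repeatedly would load them once and reuse them.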

Frequently Asked Questions

Q: What makes this model unique?

This model targets Slovak speech recognition specifically: it fine-tunes the multilingual wav2vec2-xls-r-300m checkpoint on the Common Voice 8.0 dataset. It can be used directly, without a separate language model, which makes it practical for Slovak speech recognition tasks.

Q: What are the recommended use cases?

The model is intended for Slovak speech transcription, in either real-time or batch processing of audio content. Suitable applications include voice assistants, transcription services, and audio content analysis tools focused on Slovak language content.
