wav2vec2-xls-r-300m-sk-cv8

comodoro

A speech recognition model fine-tuned on Slovak Common Voice 8.0, achieving 49.57% WER and 13.33% CER, based on the wav2vec2-xls-r-300m architecture.

Property      Value
Base Model    facebook/wav2vec2-xls-r-300m
Task          Speech Recognition (Slovak)
Performance   WER: 49.57%, CER: 13.33%
Author        comodoro
Model Link    Hugging Face
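
The two metrics in the table are both edit-distance ratios: WER counts word-level errors relative to the number of reference words, and CER does the same at character level. The following is a minimal sketch of how such scores are commonly computed, not the evaluation script used for this model; the function names are illustrative, and text normalization (casing, punctuation) is assumed to have happened beforehand.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the reference
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)
```

With these definitions, one wrong word in a four-word reference gives a WER of 0.25, even if only a single character of that word differs. This is why CER is usually much lower than WER for the same output.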

What is wav2vec2-xls-r-300m-sk-cv8?

This is a speech recognition model for the Slovak language, fine-tuned from Facebook's wav2vec2-xls-r-300m checkpoint on the Common Voice 8.0 dataset. It provides speech-to-text for Slovak language applications, with a reported 49.57% WER and 13.33% CER.

Implementation Details

The model takes a CTC-based approach to speech recognition using the wav2vec2 architecture. It operates on 16 kHz audio input and was trained with native AMP mixed-precision training, using the following hyperparameters:

  • Learning Rate: 7e-4 with linear scheduler and 500 warmup steps
  • Batch Size: 32 per step (640 effective with gradient accumulation)
  • Training Duration: 50 epochs
  • Optimizer: Adam (β1=0.9, β2=0.999, ε=1e-08)
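
To make the hyperparameters above concrete, here is a minimal sketch of the effective batch size arithmetic and of a linear schedule with warmup. It assumes a single training device (so 640 / 32 = 20 accumulation steps) and follows the common shape of a linear warmup followed by linear decay to zero. The total number of optimizer steps is not stated in the source, so `total_steps` is a hypothetical parameter.

```python
PEAK_LR = 7e-4          # from the list above
WARMUP_STEPS = 500      # from the list above
PER_STEP_BATCH = 32     # from the list above
EFFECTIVE_BATCH = 640   # from the list above

# Assumption: one device, so accumulation alone accounts for the difference.
GRAD_ACCUM_STEPS = EFFECTIVE_BATCH // PER_STEP_BATCH


def linear_lr(step, total_steps, peak_lr=PEAK_LR, warmup_steps=WARMUP_STEPS):
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)
```

For example, with a hypothetical `total_steps` of 2000, the rate climbs from 0 to 7e-4 over the first 500 steps and then falls linearly back to 0 at step 2000.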

Core Capabilities

  • Direct speech-to-text transcription without requiring a language model
  • Handles 16kHz audio input (with included resampling capability)
  • Batch processing support with attention masking
  • Optimized for Slovak language recognition
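
The input-handling points above can be illustrated in plain Python. This is a simplified sketch rather than the model's actual preprocessing: `resample_linear` is a naive linear-interpolation resampler with no anti-aliasing filter, and `pad_batch` shows how variable-length clips are typically padded to a common length with a mask marking real samples as 1 and padding as 0. Both function names are hypothetical.

```python
TARGET_RATE = 16_000  # the model expects 16 kHz input


def resample_linear(samples, src_rate, dst_rate=TARGET_RATE):
    """Naive resampling by linear interpolation (illustration only)."""
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate        # position in the source signal
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def pad_batch(waveforms, pad_value=0.0):
    """Pad clips to the longest one and build the matching attention mask."""
    max_len = max(len(w) for w in waveforms)
    input_values, attention_mask = [], []
    for w in waveforms:
        pad = max_len - len(w)
        input_values.append(list(w) + [pad_value] * pad)
        attention_mask.append([1] * len(w) + [0] * pad)
    return input_values, attention_mask
```

In practice a dedicated audio library would handle resampling with proper filtering; the sketch only shows where each step sits in the pipeline.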

Frequently Asked Questions

Q: What makes this model unique?

This model targets Slovak speech recognition specifically, fine-tuning the wav2vec2-xls-r-300m architecture on the Common Voice 8.0 dataset. It can be used directly, without a separate language model, which makes it practical for Slovak speech recognition tasks.
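
Using a CTC model without a language model usually means greedy decoding: pick the highest-scoring token per frame, collapse consecutive repeats, then drop the blank token. The sketch below shows that procedure on a toy example. The vocabulary and the choice of index 0 as the blank are assumptions for illustration and do not reflect this model's actual tokenizer.

```python
from itertools import groupby


def ctc_greedy_decode(frame_scores, vocab, blank_id=0):
    """Greedy CTC decoding over per-frame score vectors."""
    # 1. Best token index for each frame.
    best = [max(range(len(scores)), key=scores.__getitem__)
            for scores in frame_scores]
    # 2. Collapse runs of the same index into one.
    collapsed = [token for token, _ in groupby(best)]
    # 3. Remove blanks and map the rest to characters.
    return "".join(vocab[t] for t in collapsed if t != blank_id)


# Hypothetical toy vocabulary; index 0 plays the role of the CTC blank.
TOY_VOCAB = ["<blank>", "a", "h", "o", "j"]
```

Because repeats are collapsed before blanks are removed, a doubled letter can only be produced when a blank frame separates the two occurrences.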

Q: What are the recommended use cases?

The model is intended for Slovak speech transcription, particularly in applications that process audio in real time or in batches. Typical examples are voice assistants, transcription services, and audio content analysis tools focused on Slovak language content.
