wav2vec2-xls-r-300m-ftspeech

saattrupdan

Danish speech recognition model fine-tuned on the FTSpeech dataset (1,800 hours), achieving 17.91% WER on Danish Common Voice 8.0. Based on XLS-R-300m with 315M parameters.

Property	Value
Parameter Count	315M
Model Type	Speech Recognition
Base Architecture	wav2vec2-xls-r-300m
License	Danish Parliament License
Language	Danish

What is wav2vec2-xls-r-300m-ftspeech?

This is a Danish speech recognition model fine-tuned on the FTSpeech dataset, which contains 1,800 hours of transcribed speeches from the Danish parliament. It is built on Facebook's wav2vec2-xls-r-300m checkpoint and targets Danish automatic speech recognition (ASR).

Implementation Details

The model uses the XLS-R architecture with 315M parameters, and its weights are stored as F32 tensors. It reaches a Word Error Rate (WER) of 17.91% on Danish Common Voice 8.0 and 13.84% on the Alvenir test set when decoding with a 5-gram language model.
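To make the WER figures above concrete, here is a minimal sketch of how a word error rate is commonly computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the sample sentences are illustrative assumptions. The exact text normalization used for the reported scores is not stated here, so this is not a reproduction of the evaluation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one of five reference words is missing.
print(word_error_rate("det er en god dag", "det er en dag"))  # 0.2
```

A WER of 17.91% therefore corresponds to roughly 18 word-level errors per 100 reference words.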

Fine-tuned on high-quality parliamentary speech data
Supports both standalone and language model-enhanced inference
Optimized for Danish language processing
Usable through the Transformers library
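The difference between standalone and language model-enhanced inference comes down to how the model's per-frame outputs are decoded. wav2vec2 models fine-tuned for speech recognition are trained with CTC, so standalone decoding can be sketched as: take the most likely token per frame, collapse consecutive repeats, then drop the blank token. The toy vocabulary, the blank id of 0, and the `|` word delimiter below are assumptions for illustration, not values read from this model's tokenizer.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, id_to_token, blank_id=0, word_delimiter="|"):
    """Greedy CTC decoding over a sequence of per-frame argmax token ids."""
    # Step 1: collapse runs of identical ids (one token can span many frames).
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    # Step 2: remove the blank token, which separates genuine repeats.
    tokens = [id_to_token[i] for i in collapsed if i != blank_id]
    # Step 3: join characters and turn the word delimiter into spaces.
    return "".join(tokens).replace(word_delimiter, " ").strip()


# Hypothetical toy vocabulary and frame-level predictions.
vocab = {0: "<pad>", 1: "|", 2: "h", 3: "e", 4: "j"}
frames = [2, 2, 0, 3, 3, 0, 4, 1, 1, 2, 0, 3, 4]
print(ctc_greedy_decode(frames, vocab))  # hej hej
```

Language model-enhanced inference replaces the per-frame argmax with a beam search whose candidates are also scored by an n-gram language model (a 5-gram model for the scores reported here). That step is not shown in this sketch.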

Core Capabilities

Automatic speech recognition for the Danish language
Accurate transcription of formal speech
Compatible with the PyTorch framework
Supports real-time inference endpoints
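As a rough illustration of these capabilities, the sketch below transcribes an audio file with the Transformers `pipeline` API on PyTorch. The Hub identifier is an assumption formed from the author and model name at the top of this page, and `transcribe` is a hypothetical helper name. Running it requires `transformers`, `torch`, and ffmpeg for decoding audio files.

```python
# Assumed Hub identifier, formed from the author and model name on this page.
MODEL_ID = "saattrupdan/wav2vec2-xls-r-300m-ftspeech"


def transcribe(audio_path: str) -> str:
    """Transcribe one Danish audio file and return the recognized text."""
    # Imported lazily so the module can be loaded without the dependency.
    from transformers import pipeline

    # Builds an ASR pipeline; the model is downloaded on first use.
    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    # For file paths the pipeline decodes the audio and resamples it to
    # the sampling rate the model expects.
    result = asr(audio_path)
    return result["text"]
```

Calling `transcribe("tale.wav")`, where `tale.wav` is a placeholder for a local recording, would return the transcription as a string. In a long-running service the pipeline should be created once and reused rather than rebuilt on every call.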

Frequently Asked Questions

Q: What makes this model unique?

What sets this model apart is its training on Danish parliamentary speeches, which makes it particularly well suited to formal Danish speech. It pairs the XLS-R architecture with 1,800 hours of in-domain training data, yielding the WER figures reported under Implementation Details.

Q: What are the recommended use cases?

The model is ideal for transcribing Danish speech in formal contexts, particularly parliamentary or official speeches. It can be used in both academic and professional settings where accurate Danish language transcription is required.