whisper-tiny-russian-dysarthria

Property	Value
Base Model	OpenAI Whisper-tiny
Task	Russian Speech Recognition
WER Score	9.1029%
Author	qymyz
Model URL	HuggingFace

What is whisper-tiny-russian-dysarthria?

This model is a fine-tuned version of OpenAI's Whisper-tiny, adapted for Russian dysarthric speech. Dysarthria is a motor speech disorder that impairs articulation, and the model aims to improve recognition accuracy for Russian speakers affected by it.

Implementation Details

The model was trained with the Adam optimizer at a learning rate of 1e-05 under a linear schedule. Training ran for 3000 steps with a batch size of 16 and used native AMP (automatic mixed precision).

Learning rate: 1e-05 with linear scheduler and 500 warmup steps
Batch size: 16 for both training and evaluation
Training duration: 3000 steps across 30 epochs
Final validation loss: 0.2158
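
The learning-rate settings listed above can be illustrated with a small sketch. It assumes the common shape for a "linear scheduler with warmup": a linear ramp from 0 to the peak rate over the warmup steps, followed by a linear decay to 0 at the final step. The function name linear_lr and this exact shape are assumptions for illustration; the model card only states the peak rate, the scheduler type, and the step counts.

```python
def linear_lr(step: int,
              base_lr: float = 1e-5,
              warmup_steps: int = 500,
              total_steps: int = 3000) -> float:
    """Learning rate at a given step (hypothetical helper, not from the model card).

    Assumed shape: linear warmup from 0 to base_lr, then linear decay to 0.
    """
    if step < warmup_steps:
        # Warmup phase: ramp up proportionally to the step count.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: shrink proportionally to the number of steps remaining.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for s in (0, 250, 500, 1750, 3000):
        print(s, linear_lr(s))
```

Under this assumed shape the rate peaks at step 500 and is back to half the peak at step 1750. As a rough consistency check, 3000 steps across 30 epochs is 100 steps per epoch, which at a batch size of 16 would correspond to about 1600 training examples per epoch, provided there was no gradient accumulation and training ran on a single device (neither is stated in the source).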

Core Capabilities

Specialized in Russian speech recognition
Optimized for dysarthric speech patterns
Achieves 9.1% Word Error Rate (WER)
Lightweight Whisper-tiny base (about 39M parameters), suited to low-latency transcription
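
For readers unfamiliar with the WER figure quoted above, the following is a minimal sketch of how Word Error Rate is usually computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. The function name word_error_rate and the example sentences are illustrative assumptions. The model card does not say which tool or text normalization produced its score, so this sketch is not expected to reproduce the reported number.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One of four reference words is missing from the hypothesis.
    print(word_error_rate("я хочу пить воду", "я хочу пить"))  # 0.25
```

A WER of 9.1% therefore means roughly nine word-level errors per hundred reference words on the data used for evaluation.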

Frequently Asked Questions

Q: What makes this model unique?

This model is fine-tuned specifically for Russian dysarthric speech, which makes it useful for applications that serve speakers with motor speech disorders. The reported WER of about 9.1% suggests it performs well in this specialized domain, at least on the data it was evaluated on.

Q: What are the recommended use cases?

The model is ideal for applications requiring Russian speech recognition for individuals with dysarthria, such as assistive technologies, medical applications, and accessibility tools. It's particularly suitable for scenarios where accurate transcription of impaired speech is crucial.
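
A minimal usage sketch for such applications is shown below, using the Hugging Face transformers automatic-speech-recognition pipeline. The repository id is an assumption pieced together from the author and model name in the table above, and the helper names build_transcriber and transcribe are hypothetical. Check the actual model page before relying on either.

```python
# Assumed repository id (author + model name); not confirmed by the source.
MODEL_ID = "qymyz/whisper-tiny-russian-dysarthria"


def build_transcriber(model_id: str = MODEL_ID):
    """Create an ASR pipeline. Requires the `transformers` package."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_id)


def transcribe(audio_path: str, transcriber=None) -> str:
    """Transcribe one audio file to Russian text (hypothetical helper)."""
    if transcriber is None:
        transcriber = build_transcriber()
    # Ask Whisper to transcribe in Russian rather than auto-detect or translate.
    result = transcriber(
        audio_path,
        generate_kwargs={"language": "russian", "task": "transcribe"},
    )
    return result["text"].strip()


if __name__ == "__main__":
    # "sample.wav" is a placeholder path for a local recording.
    print(transcribe("sample.wav"))
```

Building the pipeline once and passing it to transcribe avoids reloading the model for every file, which matters in assistive tools that process many short utterances.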