# Whisper-OMG-2
| Property | Value |
|---|---|
| Base Model | Whisper Enhanced ML |
| Training Dataset | Common Voice 11.0 |
| Best WER | 0.0266 |
| Framework | PyTorch 2.5.0 |
## What is Whisper-OMG-2?
Whisper-OMG-2 is a fine-tuned version of the Whisper Enhanced ML model, optimized for speech recognition. Developed by nurzhanit, it achieves a Word Error Rate (WER) of 0.0266 on the evaluation set.
## Implementation Details
The model was trained using the Adam optimizer with carefully tuned hyperparameters (β1=0.9, β2=0.999, ε=1e-08). The training process involved a linear learning rate scheduler with 50 warmup steps and continued for 500 training steps. The learning rate was set to 1e-05, with batch sizes of 16 for training and 8 for evaluation.
- Achieved final validation loss of 0.0002
- Implemented with Transformers 4.40.0
- Uses PyTorch 2.5.0+cu124 backend
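The hyperparameters above map onto `Seq2SeqTrainingArguments` in the Transformers trainer API. A sketch of the equivalent configuration follows; only the listed hyperparameters come from this card, and the output path is illustrative:

```python
from transformers import Seq2SeqTrainingArguments

# Illustrative reconstruction of the training setup described above.
# Any argument not mentioned in the card (e.g. output_dir) is a placeholder.
training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-omg-2",      # hypothetical path
    learning_rate=1e-5,                # fixed base LR
    lr_scheduler_type="linear",        # linear decay after warmup
    warmup_steps=50,
    max_steps=500,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    adam_beta1=0.9,                    # Adam defaults, as stated above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```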
## Core Capabilities
- Superior speech recognition accuracy (0.0266 WER)
- Efficient training convergence within 500 steps
- Robust performance stability in later training epochs
- Optimized for Common Voice dataset applications
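For context, a WER of 0.0266 means roughly 2.7 word-level errors (substitutions, insertions, or deletions) per 100 reference words. A minimal standalone sketch of the metric is below; real evaluations typically use a library such as `jiwer` or `evaluate` rather than this hypothetical implementation:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        return 0.0 if not hyp else 1.0
    # Dynamic programming over words (Levenshtein distance).
    prev_row = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur_row = [i]
        for j, h in enumerate(hyp, 1):
            substitution = prev_row[j - 1] + (r != h)
            insertion = cur_row[j - 1] + 1
            deletion = prev_row[j] + 1
            cur_row.append(min(substitution, insertion, deletion))
        prev_row = cur_row
    return prev_row[-1] / len(ref)

print(wer("the quick brown fox", "the quick brown box"))  # → 0.25
```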
## Frequently Asked Questions
**Q: What makes this model unique?**
The model stands out for its exceptional WER of 0.0266, achieved through careful optimization and fine-tuning on the Common Voice 11.0 dataset. The training progression shows remarkable stability after step 300.
**Q: What are the recommended use cases?**
This model is particularly suited to speech recognition tasks, especially on audio similar to the Common Voice corpus. It targets production environments running PyTorch 2.5.0.
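For inference, the model can be loaded through the standard Transformers ASR pipeline. The repo id below is an assumption derived from the author and model name; verify the actual id on the Hugging Face Hub before use:

```python
from transformers import pipeline

# "nurzhanit/whisper-omg-2" is an assumed Hub repo id, not confirmed by the card.
asr = pipeline(
    "automatic-speech-recognition",
    model="nurzhanit/whisper-omg-2",
)

# Transcribe any local audio file; the pipeline handles resampling internally.
result = asr("sample.wav")
print(result["text"])
```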