whisper-omg-2

Maintained By
nurzhanit

Property          Value
Base Model        Whisper Enhanced ML
Training Dataset  Common Voice 11.0
Best WER          0.0266
Framework         PyTorch 2.5.0

What is whisper-omg-2?

Whisper-OMG-2 is a fine-tuned version of the Whisper Enhanced ML model, specifically optimized for speech recognition tasks. Developed by nurzhanit, this model demonstrates exceptional performance with a Word Error Rate (WER) of just 0.0266 on the evaluation set.

Implementation Details

The model was trained using the Adam optimizer with carefully tuned hyperparameters (β1=0.9, β2=0.999, ε=1e-08). The training process involved a linear learning rate scheduler with 50 warmup steps and continued for 500 training steps. The learning rate was set to 1e-05, with batch sizes of 16 for training and 8 for evaluation.
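The reported schedule (lr=1e-05, 50 warmup steps, 500 total steps) can be sketched as a plain function. This is a minimal illustration of how a linear warmup-then-decay schedule typically behaves (mirroring the common Transformers linear scheduler), not the model's actual training script:

```python
# Sketch of a linear LR schedule with warmup, using the hyperparameters
# reported above (lr=1e-05, 50 warmup steps, 500 training steps).
def linear_lr(step, base_lr=1e-5, warmup_steps=50, total_steps=500):
    """Learning rate at a given optimizer step."""
    if step < warmup_steps:
        # linear warmup from 0 up to base_lr
        return base_lr * step / warmup_steps
    # linear decay from base_lr down to 0 at total_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

print(linear_lr(25))   # mid-warmup
print(linear_lr(50))   # peak learning rate
print(linear_lr(500))  # end of training
```

With only 500 steps, the schedule spends 10% of training warming up and decays to zero at the final step.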

  • Achieved final validation loss of 0.0002
  • Implemented with Transformers 4.40.0
  • Uses PyTorch 2.5.0+cu124 backend

Core Capabilities

  • Superior speech recognition accuracy (0.0266 WER)
  • Efficient training convergence within 500 steps
  • Stable validation metrics in later training steps
  • Optimized for Common Voice dataset applications
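For context on the headline 0.0266 figure, WER is the word-level edit distance (substitutions + insertions + deletions) divided by the number of reference words. A minimal reference implementation, for illustration only (evaluation libraries such as `jiwer` are normally used instead):

```python
# Word error rate via word-level Levenshtein edit distance.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,       # deletion
                           dp[i][j - 1] + 1,       # insertion
                           dp[i - 1][j - 1] + cost)  # substitution / match
    return dp[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat on the mat", "the cat sat on a mat"))  # 1 error in 6 words
```

A WER of 0.0266 therefore corresponds to roughly one word error per 38 reference words.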

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its exceptional WER of 0.0266, achieved through careful optimization and fine-tuning on the Common Voice 11.0 dataset. The training progression shows remarkable stability after step 300.

Q: What are the recommended use cases?

This model is particularly suited for speech recognition tasks, especially on audio similar to the Common Voice dataset. It targets production environments running PyTorch 2.5.0.
