Whisper Medium French
| Property | Value |
|---|---|
| License | Apache 2.0 |
| Training Dataset | Common Voice 11.0 |
| Best WER Score | 11.14% (normalized) |
| Framework | PyTorch |
What is whisper-medium-french?
Whisper-medium-french is a speech recognition model fine-tuned from OpenAI's Whisper medium model for French transcription. It reaches a normalized Word Error Rate (WER) of 11.14% on Common Voice 11.0, compared with 16.0% for the original Whisper medium model on the Common Voice dataset.
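To make the raw versus normalized WER distinction concrete, here is a minimal sketch of how a word error rate can be computed. The `normalize` function is a simplified stand-in (lowercasing and punctuation removal) and is an assumption for illustration; it is not the normalizer used to produce the scores reported for this model.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Simplified normalizer: lowercase, drop punctuation, collapse whitespace.

    Illustrative assumption only, not the normalizer behind the reported scores.
    """
    text = unicodedata.normalize("NFKC", text).lower()
    # Replace anything that is not a word character, whitespace, or apostrophe.
    text = re.sub(r"[^\w\s']", " ", text)
    return " ".join(text.split())


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix processed
    # so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


reference = "Bonjour, comment allez-vous ?"
hypothesis = "bonjour comment allez vous"

raw_wer = wer(reference, hypothesis)
normalized_wer = wer(normalize(reference), normalize(hypothesis))
```

In this toy pair the hypothesis differs from the reference only in casing and punctuation, so the raw score penalizes it while the normalized score does not. The same effect, on a smaller scale, is why a model's normalized WER is lower than its raw WER.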
Implementation Details
The model was trained with the Adam optimizer, a linear learning rate schedule, and mixed precision training. Key training parameters are a learning rate of 1e-05, a batch size of 32, and 5000 training steps with 500 warmup steps.
- Native AMP mixed precision training
- Trained on Common Voice 11.0 dataset
- Achieves 15.89% raw WER and 11.14% normalized WER
- Transformer architecture implemented in PyTorch
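The learning rate schedule described above can be sketched as a small function. This assumes the common convention of a linear warmup from zero to the peak rate, followed by a linear decay to zero, with the 500 warmup steps counted inside the 5000 total steps. The source states only "linear" scheduling, so the exact shape is an assumption, and `linear_schedule_lr` is a hypothetical helper name.

```python
def linear_schedule_lr(step: int,
                       base_lr: float = 1e-5,
                       warmup_steps: int = 500,
                       total_steps: int = 5000) -> float:
    """Learning rate at a given step under linear warmup and linear decay.

    Assumption: warmup rises from 0 to base_lr, then the rate decays
    linearly to 0 at total_steps.
    """
    if step < warmup_steps:
        # Warmup phase: scale up proportionally to the step count.
        return base_lr * step / warmup_steps
    # Decay phase: scale down proportionally to the steps remaining.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


# Sample the schedule at a few points across training.
schedule = {step: linear_schedule_lr(step) for step in (0, 250, 500, 2750, 5000)}
```

Under these assumptions the rate peaks at 1e-05 at step 500 and falls to half that value at step 2750, the midpoint of the decay phase.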
Core Capabilities
- High-accuracy French speech transcription
- Improved performance over the original Whisper medium model
- Robust handling of various French accents and speaking styles
- Optimized for production deployment
Frequently Asked Questions
Q: What makes this model unique?
This model outperforms the original Whisper medium model on French transcription, with a normalized WER of 11.14% compared to the original model's 16.0% on the Common Voice dataset.
Q: What are the recommended use cases?
The model is ideal for French speech transcription tasks, particularly in applications requiring high accuracy such as subtitle generation, meeting transcription, and voice command systems.
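As an illustration of the subtitle generation use case, the sketch below converts timestamped transcription chunks into SRT text. It assumes the chunk format returned by the Hugging Face `transformers` speech recognition pipeline when called with `return_timestamps=True`. The helper names (`format_timestamp`, `chunks_to_srt`, `transcribe`) are hypothetical, and the model's hub identifier is not given in this document, so it is left as a parameter.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def chunks_to_srt(chunks) -> str:
    """Build SRT text from chunks shaped like {"timestamp": (start, end), "text": str}."""
    entries = []
    for index, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]
        if end is None:
            # The final chunk may lack an end time; fall back to its start.
            end = start
        entries.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{chunk['text'].strip()}\n"
        )
    return "\n".join(entries)


def transcribe(audio_path: str, model_id: str):
    """Transcribe a file with the transformers ASR pipeline (not run in this example).

    model_id is supplied by the caller; this document does not state the
    model's hub identifier.
    """
    from transformers import pipeline  # lazy import; requires `transformers`

    asr = pipeline("automatic-speech-recognition", model=model_id, chunk_length_s=30)
    return asr(audio_path, return_timestamps=True)["chunks"]


# Hand-written sample chunks standing in for real model output.
sample_chunks = [
    {"timestamp": (0.0, 2.5), "text": " Bonjour à tous."},
    {"timestamp": (2.5, 5.0), "text": " Commençons la réunion."},
]
srt_text = chunks_to_srt(sample_chunks)
```

With a real audio file, `chunks_to_srt(transcribe(path, model_id))` would produce the subtitle text, provided the pipeline returns chunks in the assumed format.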