Whisper Medium French

Property	Value
License	Apache 2.0
Training Dataset	Common Voice 11.0
Best WER Score	11.14% (normalized)
Framework	PyTorch

What is whisper-medium-french?

Whisper-medium-french is a speech recognition model fine-tuned from OpenAI's Whisper medium model for French transcription. It reports a normalized Word Error Rate (WER) of 11.14%, compared with the 16.0% reported for the original Whisper medium model on Common Voice.

Implementation Details

The model was trained with the Adam optimizer, a linear learning rate schedule, and mixed precision. Key hyperparameters are a learning rate of 1e-05, a batch size of 32, and 5000 training steps with 500 warmup steps.
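To make the schedule concrete, here is a minimal sketch of a linear learning rate schedule using the hyperparameters listed above. It assumes the rate ramps linearly from 0 to the peak over the warmup steps and then decays linearly to 0 by the final step. The source only says "linear", so the exact shape is an assumption, and `linear_lr` is a hypothetical helper rather than part of any training script.

```python
def linear_lr(step, base_lr=1e-5, warmup_steps=500, total_steps=5000):
    """Learning rate at a given optimizer step.

    Assumed shape: linear warmup from 0 to base_lr, then linear decay to 0.
    """
    if step < warmup_steps:
        # Warmup phase: scale up proportionally to the step count.
        return base_lr * step / warmup_steps
    # Decay phase: scale down proportionally to the steps remaining.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 250, 500, 2750, 5000):
        print(step, linear_lr(step))
```

Under these assumptions the rate peaks at 1e-05 at step 500 and is back to half the peak at step 2750, midway through the decay phase.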

Native AMP mixed precision training
Trained on Common Voice 11.0 dataset
Achieves 15.89% raw WER and 11.14% normalized WER
Transformer architecture, implemented in PyTorch
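The gap between raw and normalized WER comes from how text is prepared before scoring. The sketch below computes WER as word-level edit distance divided by the number of reference words, with and without a normalizer. The normalizer here is a deliberately simplified assumption (lowercasing and replacing punctuation with spaces); the one used to produce the reported scores is not specified in this card and is likely more thorough.

```python
import re


def normalize(text):
    """Simplified normalizer (an assumption): lowercase, punctuation to spaces."""
    text = text.lower()
    text = re.sub(r"[^\w\s']", " ", text)  # keep letters, digits, apostrophes
    return text.split()


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis, normalized=False):
    """WER = word edits / number of reference words."""
    ref = normalize(reference) if normalized else reference.split()
    hyp = normalize(hypothesis) if normalized else hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


reference = "Bonjour, comment allez-vous ?"
hypothesis = "bonjour comment allez-vous"

if __name__ == "__main__":
    print(wer(reference, hypothesis))                   # raw scoring
    print(wer(reference, hypothesis, normalized=True))  # normalized scoring
```

In this example the raw score penalizes casing and punctuation differences, while the normalized score counts only differences in the words themselves.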

Core Capabilities

High-accuracy French speech transcription
Improved performance over original Whisper medium model
Robust handling of various French accents and speaking styles
Optimized for production deployment

Frequently Asked Questions

Q: What makes this model unique?

This model outperforms the original Whisper medium model on French transcription: its normalized WER is 11.14%, compared with 16.0% for the original model on the Common Voice dataset, a relative reduction of about 30%.

Q: What are the recommended use cases?

The model is ideal for French speech transcription tasks, particularly in applications requiring high accuracy such as subtitle generation, meeting transcription, and voice command systems.
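For the subtitle use case, timestamped transcription output can be converted to SRT with a few lines of code. The sketch below assumes the transcription step has already produced segments as `(start_seconds, end_seconds, text)` tuples. That input shape and the helper names `srt_timestamp` and `to_srt` are hypothetical, and how timestamps are obtained depends on the inference tooling, which this card does not describe.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS,mmm (the SRT timestamp layout)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments):
    """Render (start, end, text) segments as numbered SRT cues."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text.strip()}"
        )
    # Cues are separated by a blank line.
    return "\n\n".join(blocks) + "\n"


# Hypothetical segments, as a transcription step might produce them.
segments = [
    (0.0, 2.5, " Bonjour à tous."),
    (2.5, 6.04, " Bienvenue à la réunion."),
]

if __name__ == "__main__":
    print(to_srt(segments))
```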