whisper-medium-portuguese

Maintained By
pierreguillou

Whisper Medium Portuguese

Property          Value
License           Apache 2.0
Training Dataset  Common Voice 11.0
Best WER Score    6.59%
Training Steps    6000

What is whisper-medium-portuguese?

Whisper Medium Portuguese is a fine-tuned version of OpenAI's Whisper Medium model, specifically optimized for Portuguese speech recognition. This model achieves state-of-the-art performance with a Word Error Rate (WER) of 6.59%, surpassing both the original Whisper Medium (8.1% WER) and even Whisper Large (7.1% WER) on Portuguese transcription tasks.

Implementation Details

The model was trained using a carefully tuned configuration with Adam optimizer, linear learning rate scheduling, and mixed precision training. Key training parameters include a learning rate of 9e-06, batch size of 32, and 6000 training steps with 500 warmup steps.

  • Native AMP (Automatic Mixed Precision) training
  • Trained on the Mozilla Common Voice 11.0 dataset
  • Best checkpoint reached at epoch 5.05, with a validation loss of 0.2628
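The hyperparameters above can be collected into a single configuration. This is a minimal sketch: the values come from the model card, while the key names follow the Hugging Face `Seq2SeqTrainingArguments` naming convention and are an assumption, not taken from the original training script.

```python
# Fine-tuning hyperparameters reported on the model card.
# Key names mirror Hugging Face Seq2SeqTrainingArguments (assumed, for illustration).
training_config = {
    "learning_rate": 9e-6,              # Adam optimizer
    "per_device_train_batch_size": 32,  # batch size of 32
    "max_steps": 6000,                  # total training steps
    "warmup_steps": 500,                # linear warmup
    "lr_scheduler_type": "linear",      # linear learning rate schedule
    "fp16": True,                       # native AMP mixed precision
}
```

Passing a dict like this to a trainer makes the run reproducible and easy to diff against other fine-tuning experiments.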

Core Capabilities

  • State-of-the-art Portuguese speech recognition
  • Robust performance on varied Portuguese audio inputs
  • Significantly improved accuracy compared to base Whisper models
  • Optimized for production deployment

Frequently Asked Questions

Q: What makes this model unique?

This model achieves better Portuguese transcription accuracy than both Whisper Medium and Large models, with a WER of 6.59% compared to the original 8.1%, making it the current SOTA for Portuguese ASR.
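For context on how the 6.59% figure is computed: WER is the word-level edit distance (substitutions + deletions + insertions) between reference and hypothesis, divided by the number of reference words. A minimal stdlib sketch, for illustration only (not the evaluation script used for this model):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Levenshtein distance over words, computed with a single rolling DP row.
    d = list(range(len(hyp) + 1))
    for i in range(1, len(ref) + 1):
        prev, d[0] = d[0], i
        for j in range(1, len(hyp) + 1):
            cur = d[j]
            if ref[i - 1] == hyp[j - 1]:
                d[j] = prev                          # match: no extra cost
            else:
                d[j] = 1 + min(prev, d[j], d[j - 1])  # substitution / deletion / insertion
            prev = cur
    return d[len(hyp)] / len(ref)
```

For example, one substituted word in a three-word reference yields a WER of 1/3.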

Q: What are the recommended use cases?

The model is ideal for Portuguese speech transcription tasks, including subtitle generation, audio content indexing, and voice command systems requiring high accuracy in Portuguese language processing.
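A transcription call for these use cases might look like the following sketch, using the Hugging Face `pipeline` API; the Hub id `pierreguillou/whisper-medium-portuguese` is assumed from the model card, and `transformers` must be installed separately.

```python
MODEL_ID = "pierreguillou/whisper-medium-portuguese"  # assumed Hub id of this model

def transcribe(audio_path: str) -> str:
    """Transcribe a Portuguese audio file; requires `pip install transformers`."""
    # Imported lazily so the module loads even without transformers installed.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=MODEL_ID,
        chunk_length_s=30,  # chunk long audio into 30 s windows for Whisper
    )
    return asr(audio_path)["text"]
```

The same `transcribe` helper covers subtitle generation and content indexing; for voice commands, shorter inputs can be passed directly without chunking.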
