whisper-large-v2-bg

Maintained By
anuragshas

Whisper Large-v2 Bulgarian

  • License: Apache 2.0
  • Training Dataset: Mozilla Common Voice 11.0 (Bulgarian)
  • WER Score: 13.404%
  • Framework: PyTorch 1.13.0

What is whisper-large-v2-bg?

This is a specialized Bulgarian speech recognition model based on OpenAI's Whisper Large-v2 architecture. It has been fine-tuned on the Mozilla Common Voice 11.0 Bulgarian dataset to provide accurate automatic speech recognition (ASR) capabilities specifically for the Bulgarian language.
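The model can be used directly through the Transformers pipeline API. Below is a minimal sketch, assuming the Hub id anuragshas/whisper-large-v2-bg (inferred from the maintainer and model name above) and an illustrative local audio file:

```python
# Minimal transcription sketch. The Hub id and audio filename are
# assumptions for illustration, not confirmed by the model card.
from transformers import pipeline

asr = pipeline(
    task="automatic-speech-recognition",
    model="anuragshas/whisper-large-v2-bg",  # assumed Hub id
)

# The pipeline accepts a path to an audio file (or a raw waveform array).
result = asr("sample_bg.wav")  # illustrative Bulgarian audio clip
print(result["text"])
```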

Implementation Details

The model was trained using a distributed multi-GPU setup with carefully tuned hyperparameters. Training ran for 1000 steps with a linear learning rate scheduler and a 100-step warm-up period, using the Adam optimizer with a learning rate of 1e-05 and batch sizes of 32 for training and 16 for evaluation. A configuration sketch follows the results below.

  • Training Loss: 0.0023
  • Validation Loss: 0.3208
  • Word Error Rate: 13.404%
  • Epochs completed: 7.04
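For reference, these hyperparameters map onto the Transformers Seq2SeqTrainingArguments API roughly as follows. This is a hedged reconstruction, not the author's actual training script; the output directory is illustrative, and whether the batch sizes are per device or global in the multi-GPU setup is an assumption:

```python
# Hedged reconstruction of the reported training configuration.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-large-v2-bg",  # illustrative path
    per_device_train_batch_size=32,      # training batch size from the card
    per_device_eval_batch_size=16,       # evaluation batch size from the card
    learning_rate=1e-5,                  # Adam learning rate
    lr_scheduler_type="linear",          # linear scheduler
    warmup_steps=100,                    # warm-up period
    max_steps=1000,                      # total training steps
    report_to="tensorboard",             # TensorBoard logging, per the card
)
```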

Core Capabilities

  • Bulgarian speech-to-text transcription
  • Compatible with the Transformers library
  • Optimized for production deployment
  • Supports TensorBoard integration

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its specialized optimization for Bulgarian language processing, achieving a competitive WER of 13.404% on the Common Voice test set. It builds upon the robust Whisper Large-v2 architecture while being specifically adapted for Bulgarian ASR tasks.
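For context, WER figures like this one are commonly computed with the Hugging Face evaluate library. The sketch below is illustrative only; the reference and prediction strings are made up, not actual Common Voice test data:

```python
# Illustrative WER computation; strings are hypothetical examples.
import evaluate

wer_metric = evaluate.load("wer")

references = ["това е тест"]   # ground-truth transcripts
predictions = ["това е тест"]  # model outputs

wer = wer_metric.compute(references=references, predictions=predictions)
print(f"WER: {wer:.3%}")
```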

Q: What are the recommended use cases?

The model is ideal for Bulgarian speech recognition applications including transcription services, voice assistants, and content accessibility tools. It's particularly suited for production environments requiring reliable Bulgarian speech-to-text conversion.
