whisper-large-v2-bg

Maintained By
anuragshas

Whisper Large-v2 Bulgarian

  • License: Apache 2.0
  • Training Dataset: Mozilla Common Voice 11.0 (Bulgarian)
  • WER Score: 13.404%
  • Framework: PyTorch 1.13.0

What is whisper-large-v2-bg?

This is a specialized Bulgarian speech recognition model based on OpenAI's Whisper Large-v2 architecture. It has been fine-tuned on the Mozilla Common Voice 11.0 Bulgarian dataset to provide accurate automatic speech recognition (ASR) capabilities specifically for the Bulgarian language.
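The model can be used directly through the Transformers pipeline API. Below is a minimal sketch, assuming the Hub id anuragshas/whisper-large-v2-bg (inferred from the maintainer and model name above) and an illustrative local audio file:

```python
# Minimal transcription sketch. The Hub id and audio filename are
# assumptions for illustration, not confirmed by the model card.
from transformers import pipeline

asr = pipeline(
    task="automatic-speech-recognition",
    model="anuragshas/whisper-large-v2-bg",  # assumed Hub id
)

# The pipeline accepts a path to an audio file (or a raw waveform array).
result = asr("sample_bg.wav")  # illustrative Bulgarian audio clip
print(result["text"])
```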

Implementation Details

The model was trained using a distributed multi-GPU setup with carefully tuned hyperparameters. Training ran for 1000 steps with a linear learning rate scheduler and a 100-step warm-up period, using the Adam optimizer with a learning rate of 1e-05 and batch sizes of 32 for training and 16 for evaluation. A configuration sketch follows the results below.

  • Training Loss: 0.0023
  • Validation Loss: 0.3208
  • Word Error Rate: 13.404%
  • Epochs completed: 7.04
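For reference, these hyperparameters map onto the Transformers Seq2SeqTrainingArguments API roughly as follows. This is a hedged reconstruction, not the author's actual training script; the output directory is illustrative, and whether the batch sizes are per device or global in the multi-GPU setup is an assumption:

```python
# Hedged reconstruction of the reported training configuration.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-large-v2-bg",  # illustrative path
    per_device_train_batch_size=32,      # training batch size from the card
    per_device_eval_batch_size=16,       # evaluation batch size from the card
    learning_rate=1e-5,                  # Adam learning rate
    lr_scheduler_type="linear",          # linear scheduler
    warmup_steps=100,                    # warm-up period
    max_steps=1000,                      # total training steps
    report_to="tensorboard",             # TensorBoard logging, per the card
)
```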

Core Capabilities

  • Bulgarian speech-to-text transcription
  • Compatible with the Transformers library
  • Optimized for production deployment
  • Supports TensorBoard integration

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its specialized optimization for Bulgarian language processing, achieving a competitive WER of 13.404% on the Common Voice test set. It builds upon the robust Whisper Large-v2 architecture while being specifically adapted for Bulgarian ASR tasks.
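For context, WER figures like this one are commonly computed with the Hugging Face evaluate library. The sketch below is illustrative only; the reference and prediction strings are made up, not actual Common Voice test data:

```python
# Illustrative WER computation; strings are hypothetical examples.
import evaluate

wer_metric = evaluate.load("wer")

references = ["това е тест"]   # ground-truth transcripts
predictions = ["това е тест"]  # model outputs

wer = wer_metric.compute(references=references, predictions=predictions)
print(f"WER: {wer:.3%}")
```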

Q: What are the recommended use cases?

The model is ideal for Bulgarian speech recognition applications including transcription services, voice assistants, and content accessibility tools. It's particularly suited for production environments requiring reliable Bulgarian speech-to-text conversion.
