whisper-telugu-large-v2

Maintained By
vasista22

Whisper Telugu Large-v2

Property      Value
License       Apache 2.0
Language      Telugu
Base Model    Whisper Large-v2
WER Score     9.65

What is whisper-telugu-large-v2?

Whisper-telugu-large-v2 is an automatic speech recognition (ASR) model for the Telugu language, fine-tuned from OpenAI's Whisper Large-v2. It was developed at Speech Lab, IIT Madras, and trained on several Telugu speech corpora, including CSTD IIIT-H, ULCA, Shrutilipi, and Microsoft Speech Corpus.

Implementation Details

The model was fine-tuned with a learning rate of 0.75e-05 (that is, 7.5e-06), a batch size of 8, and 75,000 training steps. Training used mixed precision with the AdamW optimizer and a linear learning rate scheduler with 22,000 warmup steps.
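To make the schedule concrete, here is a minimal sketch of a linear learning rate schedule with warmup, using the three numbers quoted above. The function name linear_lr is hypothetical, and the linear decay to zero after warmup is an assumption that mirrors the common "linear with warmup" scheduler; the source does not state the exact decay shape.

```python
# Values quoted in the text above.
PEAK_LR = 0.75e-05      # peak learning rate (7.5e-06)
WARMUP_STEPS = 22_000   # steps of linear warmup
TOTAL_STEPS = 75_000    # total training steps


def linear_lr(step: int) -> float:
    """Learning rate at a given step (hypothetical helper).

    Assumption: the rate rises linearly from 0 to PEAK_LR during warmup,
    then falls linearly to 0 at TOTAL_STEPS.
    """
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for s in (0, 11_000, 22_000, 48_500, 75_000):
        print(f"step {s:>6}: lr = {linear_lr(s):.3e}")
```

Under these assumptions, warmup takes up roughly the first 29% of training (22,000 of 75,000 steps), and the rate is at half its peak both midway through warmup and midway through the decay.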

  • Supports both PyTorch and JAX-based inference
  • Optimized for 30-second audio chunks
  • Includes specialized decoder prompts for Telugu language
  • Implements 8-bit optimization for improved efficiency
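The points above on PyTorch inference, 30-second chunks, and Telugu decoder prompts might be combined as in the sketch below, which uses the Hugging Face transformers pipeline. Several things here are assumptions rather than facts from this page: the Hub identifier in MODEL_ID is inferred from the maintainer and model names, the helper names build_transcriber and transcribe are hypothetical, and the way the decoder prompt is set can differ between transformers versions.

```python
# Assumed Hub id, built from the maintainer and model names on this page.
MODEL_ID = "vasista22/whisper-telugu-large-v2"


def build_transcriber(device: str = "cpu"):
    """Create an ASR pipeline for Telugu (hypothetical helper).

    Pass device="cuda:0" to run on a GPU if one is available.
    """
    # Imported lazily so the module loads without the heavy dependency.
    from transformers import pipeline

    asr = pipeline(
        task="automatic-speech-recognition",
        model=MODEL_ID,
        chunk_length_s=30,  # split long audio into 30-second chunks
        device=device,
    )
    # Force the decoder prompt to "Telugu, transcribe".
    # Assumption: newer transformers releases may prefer passing
    # language/task through generate_kwargs instead.
    asr.model.config.forced_decoder_ids = asr.tokenizer.get_decoder_prompt_ids(
        language="te", task="transcribe"
    )
    return asr


def transcribe(asr, audio_path: str) -> str:
    """Return the transcript for one audio file (hypothetical helper)."""
    return asr(audio_path)["text"]
```

A typical call would be transcribe(build_transcriber(), "sample.wav"), where sample.wav is a placeholder path. The first call downloads the model weights, which are large for a Large-v2 checkpoint.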

Core Capabilities

  • Achieves 9.65 WER on Google FLEURS test set
  • Handles diverse Telugu speech patterns and accents
  • Supports batch processing for faster inference
  • Compatible with both CPU and GPU environments
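For readers unfamiliar with the metric, word error rate (WER) is the word-level edit distance between the reference and the model output, divided by the number of reference words. The sketch below is a minimal illustration of that definition, not the evaluation script behind the 9.65 figure; the function name wer is hypothetical, and real evaluations usually normalize text before scoring, which is omitted here.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate as a fraction (hypothetical helper).

    Counts substitutions, deletions, and insertions at the word level
    and divides by the number of words in the reference.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(wer("a b c d", "a x c"))
```

Multiplying the result by 100 gives the percentage form in which WER scores are usually reported.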

Frequently Asked Questions

Q: What makes this model unique?

The model is trained on multiple Telugu speech corpora and supports both PyTorch and JAX-based inference pipelines, which makes it practical for production use. It reaches a WER of 9.65 on the Google FLEURS test set.

Q: What are the recommended use cases?

The model is suited to Telugu speech transcription, particularly where high accuracy is needed or where longer recordings must be processed in 30-second chunks. It can be used in both research and production settings, with deployment through either PyTorch or JAX.
