whisper-tamil-medium

vasista22

A Whisper ASR model fine-tuned for Tamil on multiple Tamil ASR corpora, achieving 6.5% WER on the Common Voice test set.

License: Apache 2.0
Base Model: OpenAI Whisper Medium
Training Framework: PyTorch
Primary Task: Automatic Speech Recognition

What is whisper-tamil-medium?

Whisper-tamil-medium is an automatic speech recognition (ASR) model fine-tuned from OpenAI's Whisper-medium for Tamil. Developed at Speech Lab, IIT Madras, it reports a Word Error Rate (WER) of 6.5% on the Common Voice test set and 6.97% on the Google FLEURS test set.
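For readers unfamiliar with the metric, the sketch below shows one common way to compute WER: the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The function name `word_error_rate` is hypothetical, and the source does not say which text normalization or scoring tool produced the reported figures, so this illustrates the definition rather than the evaluation actually used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

A WER of 6.5% corresponds to roughly 6 to 7 word-level errors per 100 reference words under this definition.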

Implementation Details

The model was trained on a combination of Tamil ASR corpora: IISc-MILE, ULCA, Shrutilipi, Microsoft Speech Corpus, Google/Fleurs, and Babel ASR Corpus. Training used an 8-bit AdamW optimizer with a linear learning rate scheduler and mixed precision, with the following hyperparameters:

  • Learning rate: 1e-05 with 17,500 warmup steps
  • Batch size: 24 (training) / 48 (evaluation)
  • Total training steps: 33,892
  • Mixed precision training enabled
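To make the warmup figures concrete, here is a minimal sketch of a linear schedule with warmup, using the learning rate and step counts listed above. It assumes the usual shape (linear ramp from zero to the peak, then linear decay to zero) and assumes the decay ends at the 33,892-step total; the source does not state the decay endpoint, and `linear_schedule_lr` is a hypothetical helper, not the training code.

```python
BASE_LR = 1e-5         # peak learning rate from the list above
WARMUP_STEPS = 17_500  # warmup steps from the list above
TOTAL_STEPS = 33_892   # total training steps (assumed to be the decay endpoint)


def linear_schedule_lr(step: int,
                       base_lr: float = BASE_LR,
                       warmup_steps: int = WARMUP_STEPS,
                       total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr over the warmup phase.
        return base_lr * step / warmup_steps
    # Decay linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 8_750, 17_500, 25_000, 33_892):
        print(step, linear_schedule_lr(step))
```

Under these assumptions the warmup phase covers a little over half of the run, so the model spends most of training below the peak rate.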

Core Capabilities

  • High-accuracy Tamil speech recognition
  • Supports both CPU and GPU inference
  • Compatible with whisper-jax for faster inference
  • Optimized for production deployment
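A minimal inference sketch using the Hugging Face `transformers` ASR pipeline follows. It assumes the model is published on the Hub as `vasista22/whisper-tamil-medium` (inferred from the author and model name above), that `torch` and `transformers` are installed, and that the installed `transformers` version accepts `language` and `task` through `generate_kwargs`; older versions may need the decoder prompt set differently. The helpers `build_pipeline_kwargs` and `transcribe` are hypothetical names.

```python
MODEL_ID = "vasista22/whisper-tamil-medium"  # assumed Hub id, not confirmed by the source


def build_pipeline_kwargs(use_gpu: bool) -> dict:
    """Arguments for transformers.pipeline, selecting CPU or the first GPU."""
    return {
        "task": "automatic-speech-recognition",
        "model": MODEL_ID,
        "chunk_length_s": 30,  # split long audio into 30-second windows
        "device": "cuda:0" if use_gpu else "cpu",
    }


def transcribe(audio_path: str) -> str:
    """Transcribe a Tamil audio file and return the recognized text."""
    # Heavy dependencies are imported lazily so the helper above stays usable
    # without them.
    import torch
    from transformers import pipeline

    asr = pipeline(**build_pipeline_kwargs(torch.cuda.is_available()))
    result = asr(
        audio_path,
        generate_kwargs={"language": "tamil", "task": "transcribe"},
    )
    return result["text"]


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        # Usage: python transcribe.py /path/to/audio.wav
        print(transcribe(sys.argv[1]))
```

The same code path serves both CPU and GPU; only the `device` argument changes.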

Frequently Asked Questions

Q: What makes this model unique?

The model is fine-tuned specifically for Tamil on a diverse set of corpora and reports low WER on standard benchmarks (6.5% on Common Voice, 6.97% on Google FLEURS). It is also compatible with whisper-jax for faster inference.

Q: What are the recommended use cases?

The model is well suited to Tamil speech transcription, particularly in applications that need high accuracy, such as subtitling, content moderation, and speech analytics. It can be deployed in both research and production environments.
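For the subtitling use case, the sketch below converts timestamped transcript chunks into SRT text. It assumes chunks shaped like the output of the `transformers` ASR pipeline when called with `return_timestamps=True`, that is, a list of dicts with a `"timestamp"` (start, end) pair in seconds and a `"text"` string. The helper names are hypothetical, and falling back to the start time when an end time is missing is a simplification.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def chunks_to_srt(chunks: list) -> str:
    """Render timestamped chunks as numbered SRT cues."""
    cues = []
    for index, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]
        if end is None:
            # The final chunk may lack an end time; reuse the start time.
            end = start
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{chunk['text'].strip()}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(cues)


if __name__ == "__main__":
    example = [
        {"timestamp": (0.0, 2.5), "text": " வணக்கம்"},
        {"timestamp": (2.5, 5.0), "text": " நன்றி"},
    ]
    print(chunks_to_srt(example))
```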
