whisper-large-v3-hindi

Maintained By
kasunw

Whisper Large V3 Hindi

Base Model: openai/whisper-large-v3
Training Method: LoRA Fine-tuning
Task: Hindi Speech Recognition
Dataset: Common Voice 13.0 Hindi
Model Author: kasunw

What is whisper-large-v3-hindi?

Whisper-large-v3-hindi is a specialized automatic speech recognition (ASR) model fine-tuned specifically for Hindi language processing. Built upon OpenAI's Whisper large-v3 architecture, this model leverages LoRA (Low-Rank Adaptation) training techniques to optimize performance for Hindi speech recognition while maintaining computational efficiency.

Implementation Details

The model implementation utilizes the PEFT (Parameter-Efficient Fine-Tuning) framework and Transformers library. It's designed to run with FP16 precision on compatible hardware and includes optimizations for batch processing of audio segments.

  • Supports processing of 30-second audio chunks
  • Processes chunks in batches of 16
  • Includes timestamp generation capability
  • Optimized for both CPU and GPU deployment
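
The snippet below is a minimal loading and inference sketch based on the details above. The adapter repository ID ("kasunw/whisper-large-v3-hindi") and the sample file name are assumptions; point them at the actual adapter weights and your own audio before use.

```python
import torch
from peft import PeftModel
from transformers import WhisperForConditionalGeneration, WhisperProcessor, pipeline

# FP16 is used on GPU as described above; fall back to FP32 on CPU.
use_cuda = torch.cuda.is_available()
dtype = torch.float16 if use_cuda else torch.float32

# Load the base Whisper model, then attach the Hindi LoRA adapter.
# NOTE: the adapter repo ID below is an assumption; adjust it to the real adapter location.
base = WhisperForConditionalGeneration.from_pretrained(
    "openai/whisper-large-v3", torch_dtype=dtype
)
model = PeftModel.from_pretrained(base, "kasunw/whisper-large-v3-hindi")
model = model.merge_and_unload()  # fold LoRA weights into the base model for plain inference

processor = WhisperProcessor.from_pretrained("openai/whisper-large-v3")

asr = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    chunk_length_s=30,       # 30-second audio chunks
    batch_size=16,           # batched processing of chunks
    return_timestamps=True,  # emit segment timestamps
    device=0 if use_cuda else -1,
)

# "hindi_sample.wav" is a placeholder path for your own audio file.
result = asr(
    "hindi_sample.wav",
    generate_kwargs={"language": "hindi", "task": "transcribe"},
)
print(result["text"])
```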

Core Capabilities

  • Hindi speech-to-text transcription
  • Efficient processing of long audio files through chunking
  • Timestamp generation for word alignment
  • Support for both inference and fine-tuning workflows
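
As a rough illustration of the timestamp output, the sketch below reuses the asr pipeline from the previous example to print subtitle-style segments; the file name is again a placeholder.

```python
# Reuses the "asr" pipeline defined in the previous sketch.
output = asr(
    "long_hindi_recording.wav",  # placeholder path
    generate_kwargs={"language": "hindi", "task": "transcribe"},
)

# Each chunk carries a (start, end) timestamp in seconds; the final end
# value can be None if the last segment runs to the end of the file.
for chunk in output["chunks"]:
    start, end = chunk["timestamp"]
    end = end if end is not None else start
    print(f"[{start:7.2f}s - {end:7.2f}s] {chunk['text']}")
```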

Frequently Asked Questions

Q: What makes this model unique?

This model combines the robust capabilities of Whisper large-v3 with specialized Hindi language optimization through LoRA fine-tuning, making it particularly effective for Hindi ASR tasks while maintaining memory efficiency.

Q: What are the recommended use cases?

The model is ideal for Hindi speech transcription tasks, including subtitle generation, voice command processing, and general speech-to-text applications focused on Hindi-language content. It is particularly suitable for applications that require batch processing of audio files, as sketched below.
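
For such batch workloads, the pipeline can also be called on a list of files. The sketch below uses hypothetical file names and reuses the asr pipeline from the first example.

```python
# Hypothetical file names; the pipeline batches the underlying 30-second chunks.
files = ["clip_01.wav", "clip_02.wav", "clip_03.wav"]
outputs = asr(files, generate_kwargs={"language": "hindi", "task": "transcribe"})

for path, out in zip(files, outputs):
    print(f"{path}: {out['text']}")
```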
