lite-whisper-large-v3-turbo-fast

Maintained By
efficient-speech

  • Model Type: Speech Recognition
  • Encoder Size: 313M parameters
  • Decoder Size: 172M parameters
  • Average WER: 20.1%
  • Author: efficient-speech
  • Repository: HuggingFace

What is lite-whisper-large-v3-turbo-fast?

Lite-Whisper large-v3-turbo-fast is a highly compressed version of OpenAI's Whisper model, developed using LiteASR technology. It represents a significant achievement in model compression, reducing the encoder size to 313M parameters while maintaining the turbo decoder at 172M parameters.

Implementation Details

The model implements an optimized architecture that trades some accuracy for speed, making it suitable for applications where rapid processing is crucial. Compared to the original Whisper large-v3 encoder (635M parameters), this model cuts the encoder roughly in half while preserving usable speech recognition quality.

  • Compressed encoder: 313M parameters (down from 635M)
  • Turbo decoder: 172M parameters
  • Optimized for faster inference
  • Word Error Rate (WER): 20.1% average on the ESB benchmark datasets
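For context on the WER figure above: word error rate is conventionally computed as the word-level edit distance between the reference transcript and the model's hypothesis, divided by the number of reference words. A minimal sketch (the `wer` helper below is illustrative, not part of the model's tooling):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming table for Levenshtein distance over words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)
```

A WER of 20.1% means roughly one word in five is substituted, inserted, or deleted relative to the reference, e.g. `wer("a b c d", "a x c d")` returns `0.25`.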

Core Capabilities

  • Fast speech recognition processing
  • Reduced memory footprint
  • Efficient deployment potential
  • Balance between speed and accuracy

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its high compression ratio while retaining the turbo decoder architecture. It's specifically designed for scenarios where processing speed is prioritized over maximum accuracy.

Q: What are the recommended use cases?

The model is best suited for applications requiring real-time or near-real-time speech recognition where some trade-off in accuracy is acceptable. It's particularly valuable in resource-constrained environments or when processing speed is crucial.
