# Lite-Whisper Large-v3-turbo-fast
| Property | Value |
|---|---|
| Model Type | Speech Recognition |
| Encoder Size | 313M parameters |
| Decoder Size | 172M parameters |
| Average WER | 20.1% |
| Author | efficient-speech |
| Repository | HuggingFace |
## What is lite-whisper-large-v3-turbo-fast?
Lite-Whisper large-v3-turbo-fast is a heavily compressed variant of OpenAI's Whisper model, created with the LiteASR compression technique. It reduces the encoder to 313M parameters while keeping the 172M-parameter turbo decoder.
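As a sketch of how loading might look (assuming the checkpoint is published as `efficient-speech/lite-whisper-large-v3-turbo-fast` on the Hugging Face Hub and requires `trust_remote_code=True` for its custom architecture — neither detail is confirmed by this card):

```python
# Hypothetical loading sketch. The repo id, the trust_remote_code requirement,
# and processor compatibility with the stock large-v3 processor are all
# assumptions, not facts stated by this model card.
MODEL_ID = "efficient-speech/lite-whisper-large-v3-turbo-fast"  # assumed repo id


def load_model(device: str = "cpu"):
    """Load the compressed encoder/decoder pair and a compatible processor."""
    # Imports kept local so the sketch can be read without transformers installed.
    from transformers import AutoModel, AutoProcessor

    model = AutoModel.from_pretrained(MODEL_ID, trust_remote_code=True)
    # LiteASR compresses the encoder but keeps Whisper's input/output format,
    # so the original large-v3 processor is assumed to remain compatible.
    processor = AutoProcessor.from_pretrained("openai/whisper-large-v3")
    return model.to(device).eval(), processor
```

In practice you would pass 16 kHz audio through the processor to get log-mel features, then call the model's `generate` method to decode text, as with any Whisper-family checkpoint.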
## Implementation Details
The model trades some accuracy for speed, making it suitable for applications where rapid processing is crucial. Compared with the original Whisper large-v3 encoder (635M parameters), it cuts the encoder roughly in half while maintaining functional speech recognition.
- Compressed encoder: 313M parameters (down from 635M)
- Turbo decoder: 172M parameters
- Optimized for faster inference
- Average Word Error Rate (WER): 20.1% across the ESB benchmark datasets
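The figures above can be sanity-checked with a quick back-of-the-envelope calculation (parameter counts taken from this card; the totals computed below are derived, not quoted):

```python
# Back-of-the-envelope check of the compression figures quoted above.
ORIG_ENCODER_M = 635   # Whisper large-v3 encoder, millions of parameters
LITE_ENCODER_M = 313   # compressed encoder (this model)
DECODER_M = 172        # turbo decoder, unchanged

encoder_reduction = 1 - LITE_ENCODER_M / ORIG_ENCODER_M
total_lite_m = LITE_ENCODER_M + DECODER_M

print(f"encoder reduction: {encoder_reduction:.0%}")     # ~51%
print(f"total lite model:  {total_lite_m}M parameters")  # 485M
```

So the compressed encoder removes roughly half the encoder's parameters, leaving a combined model of about 485M parameters.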
## Core Capabilities
- Fast speech recognition processing
- Reduced memory footprint
- Efficient deployment potential
- Balance between speed and accuracy
## Frequently Asked Questions
### Q: What makes this model unique?
This model stands out for its significant compression rate while maintaining the turbo decoder architecture. It's specifically designed for scenarios where processing speed is prioritized over maximum accuracy.
### Q: What are the recommended use cases?
The model is best suited for applications requiring real-time or near-real-time speech recognition where some trade-off in accuracy is acceptable. It's particularly valuable in resource-constrained environments or when processing speed is crucial.