# Lite-Whisper Large-v3-turbo-fast
| Property | Value |
|---|---|
| Model Type | Speech Recognition |
| Encoder Size | 313M parameters |
| Decoder Size | 172M parameters |
| Average WER | 20.1% |
| Author | efficient-speech |
| Repository | HuggingFace |
## What is lite-whisper-large-v3-turbo-fast?
Lite-Whisper large-v3-turbo-fast is a heavily compressed variant of OpenAI's Whisper model, created with the LiteASR compression technique. It reduces the encoder to 313M parameters while keeping the 172M-parameter turbo decoder.
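As a sketch of how loading might look (assuming the checkpoint is published as `efficient-speech/lite-whisper-large-v3-turbo-fast` on the Hugging Face Hub and requires `trust_remote_code=True` for its custom architecture — neither detail is confirmed by this card):

```python
# Hypothetical loading sketch. The repo id, the trust_remote_code requirement,
# and processor compatibility with the stock large-v3 processor are all
# assumptions, not facts stated by this model card.
MODEL_ID = "efficient-speech/lite-whisper-large-v3-turbo-fast"  # assumed repo id


def load_model(device: str = "cpu"):
    """Load the compressed encoder/decoder pair and a compatible processor."""
    # Imports kept local so the sketch can be read without transformers installed.
    from transformers import AutoModel, AutoProcessor

    model = AutoModel.from_pretrained(MODEL_ID, trust_remote_code=True)
    # LiteASR compresses the encoder but keeps Whisper's input/output format,
    # so the original large-v3 processor is assumed to remain compatible.
    processor = AutoProcessor.from_pretrained("openai/whisper-large-v3")
    return model.to(device).eval(), processor
```

In practice you would pass 16 kHz audio through the processor to get log-mel features, then call the model's `generate` method to decode text, as with any Whisper-family checkpoint.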
## Implementation Details
The model trades some accuracy for speed, making it suitable for applications where rapid processing is crucial. Compared with the original Whisper large-v3 encoder (635M parameters), it cuts the encoder roughly in half while maintaining functional speech recognition.
- Compressed encoder: 313M parameters (down from 635M)
- Turbo decoder: 172M parameters
- Optimized for faster inference
- Average Word Error Rate (WER): 20.1% across the ESB benchmark datasets
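The figures above can be sanity-checked with a quick back-of-the-envelope calculation (parameter counts taken from this card; the totals computed below are derived, not quoted):

```python
# Back-of-the-envelope check of the compression figures quoted above.
ORIG_ENCODER_M = 635   # Whisper large-v3 encoder, millions of parameters
LITE_ENCODER_M = 313   # compressed encoder (this model)
DECODER_M = 172        # turbo decoder, unchanged

encoder_reduction = 1 - LITE_ENCODER_M / ORIG_ENCODER_M
total_lite_m = LITE_ENCODER_M + DECODER_M

print(f"encoder reduction: {encoder_reduction:.0%}")     # ~51%
print(f"total lite model:  {total_lite_m}M parameters")  # 485M
```

So the compressed encoder removes roughly half the encoder's parameters, leaving a combined model of about 485M parameters.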
## Core Capabilities
- Fast speech recognition processing
- Reduced memory footprint
- Efficient deployment potential
- Balance between speed and accuracy
## Frequently Asked Questions
### Q: What makes this model unique?
This model stands out for its significant compression rate while maintaining the turbo decoder architecture. It's specifically designed for scenarios where processing speed is prioritized over maximum accuracy.
### Q: What are the recommended use cases?
The model is best suited for applications requiring real-time or near-real-time speech recognition where some trade-off in accuracy is acceptable. It's particularly valuable in resource-constrained environments or when processing speed is crucial.