Lite-Whisper Large-v3-Turbo
| Property | Value |
|---|---|
| Encoder Size | 374M parameters |
| Decoder Size | 172M parameters |
| WER (ESB average) | 12.6% |
| Model Type | Speech Recognition |
| Paper | LiteASR Paper |
What is lite-whisper-large-v3-turbo?
Lite-Whisper large-v3-turbo is a compressed version of OpenAI's Whisper model, produced with the LiteASR compression method. It significantly reduces model size while maintaining reasonable accuracy, pairing a compressed 374M-parameter encoder with a streamlined 172M-parameter decoder.
Implementation Details
The model is compressed with LiteASR, which applies low-rank approximation to the encoder, yielding a substantial size reduction compared to the original Whisper large-v3 (635M encoder, 907M decoder); the 172M-parameter decoder is inherited from the large-v3-turbo variant. Despite the compression, the model maintains a competitive average Word Error Rate (WER) of 12.6% on the ESB datasets.
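For context, WER counts the substitutions, deletions, and insertions needed to turn a hypothesis transcript into the reference, divided by the number of reference words. The snippet below is a minimal sketch of that metric using the third-party `jiwer` package on made-up strings; it is not the ESB evaluation harness.

```python
# Toy WER check with the third-party jiwer package (an assumption, not a model
# dependency); the reported 12.6% comes from the ESB benchmark, not this snippet.
from jiwer import wer

reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown fox jumped over the lazy dog"

# WER = (substitutions + deletions + insertions) / number of reference words
print(f"WER: {wer(reference, hypothesis):.1%}")  # 1 substitution / 9 words ≈ 11.1%
```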
- Compressed encoder architecture (374M parameters)
- Optimized decoder design (172M parameters)
- Efficient parameter utilization through low-rank approximation (see the sketch after this list)
- Balance between model size and performance
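The low-rank idea can be illustrated on a single linear layer: factor its weight matrix into two thinner matrices so that far fewer parameters approximate the same projection. The sketch below uses a plain truncated SVD for clarity; LiteASR itself calibrates its low-rank factors on encoder activations, which this toy example does not do, and `low_rank_linear`, the layer sizes, and the chosen rank are illustrative only.

```python
# Toy illustration of compressing one linear layer with a truncated SVD.
# LiteASR derives its factors from real encoder activations; this sketch only
# demonstrates the parameter savings of the factorization itself.
import torch
import torch.nn as nn

def low_rank_linear(layer: nn.Linear, rank: int) -> nn.Sequential:
    """Replace `layer` with two thinner linears whose product approximates its weight."""
    W = layer.weight.data                               # (out_features, in_features)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    first = nn.Linear(layer.in_features, rank, bias=False)
    second = nn.Linear(rank, layer.out_features, bias=layer.bias is not None)
    with torch.no_grad():
        first.weight.copy_(S[:rank, None] * Vh[:rank, :])   # diag(S_r) @ Vh_r
        second.weight.copy_(U[:, :rank])                     # U_r
        if layer.bias is not None:
            second.bias.copy_(layer.bias)
    return nn.Sequential(first, second)

# Hypothetical 1280x1280 projection (Whisper large hidden size) at rank 320:
# ~1.64M weights shrink to ~0.82M.
dense = nn.Linear(1280, 1280)
compressed = low_rank_linear(dense, rank=320)
x = torch.randn(4, 1280)
rel_err = (dense(x) - compressed(x)).norm() / dense(x).norm()
print(f"relative error: {rel_err:.3f}")  # larger for random weights than for trained, structured ones
```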
Core Capabilities
- Automatic Speech Recognition (ASR) with competitive accuracy (usage sketch after this list)
- Efficient processing with reduced computational requirements
- Maintains reasonable performance despite significant compression
- Suitable for resource-constrained environments
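A typical way to exercise these capabilities is the standard Whisper transcription flow in transformers. The sketch below assumes the model is published on the Hugging Face Hub under `efficient-speech/lite-whisper-large-v3-turbo`, that it loads through `AutoModel` with `trust_remote_code=True`, and that it exposes the usual `generate()` API; check the model repository for the exact loading instructions.

```python
# Sketch of the standard Whisper transcription flow in transformers; the repo id,
# trust_remote_code requirement, and generate() support are assumptions -- check
# the model card on the Hub for the exact instructions.
import torch
import librosa
from transformers import AutoModel, AutoProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModel.from_pretrained(
    "efficient-speech/lite-whisper-large-v3-turbo",   # assumed repo id
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
    trust_remote_code=True,                           # custom Lite-Whisper architecture
).to(device)
# Tokenizer and feature extractor are unchanged from the base model.
processor = AutoProcessor.from_pretrained("openai/whisper-large-v3-turbo")

# Whisper expects 16 kHz mono audio.
audio, _ = librosa.load("sample.wav", sr=16_000)
inputs = processor(audio, sampling_rate=16_000, return_tensors="pt")
features = inputs.input_features.to(device, dtype=model.dtype)

predicted_ids = model.generate(features)
print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])
```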
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for compressing the Whisper architecture while retaining competitive accuracy. Relative to the original Whisper large-v3 (635M encoder, 907M decoder), it cuts the encoder size by roughly 41% and the decoder size by roughly 81%.
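Those percentages follow directly from the parameter counts quoted above, as a quick check shows:

```python
# Quick check of the quoted reductions against the parameter counts above.
encoder_reduction = 1 - 374 / 635   # ≈ 0.41
decoder_reduction = 1 - 172 / 907   # ≈ 0.81
print(f"encoder: {encoder_reduction:.0%}, decoder: {decoder_reduction:.0%}")
```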
Q: What are the recommended use cases?
The model is ideal for applications requiring efficient speech recognition where computational resources are limited. It's particularly suitable for deployment in environments where model size is a constraint but reasonable accuracy is still required.