Lite-Whisper Large-v3-Turbo
| Property | Value |
|---|---|
| Encoder Size | 374M parameters |
| Decoder Size | 172M parameters |
| WER (ESB average) | 12.6% |
| Model Type | Speech Recognition |
| Paper | LiteASR Paper |
What is lite-whisper-large-v3-turbo?
Lite-Whisper large-v3-turbo is a compressed version of OpenAI's Whisper model, produced with the LiteASR compression method. It significantly reduces model size while maintaining reasonable accuracy, pairing a compressed 374M-parameter encoder with a streamlined 172M-parameter decoder.
Implementation Details
The model is compressed with LiteASR, which applies low-rank approximation to the encoder, yielding a substantial size reduction compared to the original Whisper large-v3 (635M encoder, 907M decoder); the 172M-parameter decoder is inherited from the large-v3-turbo variant. Despite the compression, the model maintains a competitive average Word Error Rate (WER) of 12.6% on the ESB datasets.
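For context, WER counts the substitutions, deletions, and insertions needed to turn a hypothesis transcript into the reference, divided by the number of reference words. The snippet below is a minimal sketch of that metric using the third-party `jiwer` package on made-up strings; it is not the ESB evaluation harness.

```python
# Toy WER check with the third-party jiwer package (an assumption, not a model
# dependency); the reported 12.6% comes from the ESB benchmark, not this snippet.
from jiwer import wer

reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown fox jumped over the lazy dog"

# WER = (substitutions + deletions + insertions) / number of reference words
print(f"WER: {wer(reference, hypothesis):.1%}")  # 1 substitution / 9 words ≈ 11.1%
```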
- Compressed encoder architecture (374M parameters)
- Optimized decoder design (172M parameters)
- Efficient parameter utilization through low-rank approximation (see the sketch after this list)
- Balance between model size and performance
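The low-rank idea can be illustrated on a single linear layer: factor its weight matrix into two thinner matrices so that far fewer parameters approximate the same projection. The sketch below uses a plain truncated SVD for clarity; LiteASR itself calibrates its low-rank factors on encoder activations, which this toy example does not do, and `low_rank_linear`, the layer sizes, and the chosen rank are illustrative only.

```python
# Toy illustration of compressing one linear layer with a truncated SVD.
# LiteASR derives its factors from real encoder activations; this sketch only
# demonstrates the parameter savings of the factorization itself.
import torch
import torch.nn as nn

def low_rank_linear(layer: nn.Linear, rank: int) -> nn.Sequential:
    """Replace `layer` with two thinner linears whose product approximates its weight."""
    W = layer.weight.data                               # (out_features, in_features)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    first = nn.Linear(layer.in_features, rank, bias=False)
    second = nn.Linear(rank, layer.out_features, bias=layer.bias is not None)
    with torch.no_grad():
        first.weight.copy_(S[:rank, None] * Vh[:rank, :])   # diag(S_r) @ Vh_r
        second.weight.copy_(U[:, :rank])                     # U_r
        if layer.bias is not None:
            second.bias.copy_(layer.bias)
    return nn.Sequential(first, second)

# Hypothetical 1280x1280 projection (Whisper large hidden size) at rank 320:
# ~1.64M weights shrink to ~0.82M.
dense = nn.Linear(1280, 1280)
compressed = low_rank_linear(dense, rank=320)
x = torch.randn(4, 1280)
rel_err = (dense(x) - compressed(x)).norm() / dense(x).norm()
print(f"relative error: {rel_err:.3f}")  # larger for random weights than for trained, structured ones
```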
Core Capabilities
- Automatic Speech Recognition (ASR) with competitive accuracy (usage sketch after this list)
- Efficient processing with reduced computational requirements
- Maintains reasonable performance despite significant compression
- Suitable for resource-constrained environments
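A typical way to exercise these capabilities is the standard Whisper transcription flow in transformers. The sketch below assumes the model is published on the Hugging Face Hub under `efficient-speech/lite-whisper-large-v3-turbo`, that it loads through `AutoModel` with `trust_remote_code=True`, and that it exposes the usual `generate()` API; check the model repository for the exact loading instructions.

```python
# Sketch of the standard Whisper transcription flow in transformers; the repo id,
# trust_remote_code requirement, and generate() support are assumptions -- check
# the model card on the Hub for the exact instructions.
import torch
import librosa
from transformers import AutoModel, AutoProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModel.from_pretrained(
    "efficient-speech/lite-whisper-large-v3-turbo",   # assumed repo id
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
    trust_remote_code=True,                           # custom Lite-Whisper architecture
).to(device)
# Tokenizer and feature extractor are unchanged from the base model.
processor = AutoProcessor.from_pretrained("openai/whisper-large-v3-turbo")

# Whisper expects 16 kHz mono audio.
audio, _ = librosa.load("sample.wav", sr=16_000)
inputs = processor(audio, sampling_rate=16_000, return_tensors="pt")
features = inputs.input_features.to(device, dtype=model.dtype)

predicted_ids = model.generate(features)
print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])
```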
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for compressing the Whisper architecture while retaining competitive accuracy. Relative to the original Whisper large-v3 (635M encoder, 907M decoder), it cuts the encoder size by roughly 41% and the decoder size by roughly 81%.
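Those percentages follow directly from the parameter counts quoted above, as a quick check shows:

```python
# Quick check of the quoted reductions against the parameter counts above.
encoder_reduction = 1 - 374 / 635   # ≈ 0.41
decoder_reduction = 1 - 172 / 907   # ≈ 0.81
print(f"encoder: {encoder_reduction:.0%}, decoder: {decoder_reduction:.0%}")
```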
Q: What are the recommended use cases?
The model is ideal for applications requiring efficient speech recognition where computational resources are limited. It's particularly suitable for deployment in environments where model size is a constraint but reasonable accuracy is still required.