Belle-whisper-large-v3-turbo-zh
| Property | Value |
|---|---|
| License | Apache 2.0 |
| Base Model | openai/whisper-large-v3-turbo |
| Pipeline Tag | Automatic Speech Recognition |
| Framework | PyTorch & Transformers |
What is Belle-whisper-large-v3-turbo-zh?
Belle-whisper-large-v3-turbo-zh is a Chinese automatic speech recognition (ASR) model built on OpenAI's whisper-large-v3-turbo. It is fine-tuned specifically for Chinese and reports a 24-64% relative improvement over the base model across Chinese ASR benchmarks.
Implementation Details
The model was fine-tuned on several Chinese speech datasets: AISHELL-1, AISHELL-2, WenetSpeech, and HKUST. Punctuation is added through integration with the punc_ct-transformer model, which makes transcriptions more readable.
- Supports 16 kHz audio input
- Uses full fine-tuning rather than a parameter-efficient method
- Includes automatic punctuation integration
- Optimized for Chinese language processing
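Because the model expects 16 kHz audio, it can be useful to check a recording's sample rate before transcription. The following is a minimal sketch using only Python's standard `wave` module; the helper names `wav_sample_rate` and `needs_resampling` are illustrative and not part of the model's tooling, and the check only covers uncompressed WAV files.

```python
import wave

TARGET_SAMPLE_RATE = 16_000  # 16 kHz, the input rate listed above


def wav_sample_rate(path):
    """Return the sample rate in Hz stored in a WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path, target=TARGET_SAMPLE_RATE):
    """Return True when the file's sample rate differs from the target rate."""
    return wav_sample_rate(path) != target
```

A file for which `needs_resampling` returns True would have to be converted to 16 kHz by a separate audio tool or library before being passed to the model as raw samples.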
Core Capabilities
- Achieves a 3.07% character error rate (CER) on the AISHELL-1 test set
- Reaches 13.357% CER on meeting transcription
- Fine-tuned on varied speech types: read speech (AISHELL), web and meeting audio (WenetSpeech), and telephone conversations (HKUST)
- Works with the Transformers pipeline API
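The CER figures above count character-level edits (substitutions, deletions, insertions) relative to the length of the reference transcript. The sketch below shows one common way to compute it in plain Python. It is an illustration of the metric, not the evaluation script behind the reported numbers; it strips only spaces, whereas benchmark scoring usually applies further text normalization, such as removing punctuation, which is not specified here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, keeping two rows of the DP table."""
    prev = list(range(len(hyp) + 1))
    for i, ref_item in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_item in enumerate(hyp, start=1):
            cost = 0 if ref_item == hyp_item else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of an extra hypothesis character
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: edit distance divided by reference length."""
    ref = list(reference.replace(" ", ""))
    hyp = list(hypothesis.replace(" ", ""))
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

Under this definition, a CER of 3.07% corresponds to roughly 3 character errors per 100 reference characters.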
Frequently Asked Questions
Q: What makes this model unique?
Its main distinction is accuracy on Chinese speech: it improves on the base Whisper model through full fine-tuning on multiple Chinese speech datasets, and it adds punctuation to its output.
Q: What are the recommended use cases?
The model suits Chinese speech transcription where accuracy matters, such as meeting transcription, general-purpose speech recognition, and applications that need punctuated Chinese text.
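For the transcription use cases above, a minimal sketch with the Transformers pipeline might look like the following. Several details are assumptions rather than facts stated here: the repository ID `BELLE-2/Belle-whisper-large-v3-turbo-zh`, the file name `meeting.wav`, and the use of `generate_kwargs` with `language` and `task`, which recent versions of Transformers accept for Whisper models. The helper names are illustrative, and punctuation restoration is not shown.

```python
# Assumed Hugging Face repository ID; it is not stated in the text above.
MODEL_ID = "BELLE-2/Belle-whisper-large-v3-turbo-zh"


def whisper_generate_kwargs(language="zh", task="transcribe"):
    """Decoding options that pin Whisper to Chinese transcription."""
    return {"language": language, "task": task}


def transcribe(audio_path, model_id=MODEL_ID):
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so the rest of this module loads without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path, generate_kwargs=whisper_generate_kwargs())
    return result["text"]


if __name__ == "__main__":
    # "meeting.wav" is a hypothetical local recording.
    print(transcribe("meeting.wav"))
```

Fixing the language and task avoids relying on Whisper's automatic language detection, which is a reasonable choice when the input is known to be Chinese.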