Belle-whisper-large-v3-zh-punct

Property	Value
License	Apache 2.0
Author	BELLE-2
Framework	PyTorch
Task	Automatic Speech Recognition

What is Belle-whisper-large-v3-zh-punct?

Belle-whisper-large-v3-zh-punct is a Chinese automatic speech recognition (ASR) model built on Whisper large-v3 and fine-tuned to produce punctuated transcripts. It keeps recognition accuracy on standard benchmarks close to that of the base model, and its punctuation handling is derived from the punc_ct-transformer model.

Implementation Details

The model was fine-tuned with LoRA on several Chinese speech datasets, including AISHELL-1, AISHELL-2, WenetSpeech, and HKUST. It expects audio sampled at 16 kHz. Reported Character Error Rates (CER) include 2.945% on AISHELL-1 and 10.973% on the WenetSpeech meeting dataset.
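Because the model works at a 16 kHz sample rate, it can help to check an input file before transcription. The following is a minimal sketch using only the Python standard library; the helper names wav_sample_rate and needs_resampling are hypothetical, and it assumes uncompressed PCM WAV input.

```python
import wave

TARGET_SAMPLE_RATE = 16_000  # 16 kHz, the rate stated for this model


def wav_sample_rate(path: str) -> int:
    """Return the sample rate in Hz stored in a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path: str, target: int = TARGET_SAMPLE_RATE) -> bool:
    """Return True when the file's sample rate differs from the target rate."""
    return wav_sample_rate(path) != target
```

A file for which needs_resampling returns True would have to be converted to 16 kHz by an audio tool or library of your choice before its raw samples are passed to the model.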

Advanced punctuation handling while maintaining base ASR performance
Optimized for complex acoustic environments
Reaches 10.973% CER in meeting scenarios (WenetSpeech meeting dataset)
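For readers unfamiliar with the metric, CER is the character-level edit distance between a reference transcript and the model output, divided by the reference length. The sketch below is a generic implementation, not the scoring script behind the figures quoted here. It assumes punctuation is stripped before scoring, which is a common choice when the reference transcripts are unpunctuated; how the published numbers were scored is not stated in this document.

```python
import unicodedata


def strip_punctuation(text: str) -> str:
    """Drop punctuation and whitespace so only scored characters remain."""
    return "".join(
        ch for ch in text
        if not ch.isspace() and not unicodedata.category(ch).startswith("P")
    )


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    previous = list(range(len(hyp) + 1))
    for i, ref_ch in enumerate(ref, start=1):
        current = [i]
        for j, hyp_ch in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_ch != hyp_ch)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def cer(reference: str, hypothesis: str, ignore_punctuation: bool = True) -> float:
    """Character Error Rate: edit distance divided by reference length."""
    if ignore_punctuation:  # assumption: punctuation is not scored
        reference = strip_punctuation(reference)
        hypothesis = strip_punctuation(hypothesis)
    if not reference:
        raise ValueError("reference must contain at least one scored character")
    return edit_distance(reference, hypothesis) / len(reference)
```

With this definition, a six-character reference with one substituted character gives a CER of 1/6, and the same pair scored with ignore_punctuation=False is penalized further for every punctuation mark the reference lacks.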

Core Capabilities

Chinese speech recognition with a CER as low as 2.945% on AISHELL-1
Enhanced punctuation mark processing
Robust performance across various acoustic environments
Seamless integration with the Transformers pipeline
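The Transformers pipeline integration listed above can be sketched as follows. This is an illustrative example rather than the authors' reference code: the Hub repository id is assumed from the author and model name given in this document, the helper names build_transcriber and transcribe are hypothetical, and the transformers package with a PyTorch backend is assumed to be installed.

```python
MODEL_ID = "BELLE-2/Belle-whisper-large-v3-zh-punct"  # assumed Hub repo id


def build_transcriber(model_id: str = MODEL_ID):
    """Create an ASR pipeline; model weights are downloaded on first use."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_id)


def transcribe(transcriber, audio_path: str) -> str:
    """Transcribe one audio file and return the punctuated text."""
    result = transcriber(
        audio_path,
        # Pin Whisper's language and task instead of relying on auto-detection.
        generate_kwargs={"language": "zh", "task": "transcribe"},
    )
    return result["text"]
```

A typical call would be transcribe(build_transcriber(), "my_audio.wav"), where my_audio.wav is a placeholder path. Building the pipeline once and reusing it across files avoids reloading the weights for each transcription.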

Frequently Asked Questions

Q: What makes this model unique?

The model pairs Chinese ASR with punctuated output. The punctuation capability was added through LoRA fine-tuning, without compromising the base model's recognition performance.

Q: What are the recommended use cases?

The model is well suited to Chinese speech transcription that needs accurate punctuation, particularly in acoustically challenging settings such as meetings.
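One practical benefit of punctuated output is that a transcript can be split into sentences for subtitles or meeting notes. The sketch below is an illustration of that downstream step, not part of the model. The function name split_sentences and the chosen set of sentence-final marks are assumptions.

```python
import re

# A run of non-terminal characters followed by any sentence-final marks.
# Covers full-width Chinese marks and their ASCII equivalents.
_SENTENCE = re.compile(r"[^。！？!?]+[。！？!?]*")


def split_sentences(text: str) -> list[str]:
    """Split a punctuated transcript into sentences, keeping end marks."""
    sentences = (match.strip() for match in _SENTENCE.findall(text))
    return [sentence for sentence in sentences if sentence]
```

Consecutive end marks stay attached to the sentence they close, and text without any end mark is returned as a single item.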