Belle-whisper-large-v3-zh-punct

Maintained By
BELLE-2

Property: Value
License: Apache 2.0
Author: BELLE-2
Framework: PyTorch
Task: Automatic Speech Recognition

What is Belle-whisper-large-v3-zh-punct?

Belle-whisper-large-v3-zh-punct is a Chinese automatic speech recognition (ASR) model that builds on the Whisper large-v3 architecture and adds punctuation to its transcripts. It retains strong performance on standard Chinese ASR benchmarks, and its punctuation handling is derived from the punc_ct-transformer model.

Implementation Details

The model was fine-tuned with LoRA on several Chinese speech datasets, including AISHELL-1, AISHELL-2, WenetSpeech, and HKUST. It operates on audio at a 16 kHz sample rate and is evaluated by Character Error Rate (CER) on several benchmarks.

  • Advanced punctuation handling while maintaining base ASR performance
  • Optimized for complex acoustic environments
  • Demonstrates improved performance in meeting scenarios (10.973% CER on the WenetSpeech meeting dataset)
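
The CER figures quoted here are character-level edit distances divided by the reference length. The following is a minimal sketch of that calculation, not the authors' evaluation script. The helper names (`strip_punct`, `edit_distance`, `cer`) are hypothetical, and stripping punctuation before scoring is an assumption: it is a common step when the model emits punctuation but the reference transcripts do not.

```python
import unicodedata


def strip_punct(text: str) -> str:
    """Drop Unicode punctuation and whitespace (assumed pre-scoring step)."""
    return "".join(
        ch for ch in text
        if not unicodedata.category(ch).startswith("P") and not ch.isspace()
    )


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

For example, `cer("今天天气很好", "今天天汽很好")` counts one substituted character out of six. Multiplying the result by 100 gives a percentage in the same form as the figures above.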

Core Capabilities

  • Chinese speech recognition with CER as low as 2.945% on AISHELL-1
  • Punctuated transcripts
  • Robust performance across various acoustic environments
  • Works with the Transformers pipeline
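
A minimal sketch of the Transformers pipeline integration follows. It assumes the model is published on the Hugging Face Hub under the ID `BELLE-2/Belle-whisper-large-v3-zh-punct` and that a recent `transformers` release is installed. The helper names `build_transcriber` and `transcribe` are hypothetical, and passing the language and task through `generate_kwargs` is one possible way to steer Whisper decoding, not necessarily the authors' recommended setup.

```python
# Assumed Hub model ID; adjust if the model is hosted under another name.
MODEL_ID = "BELLE-2/Belle-whisper-large-v3-zh-punct"


def build_transcriber(model_id: str = MODEL_ID):
    """Create an ASR pipeline for the given model (downloads the weights)."""
    # Imported lazily so the module can be loaded without transformers.
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_id)


def transcribe(transcriber, audio_path: str) -> str:
    """Transcribe one audio file to punctuated Chinese text."""
    result = transcriber(
        audio_path,
        # Ask Whisper to transcribe (not translate) and to decode Chinese.
        generate_kwargs={"language": "zh", "task": "transcribe"},
    )
    return result["text"]
```

Typical use would be `transcriber = build_transcriber()` followed by `transcribe(transcriber, "my_audio.wav")`, where `my_audio.wav` is a placeholder file name.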

Frequently Asked Questions

Q: What makes this model unique?

The model combines Chinese ASR with punctuation output. The punctuation capability was added through LoRA fine-tuning without compromising the base model's recognition performance.

Q: What are the recommended use cases?

The model is well suited to Chinese speech transcription that needs accurate punctuation, especially in challenging acoustic conditions such as meetings.
