whisper-small-cantonese

alvanlii

A Whisper-small model fine-tuned for Cantonese speech recognition. It reaches a 7.93% character error rate (CER) when punctuation is excluded, offers fast inference, and was trained on a large, diverse collection of Cantonese datasets.

Property          Value
Parameter Count   242M
License           Apache 2.0
Paper             Research Paper
Model Type        Automatic Speech Recognition
CER Score         7.93% (without punctuation)

What is whisper-small-cantonese?

whisper-small-cantonese is a speech recognition model fine-tuned from OpenAI's Whisper-small architecture specifically for Cantonese. It was trained on over 934 hours of diverse data, including Common Voice, CantoMap, and YouTube content.

Implementation Details

The model uses the transformer-based Whisper architecture and supports both the standard attention implementation and Flash Attention. On GPU, Flash Attention reduces inference time from 0.308s to 0.055s per sample, a speedup of roughly 5.6x.
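The choice between the two attention implementations can be sketched with the Hugging Face transformers pipeline. This is a minimal sketch under stated assumptions, not the author's own loading code: the Hub id `alvanlii/whisper-small-cantonese` is assumed from the author and model name on this page, the helper names are hypothetical, and `flash_attention_2` additionally requires the separate flash-attn package, a supported GPU, and half precision.

```python
MODEL_ID = "alvanlii/whisper-small-cantonese"  # assumed Hub id (author/model name)


def build_pipeline_kwargs(use_flash_attention: bool, device: str = "cuda:0") -> dict:
    """Collect the arguments for transformers.pipeline (hypothetical helper)."""
    attn = "flash_attention_2" if use_flash_attention else "eager"
    return {
        "task": "automatic-speech-recognition",
        "model": MODEL_ID,
        "device": device,
        # attn_implementation is forwarded to from_pretrained via model_kwargs
        "model_kwargs": {"attn_implementation": attn},
    }


def load_asr(use_flash_attention: bool = True, device: str = "cuda:0"):
    """Build the ASR pipeline. Imports are lazy so the helpers above
    can be used without torch/transformers installed."""
    import torch
    from transformers import pipeline

    kwargs = build_pipeline_kwargs(use_flash_attention, device)
    # Half precision is required by Flash Attention 2.
    # Recent transformers releases may prefer the name `dtype`.
    kwargs["torch_dtype"] = torch.float16
    return pipeline(**kwargs)


def speedup(baseline_s: float, optimized_s: float) -> float:
    """Ratio of per-sample latencies."""
    return baseline_s / optimized_s


if __name__ == "__main__":
    # Latencies quoted on this page: 0.308s (standard) vs 0.055s (Flash Attention)
    print(f"speedup: {speedup(0.308, 0.055):.1f}x")
```

Calling `load_asr()("sample.wav")["text"]` would then return the transcript for a hypothetical local file `sample.wav`; passing `use_flash_attention=False` falls back to the standard implementation.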

  • GPU VRAM Usage: ~1.5GB
  • Supports speculative decoding for faster processing
  • Compatible with Whisper.cpp and WhisperX/FasterWhisper via CT2
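The CT2 (CTranslate2) route mentioned in the list above could look roughly like the following. It is a hedged sketch, assuming the Hub id `alvanlii/whisper-small-cantonese`, an installed `ctranslate2` and `faster-whisper`, and that the model repository ships `tokenizer.json` and `preprocessor_config.json`; the helper names and the output directory layout are hypothetical.

```python
from pathlib import Path

MODEL_ID = "alvanlii/whisper-small-cantonese"  # assumed Hub id (author/model name)


def ct2_output_dir(model_id: str, root: str = "ct2_models") -> Path:
    """Hypothetical naming scheme for the converted model directory."""
    return Path(root) / model_id.replace("/", "--")


def convert_to_ct2(model_id: str = MODEL_ID, quantization: str = "float16") -> Path:
    """Convert the transformers checkpoint to CTranslate2 format."""
    from ctranslate2.converters import TransformersConverter  # lazy import

    out_dir = ct2_output_dir(model_id)
    converter = TransformersConverter(
        model_id,
        # Assumes these files exist in the repository; faster-whisper needs them.
        copy_files=["tokenizer.json", "preprocessor_config.json"],
    )
    converter.convert(str(out_dir), quantization=quantization, force=True)
    return out_dir


def transcribe(audio_path: str, model_dir: Path) -> str:
    """Transcribe one file with faster-whisper using the converted model."""
    from faster_whisper import WhisperModel  # lazy import

    model = WhisperModel(str(model_dir), device="cuda", compute_type="float16")
    segments, _info = model.transcribe(audio_path)
    # segments is a generator; decoding happens while iterating over it
    return "".join(segment.text for segment in segments)
```

A typical flow would be `transcribe("sample.wav", convert_to_ct2())`, where `sample.wav` is a placeholder path. On a CPU-only machine, `device="cpu"` with `compute_type="int8"` is a common alternative.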

Core Capabilities

  • Fast inference with Flash Attention support
  • Excellent accuracy with 7.93% CER (without punctuation)
  • Efficient processing of long-form audio
  • Flexible deployment options (CPU/GPU)
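For readers unfamiliar with the metric, the "CER without punctuation" figure can be illustrated with a small, dependency-free sketch. It shows the generic definition (character-level edit distance divided by reference length, after stripping punctuation) and is not the evaluation script behind the 7.93% score; dropping whitespace along with punctuation is an assumption made here.

```python
import unicodedata


def strip_punctuation(text: str) -> str:
    """Remove Unicode punctuation (categories P*) and whitespace."""
    return "".join(
        ch for ch in text
        if not unicodedata.category(ch).startswith("P") and not ch.isspace()
    )


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance over characters, computed row by row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (free if equal)
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate with punctuation ignored on both sides."""
    ref = strip_punctuation(reference)
    hyp = strip_punctuation(hypothesis)
    if not ref:
        raise ValueError("reference is empty after removing punctuation")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One wrong character out of six; the trailing punctuation is ignored
    print(cer("我今日好開心。", "我今日好開森"))
```

Because Cantonese text is not whitespace-delimited, character-level error is the usual choice over word error rate for this kind of evaluation.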

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for being optimized specifically for Cantonese, for its extensive training data (including pseudo-labeled content), and for its balance of speed and accuracy. It achieves a 7.93% CER (without punctuation) while needing only about 1.5GB of GPU VRAM.

Q: What are the recommended use cases?

The model is ideal for Cantonese speech transcription tasks, particularly in applications requiring real-time or near-real-time processing. It's suitable for both production environments and research applications, especially when dealing with varied Cantonese dialects and accents.
