whisper-tiny-zh

Maintained by: xmzhu

Whisper Tiny Chinese

Property            Value
License             Apache 2.0
Training Dataset    Mozilla Common Voice 11.0 (Chinese)
Best WER Score      91.09%
Framework           PyTorch 1.13.1

What is whisper-tiny-zh?

Whisper-tiny-zh is a compact Chinese automatic speech recognition (ASR) model based on OpenAI's Whisper architecture. It is a fine-tuned version of the original whisper-tiny model, trained on the Chinese portion of the Mozilla Common Voice 11.0 dataset for Mandarin Chinese ASR tasks.

Implementation Details

The model was trained with a learning rate of 1e-05, a batch size of 64, and the Adam optimizer. Training ran for 5000 steps with a linear learning rate schedule and 500 warmup steps, using native AMP mixed-precision training.

  • Trained for 11 epochs, with training loss decreasing from 0.9397 to 0.3166
  • Uses a transformer-based architecture with a PyTorch backend
  • Logs training metrics to TensorBoard for visualization
  • Supports automatic speech recognition for Chinese
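
The learning rate schedule described above (linear, 500 warmup steps, 5000 total steps, peak rate 1e-05) can be sketched in plain Python. This is an illustrative reconstruction, not the training code: the function name `linear_warmup_lr` is hypothetical, and it assumes the common convention of ramping linearly from zero to the peak rate during warmup and then decaying linearly to zero at the final step.

```python
def linear_warmup_lr(step, peak_lr=1e-5, warmup_steps=500, total_steps=5000):
    """Learning rate at a given step for a linear schedule with warmup.

    Assumed convention: ramp from 0 to peak_lr over the warmup steps,
    then decay linearly so the rate reaches 0 at total_steps.
    """
    if step < warmup_steps:
        # Warmup phase: linear ramp from 0 up to the peak rate.
        return peak_lr * step / warmup_steps
    # Decay phase: linear ramp from the peak rate down to 0.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the rate at a few points across the 5000-step run.
    for s in (0, 250, 500, 2750, 5000):
        print(s, linear_warmup_lr(s))
```

Under this convention the rate peaks exactly at step 500 and is at half its peak both midway through warmup and midway through the decay phase.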

Core Capabilities

  • Specialized Chinese speech recognition
  • Efficient inference with small model footprint
  • Integration with popular ML frameworks
  • Support for batch processing of audio inputs
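
As a rough illustration of batch transcription, the sketch below uses the Hugging Face `transformers` ASR pipeline. Several things here are assumptions rather than facts from this page: the Hub identifier `xmzhu/whisper-tiny-zh` is inferred from the maintainer and model names, the helper names `build_transcriber` and `transcribe` are hypothetical, and the batch size and 30-second chunk length are arbitrary example values.

```python
# Assumed Hub identifier, inferred from the maintainer and model name.
MODEL_ID = "xmzhu/whisper-tiny-zh"


def build_transcriber(model_id=MODEL_ID, batch_size=8):
    """Create an ASR pipeline (downloads the model on first use)."""
    # Imported lazily so the rest of this module works without transformers.
    from transformers import pipeline

    return pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,      # split long audio into 30-second chunks
        batch_size=batch_size,  # number of chunks processed per forward pass
    )


def transcribe(paths, transcriber):
    """Transcribe a list of audio files and return the text of each."""
    # Given a list of inputs, the pipeline returns one result dict per input.
    results = transcriber(list(paths))
    return [result["text"] for result in results]
```

A call might look like `transcribe(["a.wav", "b.wav"], build_transcriber())`, where the file names are placeholders. Decoding audio files from disk also requires ffmpeg to be installed.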

Frequently Asked Questions

Q: What makes this model unique?

The model combines fine-tuning on Chinese speech with the small footprint of the tiny Whisper architecture. It balances resource efficiency against capability for Chinese ASR tasks.

Q: What are the recommended use cases?

The model is best suited for Chinese speech recognition tasks where computational resources are limited, such as real-time transcription of Mandarin Chinese speech. Note that the reported word error rate (WER) is 91.09%. WER is an error metric, so lower is better, and this figure should be weighed against the accuracy requirements of the application.
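
To make the WER figure easier to interpret, here is a minimal sketch of how an error rate of this kind is computed: the edit distance (substitutions, deletions, insertions) between reference and hypothesis tokens, divided by the number of reference tokens. The function names and the example sentences are illustrative only. The source does not say which tokenization was used for the reported score; for unsegmented Chinese text, whitespace tokenization treats a whole sentence as one word, so word-level and character-level rates can differ substantially.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_tokens, hyp_tokens):
    """Edit distance normalised by the reference length."""
    if not ref_tokens:
        raise ValueError("reference must contain at least one token")
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


# Illustrative sentences that differ in a single character.
reference = "今天天气很好"
hypothesis = "今天天气真好"

# Whitespace tokenization: each sentence is one "word".
word_level = error_rate(reference.split(), hypothesis.split())
# Character tokenization: each character is a token.
char_level = error_rate(list(reference), list(hypothesis))

if __name__ == "__main__":
    print(f"word-level: {word_level:.2%}, character-level: {char_level:.2%}")
```

Because the numerator counts insertions as well, an error rate computed this way can exceed 100% when the hypothesis is much longer than the reference.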
