icefall_asr_tal-csasr_pruned_transducer_stateless5

Maintained By
luomingshuang

Icefall ASR TAL-CSASR Pruned Transducer Stateless5

Property | Value
Author | luomingshuang
Framework | Icefall/K2
Dataset | TAL_CSASR
Training Duration | 30 epochs

What is icefall_asr_tal-csasr_pruned_transducer_stateless5?

This is an automatic speech recognition (ASR) model built with the Icefall framework and trained on the TAL_CSASR dataset, a Chinese-English code-switching corpus. It uses a pruned transducer architecture with a stateless decoder (a prediction network that carries no recurrent state) and is designed to recognize both Chinese and English speech.

Implementation Details

The model is built from an Icefall recipe on top of the k2 toolkit and was trained for 30 epochs on far-field audio data. It is trained with a pruned transducer loss and uses a stateless decoder, a combination that keeps inference efficient while maintaining accuracy.

  • Trained using distributed training across 6 GPUs
  • Supports multiple decoding methods including greedy search, modified beam search, and fast beam search
  • Implements model averaging for improved performance
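Model averaging, the last point above, can be illustrated with a minimal sketch. It assumes plain element-wise averaging of parameters across saved checkpoints. The function name `average_checkpoints` and the toy parameter names are hypothetical, and the actual Icefall recipe may average differently (for example, over a range of epochs with its own bookkeeping).

```python
from typing import Dict, List

# A "checkpoint" here is a hypothetical stand-in: parameter name -> flat list of floats.
Params = Dict[str, List[float]]


def average_checkpoints(checkpoints: List[Params]) -> Params:
    """Return the element-wise mean of every parameter across checkpoints."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    names = checkpoints[0].keys()
    for ckpt in checkpoints:
        if ckpt.keys() != names:
            raise ValueError("checkpoints must share the same parameter names")
    averaged: Params = {}
    for name in names:
        # zip(*) groups the i-th value of this parameter from every checkpoint.
        columns = zip(*(ckpt[name] for ckpt in checkpoints))
        averaged[name] = [sum(col) / len(checkpoints) for col in columns]
    return averaged


# Toy example with two checkpoints (values are made up for illustration).
ckpts = [
    {"joiner.weight": [1.0, 2.0], "joiner.bias": [0.0]},
    {"joiner.weight": [3.0, 4.0], "joiner.bias": [1.0]},
]
print(average_checkpoints(ckpts))
# {'joiner.weight': [2.0, 3.0], 'joiner.bias': [0.5]}
```

Averaging several late checkpoints tends to smooth out noise from individual training steps, which is one common reason the averaged model is the one that gets evaluated.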

Core Capabilities

  • Achieves 7.15% overall CER on the dev set and 7.22% on the test set (Chinese and English combined) using modified beam search with the averaged model
  • Handles both Chinese (CER) and English (WER) recognition tasks
  • Chinese performance: 6.35% CER on dev set, 6.50% CER on test set
  • English performance: 18.95% WER on dev set, 18.70% WER on test set
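The figures above mix character-level scoring for Chinese with word-level scoring for English. As a rough sketch of how such a mixed error rate can be computed, the code below splits text into Chinese characters and English words and divides the edit distance by the reference length. The tokenization rule and the names `tokenize_mixed` and `error_rate` are assumptions for illustration and may not match the scoring script used to produce the reported numbers.

```python
import re
from typing import List


def tokenize_mixed(text: str) -> List[str]:
    """Split into single CJK characters and lowercase English words (assumed rule)."""
    return re.findall(r"[\u4e00-\u9fff]|[a-z']+", text.lower())


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Levenshtein distance over tokens (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rate(ref_text: str, hyp_text: str) -> float:
    """Edit distance divided by the number of reference tokens."""
    ref = tokenize_mixed(ref_text)
    hyp = tokenize_mixed(hyp_text)
    if not ref:
        raise ValueError("reference must contain at least one token")
    return edit_distance(ref, hyp) / len(ref)


# One substituted English word out of five reference tokens.
print(error_rate("我想学 machine learning", "我想学 machine learn"))  # 0.2
```

Restricting the token lists to only Chinese characters or only English words before scoring would give per-language CER and WER figures of the kind listed above.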

Frequently Asked Questions

Q: What makes this model unique?

The model combines a pruned transducer architecture with a stateless decoder, which gives efficient inference for both Chinese and English speech recognition. It reaches 7.22% overall CER on the test set and supports greedy search, modified beam search, and fast beam search.

Q: What are the recommended use cases?

This model is suited to far-field speech recognition applications that need bilingual (Chinese-English) support, particularly where both accuracy and inference efficiency matter.
