TeleSpeech-ASR1.0

Tele-AI

A multi-dialect speech recognition model pre-trained on 300K hours of unlabeled audio data, supporting 30 Chinese dialects including Cantonese, Shanghainese, and Sichuanese.

License: Apache-2.0
Base Model Size: 0.09B parameters
Large Model Size: 0.3B parameters
Training Data: 300K hours unlabeled + labeled data for 30 dialects

What is TeleSpeech-ASR1.0?

TeleSpeech-ASR1.0 is a multi-dialect speech recognition model developed by Tele-AI. It addresses a common limitation of Chinese dialect recognition, where each model typically handles only one dialect, by covering many dialects in a single model. The model is pre-trained on 300,000 hours of unlabeled multi-dialect speech data and fine-tuned on labeled data covering 30 dialects.

Implementation Details

The model is released in three variants: two pre-trained models (base and large) and one fine-tuned model. The base model contains 0.09B parameters, and the large model contains 0.3B parameters. The fine-tuned version is trained on the KeSpeech dataset, which covers 8 major Chinese dialects.

  • Pre-trained base model: 0.09B parameters for feature extraction
  • Pre-trained large model: 0.3B parameters for higher-capacity feature extraction
  • Fine-tuned KeSpeech model: Optimized for practical dialect recognition
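To put the two parameter counts in perspective, here is a minimal sketch that estimates how much memory the weights alone would occupy. It assumes the rounded counts stated above (0.09B and 0.3B) and standard storage widths of 4 bytes per parameter for fp32 and 2 bytes for fp16. The helper name `weight_memory_mb` is hypothetical and is not part of any TeleSpeech tooling.

```python
# Rounded parameter counts as stated in the text above.
PARAM_COUNTS = {"base": 90_000_000, "large": 300_000_000}

# Bytes per parameter for common numeric formats (general facts, not model-specific).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weight_memory_mb(num_params: int, bytes_per_param: int) -> float:
    """Approximate size of the weights alone, in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bytes_per_param / 1_000_000


for name, num_params in PARAM_COUNTS.items():
    for dtype, width in BYTES_PER_PARAM.items():
        print(f"{name:>5} {dtype}: ~{weight_memory_mb(num_params, width):.0f} MB")
```

These figures cover the weights only. Activations, decoding state, and framework overhead add to the actual footprint, so treat the numbers as a lower bound.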

Core Capabilities

  • Multi-dialect recognition spanning 30 Chinese dialects
  • Support for major dialects including Cantonese, Shanghai, Sichuan, and Wenzhou
  • Character Error Rate (CER) as low as 4.0% on Aishell-1
  • Results reported across several test sets, including WenetSpeech and Babel
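Character Error Rate is the character-level edit distance between the reference transcript and the recognizer output, divided by the number of reference characters. The following is a minimal sketch of that standard definition, not the evaluation script used for the reported numbers. The function names and the sample sentences are illustrative assumptions, and real evaluations usually apply text normalization (punctuation, numerals) that is omitted here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level CER: total character errors / total reference characters."""
    total_errors = 0
    total_chars = 0
    for ref, hyp in zip(references, hypotheses):
        ref_chars = list(ref.replace(" ", ""))  # spaces are ignored in this sketch
        hyp_chars = list(hyp.replace(" ", ""))
        total_errors += edit_distance(ref_chars, hyp_chars)
        total_chars += len(ref_chars)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    return total_errors / total_chars


# Illustrative sentences (made up): one substituted character in 11 reference characters.
refs = ["今天天气很好", "我们去吃饭"]
hyps = ["今天天汽很好", "我们去吃饭"]
print(f"CER = {cer(refs, hyps):.1%}")  # prints "CER = 9.1%"
```

Under this definition, a CER of 4.0% corresponds to roughly 4 character errors per 100 reference characters.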

Frequently Asked Questions

Q: What makes this model unique?

A single model handles multiple dialects, which sets it apart from traditional single-dialect ASR systems. Pre-training on 300K hours of unlabeled data gives it a strong foundation for feature extraction.

Q: What are the recommended use cases?

The model suits applications that need multi-dialect Chinese speech recognition, particularly where regional dialect variation is common. It can be used for both academic research and commercial applications, though commercial use requires specific licensing.
