wav2vec2-large-xlsr-53-th-cv8-newmm

wannaphong

Thai speech recognition model based on wav2vec2, fine-tuned on the CommonVoice V8 dataset. It achieves 12.58% WER with the newmm tokenizer and a language model.

Property    Value
License     Apache 2.0
Paper       Thai Wav2Vec2.0 with CommonVoice V8
Language    Thai
Framework   PyTorch

What is wav2vec2-large-xlsr-53-th-cv8-newmm?

This is a Thai automatic speech recognition (ASR) model fine-tuned from wav2vec2-large-xlsr-53 on the Thai CommonVoice V8 dataset, replacing the V7 data used by earlier versions. Transcripts are segmented into words with the newmm tokenizer, and decoding is combined with a language model. With both in place, the model reaches 12.58% WER on the CV8 test set.

Implementation Details

The model is built on Facebook's wav2vec2-large-xlsr-53 architecture and differs from earlier versions in the following ways:

  • Pre-tokenization using pythainlp.tokenize.word_tokenize
  • Training on the CommonVoice V8 dataset instead of V7
  • Revised training procedure with bug fixes relative to the original implementation
  • Decoding combined with a language model for better accuracy
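The pre-tokenization step listed above can be sketched as follows. This is a minimal illustration, not the author's training script: the helper names `newmm_tokenize` and `pretokenize` are hypothetical, and joining the tokens with single spaces is an assumption about how word boundaries are marked before CTC fine-tuning. Only `pythainlp.tokenize.word_tokenize` comes from the source.

```python
from typing import Callable, List


def newmm_tokenize(text: str) -> List[str]:
    """Segment Thai text with PyThaiNLP's newmm engine.

    The import is done lazily so the module loads even when
    pythainlp is not installed (pip install pythainlp).
    """
    from pythainlp.tokenize import word_tokenize

    return word_tokenize(text, engine="newmm", keep_whitespace=False)


def pretokenize(
    text: str,
    tokenize: Callable[[str], List[str]] = newmm_tokenize,
) -> str:
    """Return the transcript with word tokens separated by single spaces.

    Assumption: the acoustic model is trained on space-delimited words,
    so Thai text (normally written without spaces) is segmented first.
    """
    tokens = [tok.strip() for tok in tokenize(text)]
    # Drop tokens that were pure whitespace, then join with one space.
    return " ".join(tok for tok in tokens if tok)
```

Passing the tokenizer as a parameter makes it easy to swap in a different segmentation engine and compare results, since word-level metrics for Thai depend on where the word boundaries are placed.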

Core Capabilities

  • Achieves 12.58% WER (Word Error Rate) with the newmm tokenizer on the CV8 test set
  • Reaches 3.27% CER (Character Error Rate)
  • Supports speech recognition for the Thai language
  • Performs better than previous versions on both the CV7 and CV8 test sets
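For readers unfamiliar with the two metrics above, the sketch below shows how WER and CER are commonly computed from an edit distance. It is an illustration only: the source does not state the exact scoring script, and dropping spaces before computing CER is an assumption made here. The function names are hypothetical.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(ref_words: Sequence[str], hyp_words: Sequence[str]) -> float:
    """Word error rate: word-level edits divided by reference length."""
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


def cer(ref_text: str, hyp_text: str) -> float:
    """Character error rate, ignoring spaces (an assumption of this sketch)."""
    ref_chars = list(ref_text.replace(" ", ""))
    hyp_chars = list(hyp_text.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)
```

Because Thai is written without spaces, the word lists passed to `wer` depend on the tokenizer used, which is why the WER figure is reported together with the newmm tokenizer. CER does not depend on segmentation in the same way.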

Frequently Asked Questions

Q: What makes this model unique?

Compared with previous versions, this model achieves lower WER and CER by combining the wav2vec2 architecture with newmm tokenization and a language model. It is fine-tuned specifically for Thai speech.

Q: What are the recommended use cases?

The model is intended for Thai speech recognition, particularly in applications that need high-accuracy transcription of Thai speech. It is suitable for both academic research and practical speech-to-text conversion of Thai-language content.
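A transcription call might look like the sketch below, assuming the model is published on the Hugging Face Hub under the author and model name shown at the top of this page. The Hub ID, the `transcribe` helper, and the audio file path are assumptions for illustration. The code uses the generic `transformers` speech recognition pipeline, which needs `ffmpeg` to decode audio files, and it has not been checked against the author's own usage instructions.

```python
# Assumed Hub ID, built from the author and model name on this page.
MODEL_ID = "wannaphong/wav2vec2-large-xlsr-53-th-cv8-newmm"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file and return the recognized text.

    transformers is imported lazily so this module can be loaded
    without it (pip install transformers torch).
    """
    from transformers import pipeline

    # The pipeline loads the model and its processor, decodes the file,
    # and resamples it to the rate the feature extractor expects.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)
    return result["text"]
```

For example, `transcribe("sample_th.wav")` would return the Thai transcript of a hypothetical local file. Whether the language model is used during decoding depends on the files shipped with the model and on optional decoder packages being installed.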
