asr-wav2vec2-dvoice-darija

speechbrain

An automatic speech recognition (ASR) model for Darija (Moroccan Arabic), built on wav2vec 2.0. It reaches 18.28% WER on its test set and uses a CTC/Attention architecture with unigram tokenization.

Property        Value
Model Type      Speech Recognition (ASR)
Architecture    wav2vec 2.0 + CTC/Attention
Performance     18.28% WER (Test), 5.85% CER (Test)
Source          HuggingFace
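
To make the WER and CER figures concrete, here is a minimal sketch of how such scores are commonly computed: the edit distance between reference and hypothesis, divided by the reference length, at word level for WER and character level for CER. The function names are illustrative, and counting spaces as characters in CER is an assumption; the model's own evaluation script may normalize text differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance over the reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate; spaces are counted as characters here (an assumption)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)
```

Corpus-level scores are usually obtained by summing the edit distances over all utterances and dividing by the total reference length, rather than by averaging per-utterance rates.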

What is asr-wav2vec2-dvoice-darija?

This is an automatic speech recognition model built for Darija, the Moroccan Arabic dialect, and developed as part of the DVoice initiative. It combines Facebook's wav2vec 2.0 architecture with CTC/Attention mechanisms and is trained on the DVoice Darija dataset. Its target is a low-resource African language.

Implementation Details

The model has two main components: a unigram tokenizer that converts words into subword units, and an acoustic model based on wav2vec 2.0. The acoustic model starts from the pretrained facebook/wav2vec2-large-xlsr-53 checkpoint, adds two DNN layers on top, and is fine-tuned on Darija speech data.

  • Expects 16 kHz, single-channel audio input
  • Normalizes audio automatically when needed
  • Implements CTC greedy decoder for inference
  • Built using the SpeechBrain framework
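
The CTC greedy decoding step listed above can be sketched in a few lines: take the highest-scoring token at each frame, merge consecutive repeats, then drop the blank symbol. This is a generic illustration of the technique, not SpeechBrain's implementation; the function name and the choice of index 0 for the blank are assumptions.

```python
def ctc_greedy_decode(frame_scores, blank_id=0):
    """Greedy CTC decoding over per-frame token scores.

    frame_scores: list of frames, each a list of scores (one per token id).
    blank_id: index of the CTC blank symbol (assumed to be 0 here).
    Returns the decoded token ids.
    """
    # 1. Best path: pick the highest-scoring token in every frame.
    best_path = [max(range(len(frame)), key=frame.__getitem__)
                 for frame in frame_scores]

    # 2. Merge consecutive repeats, then 3. remove blanks.
    decoded = []
    previous = None
    for token in best_path:
        if token != previous and token != blank_id:
            decoded.append(token)
        previous = token
    return decoded
```

A blank between two identical tokens keeps them separate, so a best path of 1, 1, blank, 1, 2, 2, blank decodes to 1, 1, 2. The resulting subword ids would then be mapped back to text by the unigram tokenizer.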

Core Capabilities

  • Direct transcription of Darija speech to text
  • Achieves 18.28% Word Error Rate on test data
  • Supports GPU inference for faster processing
  • Handles automatic audio preprocessing
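
A minimal sketch of how transcription might look with SpeechBrain. It assumes the checkpoint is available on the HuggingFace Hub as speechbrain/asr-wav2vec2-dvoice-darija and loads through SpeechBrain's EncoderASR interface; the helper names, the save directory, and the audio path are hypothetical. Check the model's HuggingFace page for the exact loading instructions.

```python
MODEL_SOURCE = "speechbrain/asr-wav2vec2-dvoice-darija"        # assumed Hub id
SAVE_DIR = "pretrained_models/asr-wav2vec2-dvoice-darija"      # hypothetical cache dir


def build_run_opts(use_gpu: bool) -> dict:
    """Options passed to from_hparams to select the inference device."""
    return {"device": "cuda" if use_gpu else "cpu"}


def load_darija_asr(use_gpu: bool = False):
    """Load the pretrained model (downloads the checkpoint on first use)."""
    # Imported lazily so the helpers above work without SpeechBrain installed.
    try:
        from speechbrain.inference.ASR import EncoderASR   # SpeechBrain 1.0+
    except ImportError:
        from speechbrain.pretrained import EncoderASR      # older releases
    return EncoderASR.from_hparams(
        source=MODEL_SOURCE,
        savedir=SAVE_DIR,
        run_opts=build_run_opts(use_gpu),
    )


def transcribe(audio_path: str, use_gpu: bool = False) -> str:
    """Transcribe one audio file; the path is supplied by the caller."""
    model = load_darija_asr(use_gpu)
    return model.transcribe_file(audio_path)
```

Calling transcribe("my_recording.wav", use_gpu=True), with a hypothetical file name, would run inference on the GPU. For many files, load the model once with load_darija_asr and reuse it instead of reloading per call.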

Frequently Asked Questions

Q: What makes this model unique?

This model is built specifically for Darija, a traditionally under-resourced language. It is part of the DVoice initiative, which aims to improve voice technology access for African languages. It starts from a pretrained wav2vec 2.0 checkpoint (facebook/wav2vec2-large-xlsr-53) and fine-tunes it with CTC/Attention on Darija speech, rather than training an acoustic model from scratch.

Q: What are the recommended use cases?

The model suits applications that transcribe Darija speech, such as voice assistants and transcription services, and more generally any speech-to-text pipeline that must handle the Moroccan Arabic dialect.
