wav2vec2-FR-3K-large

Maintained by: LeBenchmark

Property         Value
Parameter Count  317M
License          Apache 2.0
Training Data    2.9K hours of French speech
Paper            LeBenchmark 2.0 Paper

What is wav2vec2-FR-3K-large?

wav2vec2-FR-3K-large is a large wav2vec2 speech representation model developed by LeBenchmark and pre-trained on 2.9K hours of French speech. By speaker gender, the training audio breaks down into 1.8K hours of male speech, 1.0K hours of female speech, and 0.1K hours of unknown gender, so male speakers account for roughly 62% of the data.

Implementation Details

This model uses the wav2vec2 large architecture and is pre-trained with self-supervised learning, so it learns speech representations from unlabeled audio. The weights are stored as F32 tensors and can be loaded through the Transformers library with PyTorch or JAX.
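As a rough sizing aid, the parameter count and the F32 tensor type together give the memory needed just to hold the weights. The sketch below uses the rounded 317M figure from this page; the helper name `weight_memory_bytes` is illustrative, and the result excludes activations, gradients, and optimizer state.

```python
# Back-of-the-envelope weight memory for a 317M-parameter model stored as F32.
# NUM_PARAMS is the rounded figure from this page, so the result is approximate.
BYTES_PER_F32 = 4
NUM_PARAMS = 317_000_000


def weight_memory_bytes(num_params: int, bytes_per_param: int = BYTES_PER_F32) -> int:
    """Return the bytes needed to store the weights alone."""
    return num_params * bytes_per_param


f32_bytes = weight_memory_bytes(NUM_PARAMS)
print(f"{f32_bytes / 1e9:.2f} GB")     # 1.27 GB
print(f"{f32_bytes / 2**30:.2f} GiB")  # 1.18 GiB
```

Fine-tuning needs several times this amount, since gradients and optimizer state are kept alongside the weights.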

  • Architecture: Large wav2vec2 model with 317M parameters
  • Training Data: 2.9K hours of French speech
  • Framework Support: PyTorch, JAX, and Transformers integration
  • Feature Extraction: Speech representations learned from French audio
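To make the architecture more concrete, the sketch below computes how many feature frames the encoder emits for a given number of audio samples. It assumes this checkpoint keeps the standard wav2vec 2.0 convolutional front end (seven layers with kernels 10, 3, 3, 3, 3, 2, 2 and strides 5, 2, 2, 2, 2, 2, 2) and 16 kHz input; neither is stated on this page, so check the model configuration before relying on the numbers.

```python
# Frame-count arithmetic for the standard wav2vec 2.0 convolutional front end.
# ASSUMPTION: this checkpoint uses the default kernels/strides and 16 kHz audio.
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)
SAMPLE_RATE = 16_000


def num_output_frames(num_samples: int) -> int:
    """Number of feature frames produced for `num_samples` input samples."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        if length < kernel:
            return 0  # input too short for this layer
        # Unpadded 1-D convolution output length.
        length = (length - kernel) // stride + 1
    return length


print(num_output_frames(SAMPLE_RATE))       # 49 frames for 1 second
print(num_output_frames(10 * SAMPLE_RATE))  # 499 frames for 10 seconds
print(num_output_frames(400))               # 1 frame: 400 samples is the receptive field
```

The strides multiply to 320 samples, so under these assumptions each frame advances by 20 ms of audio.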

Core Capabilities

  • Self-supervised speech representation learning
  • Integration with SpeechBrain for ASR, Speaker Recognition, and Source Separation
  • Fine-tuning support for specific downstream tasks
  • On-the-fly feature extraction with frozen encoder
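The frozen-encoder feature extraction listed above can be sketched with the Transformers `Wav2Vec2Model` class. This is a minimal sketch, not LeBenchmark's or SpeechBrain's own recipe: the Hub id `LeBenchmark/wav2vec2-FR-3K-large` is inferred from the maintainer and model name on this page, the helper names are illustrative, and the per-utterance normalization and 16 kHz mono input are assumptions.

```python
import torch
from transformers import Wav2Vec2Model

# ASSUMPTION: Hub id inferred from the maintainer and model name on this page.
MODEL_ID = "LeBenchmark/wav2vec2-FR-3K-large"


def freeze(model: torch.nn.Module) -> torch.nn.Module:
    """Put the model in eval mode and disable gradients for all weights."""
    model.eval()
    for param in model.parameters():
        param.requires_grad_(False)
    return model


def load_frozen_encoder(model_id: str = MODEL_ID) -> torch.nn.Module:
    """Download the checkpoint and return it as a frozen encoder."""
    return freeze(Wav2Vec2Model.from_pretrained(model_id))


def extract_features(model: torch.nn.Module, waveform: torch.Tensor) -> torch.Tensor:
    """Return frame-level features of shape (frames, hidden_size).

    `waveform` is a 1-D float tensor of mono samples (assumed 16 kHz).
    """
    # ASSUMPTION: zero-mean, unit-variance normalization per utterance.
    normalized = (waveform - waveform.mean()) / (waveform.std() + 1e-7)
    with torch.no_grad():  # no graph is built, so the encoder stays frozen
        outputs = model(normalized.unsqueeze(0))  # add a batch dimension
    return outputs.last_hidden_state.squeeze(0)
```

Calling `load_frozen_encoder()` once and then `extract_features(encoder, waveform)` per utterance yields features on the fly; batching utterances of different lengths would additionally require padding and an attention mask.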

Frequently Asked Questions

Q: What makes this model unique?

This model is part of LeBenchmark's French speech processing effort. Its distinguishing feature is self-supervised pre-training on 2.9K hours of French speech, which makes it a natural starting point for French-language speech tasks.

Q: What are the recommended use cases?

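The model is intended for downstream tasks such as automatic speech recognition (ASR), speaker recognition, and source separation. As a self-supervised encoder it outputs speech representations rather than transcripts or labels, so it is used either as a frozen feature extractor or fine-tuned for a specific downstream task with a framework such as SpeechBrain.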
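The choice between a frozen feature extractor and full fine-tuning can be illustrated with a small PyTorch wrapper. Everything here is hypothetical: `ToyEncoder` is a stand-in that only mimics the (batch, samples) to (batch, frames, hidden) shape contract of a speech encoder, and `UtteranceClassifier` is an illustrative mean-pooling head, not a SpeechBrain or LeBenchmark component.

```python
import torch
from torch import nn


class ToyEncoder(nn.Module):
    """Stand-in encoder: maps (batch, samples) to (batch, frames, hidden)."""

    def __init__(self, frame_size: int = 320, hidden_size: int = 16):
        super().__init__()
        self.frame_size = frame_size
        self.proj = nn.Linear(frame_size, hidden_size)

    def forward(self, wav: torch.Tensor) -> torch.Tensor:
        # Cut the waveform into non-overlapping frames, then project each one.
        frames = wav.unfold(1, self.frame_size, self.frame_size)
        return self.proj(frames)


class UtteranceClassifier(nn.Module):
    """Mean-pools encoder frames and applies a linear classification head."""

    def __init__(self, encoder: nn.Module, hidden_size: int,
                 num_classes: int, freeze_encoder: bool = True):
        super().__init__()
        self.encoder = encoder
        self.freeze_encoder = freeze_encoder
        self.head = nn.Linear(hidden_size, num_classes)
        # Frozen: only the head trains. Fine-tuning: the encoder trains too.
        for param in self.encoder.parameters():
            param.requires_grad_(not freeze_encoder)

    def forward(self, wav: torch.Tensor) -> torch.Tensor:
        if self.freeze_encoder:
            with torch.no_grad():
                frames = self.encoder(wav)
        else:
            frames = self.encoder(wav)
        return self.head(frames.mean(dim=1))  # pool over the frame axis


def trainable_parameters(model: nn.Module) -> int:
    """Count the parameters that an optimizer would update."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


probe = UtteranceClassifier(ToyEncoder(), hidden_size=16, num_classes=3)
finetune = UtteranceClassifier(ToyEncoder(), hidden_size=16, num_classes=3,
                               freeze_encoder=False)
print(trainable_parameters(probe), trainable_parameters(finetune))
```

With a real encoder, the frozen variant trains far fewer parameters and needs less memory, while fine-tuning usually adapts better to the target task at a higher compute cost. A frozen encoder should also be kept in eval mode so that dropout stays disabled.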
