wav2vec2-FR-3K-large
Property | Value |
---|---|
Parameter Count | 317M |
License | Apache 2.0 |
Training Data | 2.9K hours of French speech |
Paper | LeBenchmark 2.0 Paper |
What is wav2vec2-FR-3K-large?
wav2vec2-FR-3K-large is a large wav2vec2 speech representation model developed by LeBenchmark and pre-trained on 2.9K hours of French speech. By speaker gender, the training audio breaks down into roughly 1.8K hours of male speech, 1.0K hours of female speech, and 0.1K hours of unknown gender, so male speakers account for well over half of the data.
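As a quick arithmetic check on the speaker breakdown, the sketch below assumes the three figures in parentheses are rounded thousands of hours (consistent with their sum matching the 2.9K total) and computes each group's share. The names `HOURS_BY_GROUP` and `group_shares` are illustrative only.

```python
# Hours of training speech per speaker group, read from the figures above.
# Assumption: "1.8K / 1.0K / 0.1K" are rounded thousands of hours.
HOURS_BY_GROUP = {"male": 1800, "female": 1000, "unknown": 100}


def group_shares(hours_by_group):
    """Return each group's percentage share of the total, rounded to 0.1."""
    total = sum(hours_by_group.values())
    return {
        group: round(100 * hours / total, 1)
        for group, hours in hours_by_group.items()
    }


total_hours = sum(HOURS_BY_GROUP.values())
shares = group_shares(HOURS_BY_GROUP)
print(total_hours)  # 2900
print(shares)       # {'male': 62.1, 'female': 34.5, 'unknown': 3.4}
```

Under that assumption, male speech makes up about 62% of the training audio and female speech about 34%.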
Implementation Details
This model implements the wav2vec2 large architecture and learns speech representations through self-supervised training. The checkpoint can be loaded through the Transformers library in PyTorch or JAX, and its weights are stored as F32 tensors.
- Architecture: Large wav2vec2 model with 317M parameters
- Training Data: 2.9K hours of French speech
- Framework Support: PyTorch and JAX, via the Transformers library
- Feature Extraction: Produces speech representations that downstream French speech models can consume
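To give a sense of what the encoder produces, the sketch below computes how many feature frames come out of a raw waveform. It assumes the standard wav2vec 2.0 convolutional front end (seven unpadded layers with the kernel sizes and strides listed) and 16 kHz input. Neither is stated above, so check both against the checkpoint's configuration before relying on the numbers.

```python
# Assumed front-end layout: the standard wav2vec 2.0 feature encoder.
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)
SAMPLE_RATE = 16_000  # assumed input sampling rate in Hz


def num_frames(num_samples):
    """Number of feature frames produced for a waveform of `num_samples`."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        # Output length of an unpadded 1-D convolution, clamped at zero
        # for inputs shorter than the receptive field.
        length = max(0, (length - kernel) // stride + 1)
    return length


def total_stride():
    """Hop between consecutive frames, in samples (product of all strides)."""
    hop = 1
    for stride in CONV_STRIDES:
        hop *= stride
    return hop


print(num_frames(SAMPLE_RATE))              # 49 frames for one second of audio
print(1000 * total_stride() / SAMPLE_RATE)  # 20.0 ms between frames
```

With these assumptions, one second of audio yields 49 frames, each covering a 400-sample (25 ms) window.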
Core Capabilities
- Self-supervised speech representation learning
- Integration with SpeechBrain for ASR, Speaker Recognition, and Source Separation
- Fine-tuning support for specific downstream tasks
- On-the-fly feature extraction with a frozen encoder
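Feature extraction with a frozen encoder can be sketched with the Transformers library as follows. This is a minimal example under several assumptions that are not stated above: the Hub identifier `LeBenchmark/wav2vec2-FR-3K-large`, 16 kHz mono input, and the feature extractor settings, which are constructed by hand here rather than read from the checkpoint.

```python
MODEL_ID = "LeBenchmark/wav2vec2-FR-3K-large"  # assumed Hub identifier
SAMPLE_RATE = 16_000  # assumed input sampling rate in Hz


def extract_features(waveform, model_id=MODEL_ID):
    """Return frame-level features for one mono waveform (1-D float array)."""
    # Imported here so the heavy dependencies load only when needed.
    import torch
    from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

    # Built by hand; these preprocessing settings are assumptions.
    extractor = Wav2Vec2FeatureExtractor(
        feature_size=1,
        sampling_rate=SAMPLE_RATE,
        padding_value=0.0,
        do_normalize=True,
        return_attention_mask=True,
    )
    model = Wav2Vec2Model.from_pretrained(model_id)
    model.eval()  # inference mode: disables dropout

    inputs = extractor(waveform, sampling_rate=SAMPLE_RATE, return_tensors="pt")
    with torch.no_grad():  # frozen encoder: no gradients are tracked
        outputs = model(**inputs)
    # Tensor of shape (1, num_frames, hidden_size)
    return outputs.last_hidden_state
```

Calling `extract_features(waveform)` with a 1-D NumPy array of audio samples downloads the checkpoint on first use and returns one feature vector per frame, which a downstream model can consume without updating the encoder.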
Frequently Asked Questions
Q: What makes this model unique?
This model is part of LeBenchmark's French speech processing framework. It is pre-trained specifically on French, using 2.9K hours of speech from male, female, and unknown-gender speakers, which sets it apart from general-purpose speech models.
Q: What are the recommended use cases?
The model is intended for automatic speech recognition (ASR), speaker recognition, and source separation. It can be used as a frozen feature extractor or fine-tuned together with a downstream model, for example in SpeechBrain.
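The two usage modes differ mainly in whether the encoder's weights receive gradient updates. Below is a minimal sketch of that switch, assuming a PyTorch-style model whose `parameters()` method yields objects with a `requires_grad` attribute. `set_trainable` and `count_trainable` are hypothetical helpers, not part of SpeechBrain or Transformers.

```python
def set_trainable(module, trainable):
    """Turn gradient updates on or off for every parameter of `module`.

    Works with any object exposing a PyTorch-style `parameters()` iterator.
    Returns the number of parameter tensors that were touched.
    """
    touched = 0
    for param in module.parameters():
        param.requires_grad = trainable
        touched += 1
    return touched


def count_trainable(module):
    """Count parameter tensors that will still receive gradient updates."""
    return sum(1 for param in module.parameters() if param.requires_grad)
```

Calling `set_trainable(encoder, False)` before training corresponds to the feature-extractor mode, where only the downstream model learns. Leaving the encoder trainable and passing its parameters to the optimizer alongside the downstream model corresponds to fine-tuning, which costs more memory and compute.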