wav2vec2-FR-3K-large

Property	Value
Parameter Count	317M
License	Apache 2.0
Training Data	2.9K hours French speech
Paper	LeBenchmark 2.0 Paper

What is wav2vec2-FR-3K-large?

wav2vec2-FR-3K-large is a self-supervised speech representation model from the LeBenchmark project, pre-trained on 2.9K hours of French speech. By speaker gender, the data breaks down into 1.8K hours of male speech, 1.0K hours of female speech, and 0.1K hours of unknown gender, so male speech makes up roughly 62% of the total.

Implementation Details

This model uses the wav2vec 2.0 large architecture and learns speech representations through self-supervised pre-training on unlabeled audio. The weights are stored as F32 tensors and can be loaded through the Hugging Face Transformers library, with PyTorch and JAX listed as supported frameworks.

Architecture: Large wav2vec2 model with 317M parameters
Training Data: 2.9K hours of French speech
Framework Support: PyTorch and JAX, through the Transformers library
Feature Extraction: Representations learned from French speech only
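
As a rough sizing aid, the figures above can be turned into two back-of-the-envelope numbers: the memory needed to hold the F32 weights, and the number of feature frames the encoder emits for a clip of a given length. This is a minimal sketch, not taken from the model's own documentation. The convolutional kernel and stride values are an assumption (the standard wav2vec 2.0 feature-encoder configuration), as is the 16 kHz input sampling rate; the helper names are hypothetical.

```python
# Parameter count from the property table; F32 means 4 bytes per parameter.
PARAMS = 317_000_000
BYTES_PER_F32 = 4

# ASSUMPTION: standard wav2vec 2.0 convolutional feature encoder
# (7 layers). Check the model's config before relying on these values.
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)


def weight_memory_gib(params: int = PARAMS, bytes_per_param: int = BYTES_PER_F32) -> float:
    """Memory needed for the weights alone (no activations, no optimizer state)."""
    return params * bytes_per_param / 2**30


def num_frames(num_samples: int) -> int:
    """Number of encoder frames produced for a raw waveform of num_samples."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        # Output length of an unpadded 1-D convolution.
        length = (length - kernel) // stride + 1
    return length
```

Under these assumptions, `weight_memory_gib()` comes to about 1.18 GiB, and `num_frames(16000)` returns 49, that is, roughly one frame every 20 ms for one second of 16 kHz audio.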

Core Capabilities

Self-supervised speech representation learning
Integration with SpeechBrain for ASR, Speaker Recognition, and Source Separation
Fine-tuning support for specific downstream tasks
On-the-fly feature extraction with frozen encoder
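
The frozen-encoder feature extraction listed above can be sketched with the Transformers library as follows. This is an illustrative sketch under stated assumptions, not the LeBenchmark authors' recipe: the Hub identifier `LeBenchmark/wav2vec2-FR-3K-large`, the 16 kHz mono input, and the per-utterance normalization step are assumptions, and the helper names `freeze`, `load_frozen_encoder`, and `extract_features` are hypothetical.

```python
import torch
from transformers import Wav2Vec2Model

# ASSUMPTION: Hugging Face Hub identifier; not stated in this document.
MODEL_ID = "LeBenchmark/wav2vec2-FR-3K-large"


def freeze(model: torch.nn.Module) -> torch.nn.Module:
    """Put the model in eval mode and disable gradients for every weight."""
    model.eval()
    for param in model.parameters():
        param.requires_grad_(False)
    return model


def load_frozen_encoder(model_id: str = MODEL_ID) -> Wav2Vec2Model:
    """Download the pre-trained encoder and freeze it."""
    return freeze(Wav2Vec2Model.from_pretrained(model_id))


def extract_features(model: Wav2Vec2Model, waveform: torch.Tensor) -> torch.Tensor:
    """Map raw audio to frame-level features.

    waveform: float tensor of shape (batch, samples), assumed mono 16 kHz.
    Returns a tensor of shape (batch, frames, hidden_size).
    """
    # ASSUMPTION: zero-mean, unit-variance normalization per utterance.
    mean = waveform.mean(dim=-1, keepdim=True)
    var = waveform.var(dim=-1, keepdim=True, unbiased=False)
    waveform = (waveform - mean) / torch.sqrt(var + 1e-7)
    with torch.no_grad():  # frozen encoder: no gradients needed
        return model(waveform).last_hidden_state
```

A typical call would be `features = extract_features(load_frozen_encoder(), audio)`, where `audio` is a batch of equal-length waveforms. Batches of different-length clips would additionally need padding and an attention mask, which this sketch leaves out.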

Frequently Asked Questions

Q: What makes this model unique?

This model is part of LeBenchmark's family of French speech models. It is pre-trained exclusively on French speech (2.9K hours), and the gender breakdown of the training data is documented.

Q: What are the recommended use cases?

The model is intended as a pre-trained encoder for automatic speech recognition (ASR), speaker recognition, and source separation. It can be used either as a frozen feature extractor or fine-tuned on labeled data for a specific downstream task, for example with SpeechBrain.
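
To make the feature-extractor option concrete, here is a minimal sketch of a small downstream head trained on top of frozen frame-level features, for example for an utterance-level classification task. It is not part of LeBenchmark or SpeechBrain. The class name `PooledClassifier` is hypothetical, and the default feature dimension of 1024 is an assumption based on the standard wav2vec 2.0 large configuration.

```python
import torch
from torch import nn


class PooledClassifier(nn.Module):
    """Mean-pool frame features over time, then apply a linear layer."""

    def __init__(self, feature_dim: int = 1024, num_classes: int = 10):
        # ASSUMPTION: 1024 is the hidden size of the standard large model.
        super().__init__()
        self.proj = nn.Linear(feature_dim, num_classes)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, frames, feature_dim) from the frozen encoder.
        pooled = features.mean(dim=1)  # (batch, feature_dim)
        return self.proj(pooled)       # (batch, num_classes)
```

Only the head's parameters are trained in this setup, so the encoder output can be computed once per utterance and cached. Fine-tuning instead passes gradients into the encoder as well, which costs more memory and compute.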