wav2vec2-xls-r-1b

Maintained by: facebook

Wav2Vec2-XLS-R-1B

Property             Value
Author               Facebook
Parameters           1 billion
License              Apache-2.0
Paper                Research Paper
Languages Supported  128 languages

What is wav2vec2-xls-r-1b?

Wav2Vec2-XLS-R-1B is Facebook's multilingual speech model for cross-lingual speech representation learning. Built on the wav2vec 2.0 architecture, it has 1 billion parameters and was pre-trained on 436,000 hours of unlabeled speech across 128 languages.

Implementation Details

The model uses the wav2vec 2.0 framework and expects input speech sampled at 16 kHz. It was pre-trained on VoxPopuli, MLS, CommonVoice, BABEL, and VoxLingua107, so a single checkpoint can serve as the starting point for a range of speech processing tasks.

  • Pre-trained on 436K hours of unlabeled speech data
  • Supports 128 languages including both high-resource and low-resource languages
  • Trained with the wav2vec 2.0 self-supervised learning objective
  • Requires audio input sampled at 16 kHz
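
Audio recorded at another rate has to be converted to 16 kHz before it reaches the model. The sketch below illustrates the idea with plain linear interpolation; the function name `resample_linear` is hypothetical, and the method has no anti-aliasing filter, so for real data a dedicated resampler (for example `librosa.resample` or `torchaudio.functional.resample`) is likely the better choice.

```python
TARGET_RATE = 16_000  # sampling rate the model expects, in Hz


def resample_linear(samples, src_rate, dst_rate=TARGET_RATE):
    """Resample a mono signal by linear interpolation (illustrative only).

    `samples` is a sequence of floats. No low-pass filtering is applied,
    so downsampling real recordings this way can introduce aliasing.
    """
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sampling rates must be positive")
    if not samples:
        return []
    if src_rate == dst_rate:
        return list(samples)

    n_out = round(len(samples) * dst_rate / src_rate)
    step = src_rate / dst_rate  # input samples advanced per output sample
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), last)
        hi = min(lo + 1, last)  # clamp at the final sample
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


# Toy input: 48 samples at 48 Hz become 16 samples at 16 Hz.
toy = [float(i) for i in range(48)]
print(len(resample_linear(toy, src_rate=48, dst_rate=16)))  # prints 16
```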

Core Capabilities

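  • Automatic Speech Recognition (ASR): error rates reported as 20-33% lower (relative, on average) than the best prior work on BABEL, MLS, CommonVoice, and VoxPopuli
  • Speech Translation: an average improvement of 7.4 BLEU over the previous state of the art on CoVoST-2, across 21 translation directions into English
  • Language Identification: state-of-the-art results on VoxLingua107 at the time of publication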
  • Cross-lingual speech processing tasks
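
A relative error rate reduction is measured against the baseline's error rate, not in absolute percentage points. The short sketch below shows the arithmetic; the function name and the word error rates are hypothetical values chosen for illustration, not results from the paper.

```python
def relative_error_reduction(baseline_error, new_error):
    """Return the fraction by which `new_error` improves on `baseline_error`."""
    if baseline_error <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline_error - new_error) / baseline_error


# Hypothetical numbers: a baseline WER of 10.0% and a new WER of 7.5%.
# The absolute drop is 2.5 points, but the relative reduction is 25%.
baseline_wer = 10.0
improved_wer = 7.5
print(f"{relative_error_reduction(baseline_wer, improved_wer):.0%}")  # prints 25%
```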

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its scale and multilingual coverage. At the time of its release it had been trained on more publicly available speech data than earlier models, covering 128 languages, and it outperformed previous models on speech recognition, speech translation, and language identification benchmarks.

Q: What are the recommended use cases?

The model is best suited for fine-tuning on downstream tasks such as Automatic Speech Recognition, Speech Translation, and Language Identification. It is particularly useful for applications that need to process speech in many languages, including low-resource ones.
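
As one illustration of what fine-tuning for ASR involves, the sketch below builds a character-level vocabulary from training transcripts. It assumes a CTC-style setup; the function name `build_ctc_vocab` and the special tokens `|`, `[UNK]`, and `[PAD]` are illustrative choices rather than something this page specifies. A mapping of this shape is what a CTC tokenizer such as Hugging Face's `Wav2Vec2CTCTokenizer` typically consumes.

```python
def build_ctc_vocab(transcripts):
    """Map every character in the transcripts to an integer id.

    Assumes the transcripts do not already contain the "|" character,
    which is used here as a visible word delimiter in place of the space.
    """
    chars = sorted({ch for text in transcripts for ch in text.lower()})
    vocab = {ch: idx for idx, ch in enumerate(chars)}

    # Replace the space with a visible word-delimiter token, keeping its id.
    if " " in vocab:
        vocab["|"] = vocab.pop(" ")

    # Special tokens: unknown characters, and padding (the CTC blank).
    vocab["[UNK]"] = len(vocab)
    vocab["[PAD]"] = len(vocab)
    return vocab


# Hypothetical two-sentence training set.
print(build_ctc_vocab(["ab a", "ba"]))
# prints {'a': 1, 'b': 2, '|': 0, '[UNK]': 3, '[PAD]': 4}
```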
