mms-tts-yor

Maintained by: facebook

MMS-TTS-YOR: Yoruba Text-to-Speech Model

Property    Value
Developer   Facebook (Meta AI)
License     CC-BY-NC 4.0
Model Type  VITS (Variational Inference with adversarial learning for TTS)
Paper       Scaling Speech Technology to 1,000+ Languages (2023)

What is mms-tts-yor?

MMS-TTS-YOR is a text-to-speech model for the Yoruba language, developed as part of Facebook's Massively Multilingual Speech (MMS) project. It uses the VITS architecture for end-to-end speech synthesis, converting Yoruba text directly into natural-sounding speech.

Implementation Details

The model is a conditional variational autoencoder (VAE) with three main components: a posterior encoder, a decoder, and a conditional prior. A Transformer-based text encoder and a flow-based module built from coupling layers predict spectrogram-based acoustic features. A stack of transposed convolutional layers, in the style of the HiFi-GAN vocoder, then decodes those features into the final audio waveform.
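To make the idea of a coupling layer concrete, here is a minimal pure-Python sketch of an additive (shift-only) coupling step and its exact inverse. It illustrates the general technique only and is not the model's code: `shift_network` is a hypothetical stand-in for the learned network, and the function names are made up for this example.

```python
def shift_network(conditioning_half):
    """Hypothetical stand-in for the learned network that predicts a shift."""
    return [0.5 * value + 1.0 for value in conditioning_half]


def _split(vector):
    """Split an even-length vector into two equal halves."""
    if len(vector) % 2 != 0:
        raise ValueError("coupling layers here expect an even-length vector")
    half = len(vector) // 2
    return vector[:half], vector[half:]


def coupling_forward(x):
    """Leave the first half unchanged; shift the second half by a function of the first."""
    x_a, x_b = _split(x)
    shift = shift_network(x_a)
    return x_a + [b + s for b, s in zip(x_b, shift)]


def coupling_inverse(y):
    """Undo coupling_forward: the first half is intact, so the same shift can be recomputed."""
    y_a, y_b = _split(y)
    shift = shift_network(y_a)
    return y_a + [b - s for b, s in zip(y_b, shift)]
```

Because the first half passes through untouched, the inverse can recompute the same shift and subtract it, which is what makes a stack of such layers invertible.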

  • Stochastic duration predictor for varied speech rhythms
  • Flow-based module with coupling layers
  • End-to-end training with variational lower bound and adversarial losses
  • Non-deterministic output: synthesis draws random noise, so a fixed seed is required to reproduce the same waveform
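The link between the stochastic duration predictor and seed fixing can be shown with a toy sketch. The function `sample_durations`, its parameters, and the noise model below are hypothetical and heavily simplified; they are not the VITS implementation. With the real model in 🤗 Transformers, the corresponding step is calling `set_seed` before running the model.

```python
import math
import random


def sample_durations(num_tokens, seed=None, mean_log_frames=1.5, noise_scale=0.8):
    """Toy stand-in for a stochastic duration predictor.

    Each token gets a duration (in frames) of ceil(exp(log_duration)), where
    log_duration is a fixed mean plus scaled Gaussian noise.
    """
    rng = random.Random(seed)  # a fixed seed makes the noise reproducible
    durations = []
    for _ in range(num_tokens):
        log_duration = mean_log_frames + noise_scale * rng.gauss(0.0, 1.0)
        durations.append(max(1, math.ceil(math.exp(log_duration))))
    return durations
```

Calling `sample_durations(10, seed=555)` twice returns the same list, while omitting the seed generally gives a different rhythm on each call. Setting `noise_scale=0.0` removes the variation entirely.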

Core Capabilities

  • Direct text-to-speech synthesis for the Yoruba language
  • Variable speech rhythm generation
  • Spectrogram-based acoustic feature prediction
  • Available in the 🤗 Transformers library (version 4.33 and later)
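A minimal usage sketch with 🤗 Transformers follows. It assumes `torch` and `transformers` 4.33 or later are installed and that the checkpoint is published on the Hub as `facebook/mms-tts-yor`, an identifier inferred here from the model name and maintainer. The helpers `synthesize`, `float_to_pcm16`, and `write_wav` are illustrative names rather than library functions, and the seed value is arbitrary.

```python
import struct
import wave

MODEL_ID = "facebook/mms-tts-yor"  # assumed Hub identifier (maintainer/model name)


def synthesize(text, seed=555):
    """Return (samples, sampling_rate) for the given Yoruba text."""
    # Imported lazily so the WAV helpers below also work without torch installed.
    import torch
    from transformers import AutoTokenizer, VitsModel, set_seed

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = VitsModel.from_pretrained(MODEL_ID)

    inputs = tokenizer(text, return_tensors="pt")
    set_seed(seed)  # fix the random noise drawn during synthesis
    with torch.no_grad():
        waveform = model(**inputs).waveform  # shape: (batch, num_samples)
    return waveform[0].tolist(), model.config.sampling_rate


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to little-endian 16-bit PCM bytes."""
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack("<%dh" % len(ints), *ints)


def write_wav(path, samples, sampling_rate):
    """Write mono 16-bit PCM audio to a path or a binary file-like object."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sampling_rate)
        wav_file.writeframes(float_to_pcm16(samples))
```

A typical call would be `samples, rate = synthesize("some example text in the Yoruba language")` followed by `write_wav("output.wav", samples, rate)`. The first call downloads the checkpoint, so it needs network access.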

Frequently Asked Questions

Q: What makes this model unique?

This model is trained specifically for Yoruba speech synthesis. Its VITS architecture includes a stochastic duration predictor, so the same text can be spoken with different rhythms from one run to the next.

Q: What are the recommended use cases?

The model suits applications that need Yoruba text-to-speech conversion, such as accessibility tools, educational software, or automated voice systems, subject to the non-commercial terms of its CC-BY-NC 4.0 license. It is particularly useful when natural-sounding variation in speech is desired.
