lb-de-fr-en-pt-coqui-vits-tts

mbarnig

Multilingual VITS-based TTS model supporting Luxembourgish, German, French, English, and Portuguese, trained on a custom dataset with the Coqui-TTS framework.

Property	Value
License	CC-BY-NC-SA-4.0
Author	mbarnig
Supported Languages	Luxembourgish, German, French, English, Portuguese
Framework	Coqui-TTS (v0.7.1)

What is lb-de-fr-en-pt-coqui-vits-tts?

This is a multilingual text-to-speech model built on the VITS architecture and trained on a custom dataset. It synthesizes speech in five languages directly from characters: input text is encoded with a fixed character set, with no phoneme conversion step.

Implementation Details

The model was trained from scratch using the Coqui-TTS multilingual VITS recipe. It takes raw text as input and encodes it with a fixed character set covering the standard Latin letters plus the accented and special characters used by the supported languages. Training did not use phonemes, so the text is processed as plain characters rather than through a phonemizer.

Custom character set including special characters (ß, à, á, â, ã, ä, ç, è, é, ê, ë, í, î, ï, ó, ô, õ, ö, ù, ú, û, ü)
Integrated punctuation support (!'(),-.:;? )
Direct text-to-speech conversion without phoneme intermediary
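
Because the model reads characters directly, any character outside its set cannot be encoded as written. The following is a minimal sketch of a pre-synthesis check, not part of the model or of Coqui-TTS. The special characters and punctuation are copied from the list above; treating "standard Latin characters" as ASCII a-z and A-Z is an assumption, and the name unsupported_chars is hypothetical.

```python
import string
import unicodedata
from typing import List

# Special characters and punctuation as listed in this model card.
SPECIAL_CHARS = "ßàáâãäçèéêëíîïóôõöùúûü"
PUNCTUATION = "!'(),-.:;? "
# Assumption: "standard Latin characters" means ASCII a-z and A-Z.
LATIN_CHARS = string.ascii_letters

VOCAB = set(LATIN_CHARS + SPECIAL_CHARS + PUNCTUATION)


def unsupported_chars(text: str) -> List[str]:
    """Return distinct characters of `text` that are not in VOCAB,
    in order of first appearance (hypothetical helper)."""
    # Compose accents so that "e" + combining acute becomes a single "é".
    normalized = unicodedata.normalize("NFC", text)
    missing: List[str] = []
    for ch in normalized:
        if ch not in VOCAB and ch not in missing:
            missing.append(ch)
    return missing


if __name__ == "__main__":
    print(unsupported_chars("Moien, wéi geet et?"))
    print(unsupported_chars("Preis: 5 €"))
```

Under these assumptions, digits and currency symbols are reported as unsupported, so text containing them would need to be spelled out (for example "fënnef Euro") before synthesis.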

Core Capabilities

Multilingual speech synthesis in five languages
Support for various European language characters
Live inference demonstration available
TensorBoard integration for training visualization

Frequently Asked Questions

Q: What makes this model unique?

This model supports Luxembourgish alongside four widely spoken European languages, and it synthesizes speech directly from characters rather than phonemes. Its custom dataset and character set are tailored to these five languages.

Q: What are the recommended use cases?

The model suits applications that need text-to-speech in one or more of the five supported languages. It is especially useful for projects involving Luxembourgish, which few TTS systems support.