lb-de-fr-en-pt-coqui-stt-models
| Property | Value |
|---|---|
| Author | mbarnig |
| Model Type | Speech-to-Text (STT) |
| Framework | Coqui-STT v1.3.0 |
| Demo | Available on HuggingFace Space |
What is lb-de-fr-en-pt-coqui-stt-models?
This is a multilingual automatic speech recognition (ASR) model that supports Luxembourgish, and it is only the second machine learning model to do so. The model was developed with Coqui-STT version 1.3.0 and uses a custom alphabet that covers several European languages.
Implementation Details
The model was trained from scratch on a custom dataset (mbarnig/lb-2880-STT_CORPUS) with a custom alphabet that contains the standard Latin characters plus letters with diacritics (àáâäçèéëîôöûü). Training used both a small dataset (2,880 samples) and an expanded dataset (27,072 samples), and some training variants used data augmentation.
- Custom alphabet implementation for multilingual support
- Trained on both small and expanded datasets
- Integration with established language models from Coqui.ai
- Support for Luxembourgish, German, French, English, and Portuguese
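The custom alphabet described above can be illustrated with a minimal sketch. It assumes that "standard Latin characters" means lowercase a-z and that the alphabet also contains the space character; the model card does not spell this out, so the exact character set and ordering here are assumptions. The helper names `build_alphabet`, `alphabet_file_text`, and `unsupported_chars` are hypothetical. Coqui STT reads its alphabet from a text file with one character per line, which is the layout the sketch produces.

```python
import string

# Diacritics listed in the model card. Everything else in this alphabet
# (lowercase a-z and the space character) is an assumption for illustration.
DIACRITICS = "àáâäçèéëîôöûü"


def build_alphabet():
    """Return the assumed alphabet as a list of single characters."""
    return [" "] + list(string.ascii_lowercase) + list(DIACRITICS)


def alphabet_file_text(alphabet):
    """Render the alphabet with one character per line."""
    return "\n".join(alphabet) + "\n"


def unsupported_chars(transcript, alphabet):
    """Return the sorted characters of a transcript that the alphabet lacks."""
    allowed = set(alphabet)
    return sorted({ch for ch in transcript.lower() if ch not in allowed})


alphabet = build_alphabet()
print(len(alphabet), "characters in the assumed alphabet")
# Punctuation is not part of the assumed alphabet, so it is reported here.
print(unsupported_chars("Moien, wéi geet et?", alphabet))
```

A check like this is useful before training, because a transcript containing a character outside the alphabet cannot be encoded. Such characters would have to be removed or mapped to supported ones first.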
Core Capabilities
- Multilingual speech recognition across five European languages
- Specialized support for Luxembourgish language processing
- Integration with various established datasets including Common Voice, Librispeech, and others
- Live inference demonstration available via HuggingFace space
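For local inference outside the HuggingFace Space, the following is a rough sketch using the `stt` Python package from Coqui. The file names passed on the command line (an acoustic model, a scorer, and a WAV file) are placeholders, not names confirmed by the model card, and the sketch assumes 16-bit mono WAV input recorded at the model's sample rate. The helper names `read_wav_int16` and `transcribe` are hypothetical.

```python
import array
import sys
import wave


def read_wav_int16(path):
    """Read a 16-bit mono WAV file and return (sample_rate, samples)."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono WAV audio")
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        # WAV data is little-endian; swap on big-endian machines.
        samples.byteswap()
    return rate, samples


def transcribe(model_path, scorer_path, wav_path):
    """Transcribe one WAV file with a Coqui STT model and optional scorer."""
    # Imported here so the WAV helper works without these packages installed.
    import numpy as np
    from stt import Model

    model = Model(model_path)
    if scorer_path:
        # The scorer is the external language model used during decoding.
        model.enableExternalScorer(scorer_path)
    rate, samples = read_wav_int16(wav_path)
    if rate != model.sampleRate():
        raise ValueError(f"audio is {rate} Hz, model expects {model.sampleRate()} Hz")
    return model.stt(np.array(samples, dtype=np.int16))


if __name__ == "__main__" and len(sys.argv) == 4:
    # Usage: python transcribe.py <model file> <scorer file> <wav file>
    print(transcribe(sys.argv[1], sys.argv[2], sys.argv[3]))
```

The sample-rate check is deliberate: feeding audio at a different rate than the model was trained on tends to degrade recognition, so the sketch refuses to run rather than resample silently.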
Frequently Asked Questions
Q: What makes this model unique?
This model is notable for being only the second machine learning model to support Luxembourgish speech recognition, following the first model published by Prof. Peter Gilles of the University of Luxembourg in May 2022.
Q: What are the recommended use cases?
The model is suited to multilingual speech recognition applications, particularly those that need Luxembourgish alongside German, French, English, and Portuguese. It can be used for both academic research and practical applications in multilingual environments.