YarnGPT
| Property | Value |
|---|---|
| Developer | Saheedniyi |
| Model Type | Text-to-Speech (TTS) |
| Base Model | HuggingFaceTB/SmolLM2-360M |
| Language Support | Nigerian-accented English |
| GitHub Repository | YarnGPT Repository |
What is YarnGPT?
YarnGPT is a text-to-speech model designed to synthesize Nigerian-accented English speech. It relies on pure language modeling, without external adapters or complex architectures. It supports 11 distinct voices (6 male and 5 female) and is intended to generate natural, culturally relevant speech for a range of applications.
Implementation Details
The model is trained with PyTorch and operates on audio at a 24 kHz sample rate. It uses WavTokenizer to convert audio into tokens. Training used the AdamW optimizer with a linear learning-rate schedule with warmup, and ran on an A100 GPU for 5 epochs with a batch size of 4.
- Trained on Nigerian movies, podcasts, and open-source Nigerian audio data
- Implements automated preprocessing and audio resampling
- Tokenizes audio with WavTokenizer
- Features temperature and repetition penalty controls for output generation
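To make the "linear scheduling with warmup" mentioned above concrete, here is a minimal sketch of how such a schedule scales the learning rate over training. It follows the common convention of ramping linearly from 0 to the base rate during warmup and then decaying linearly to 0. The base learning rate, warmup length, and total step count are hypothetical values chosen for illustration; the source does not state the ones used for YarnGPT.

```python
def linear_warmup_decay(step: int, warmup_steps: int, total_steps: int) -> float:
    """Return a learning-rate multiplier in [0, 1] for the given optimizer step.

    The multiplier rises linearly from 0 to 1 over `warmup_steps`,
    then falls linearly back to 0 at `total_steps`.
    """
    if warmup_steps < 0 or total_steps <= warmup_steps:
        raise ValueError("require 0 <= warmup_steps < total_steps")
    if step < warmup_steps:
        # Warmup phase: fraction of the warmup completed so far.
        return step / max(1, warmup_steps)
    # Decay phase: fraction of the post-warmup steps still remaining.
    return max(0.0, (total_steps - step) / (total_steps - warmup_steps))


# Hypothetical values for illustration only (not from the YarnGPT source).
BASE_LR = 1e-4
WARMUP_STEPS = 100
TOTAL_STEPS = 1000

# Learning rate at every step from 0 through TOTAL_STEPS inclusive.
schedule = [
    BASE_LR * linear_warmup_decay(s, WARMUP_STEPS, TOTAL_STEPS)
    for s in range(TOTAL_STEPS + 1)
]
```

In a PyTorch training loop, a function like this would typically be passed to a scheduler that multiplies the optimizer's base learning rate by the returned factor after each step.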
Core Capabilities
- Generation of natural Nigerian-accented English speech
- Support for 11 distinct voice personas
- Real-time text-to-speech conversion
- Integration capabilities for news reading and content generation
- Customizable generation parameters
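The generation parameters mentioned above (temperature and repetition penalty) can be illustrated with a small, library-free sketch. It assumes the widely used convention in which a repetition penalty above 1 lowers the scores of tokens that have already been generated, and temperature divides the scores before the softmax. The function names `adjust_logits` and `softmax` are hypothetical helpers for illustration, and the source does not confirm that YarnGPT applies these controls in exactly this way.

```python
import math


def adjust_logits(logits, generated_ids, temperature=1.0, repetition_penalty=1.0):
    """Apply a repetition penalty, then temperature scaling, to raw logits."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if repetition_penalty <= 0:
        raise ValueError("repetition_penalty must be positive")
    adjusted = list(logits)
    for tok in set(generated_ids):
        # Push already-generated tokens toward lower probability:
        # shrink positive scores, make negative scores more negative.
        if adjusted[tok] > 0:
            adjusted[tok] /= repetition_penalty
        else:
            adjusted[tok] *= repetition_penalty
    # Temperature < 1 sharpens the distribution, > 1 flattens it.
    return [x / temperature for x in adjusted]


def softmax(logits):
    """Convert logits to probabilities, subtracting the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy vocabulary of three tokens; tokens 0 and 2 were already generated.
raw = [2.0, 1.0, -1.0]
scores = adjust_logits(raw, generated_ids=[0, 2], temperature=0.5, repetition_penalty=2.0)
probs = softmax(scores)
```

In this toy case the penalty removes token 0's lead over token 1, so the two end up equally likely. Lower temperatures make output more deterministic, while higher repetition penalties discourage looping on the same audio tokens.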
Frequently Asked Questions
Q: What makes this model unique?
YarnGPT stands out for its focus on Nigerian-accented English and for generating natural speech without complex external adapters. It is one of the few models trained specifically for an African variety of English.
Q: What are the recommended use cases?
The model is suited to experimental text-to-speech applications that require Nigerian-accented English, including news reading, content creation, and educational materials. It is not suitable for generating speech in languages other than English or in accents other than Nigerian English.