# YarnGPT2
| Property | Value |
|---|---|
| Developer | saheedniyi |
| Model Type | Text-to-Speech (TTS) |
| Base Model | HuggingFaceTB/SmolLM2-360M |
| Training Infrastructure | 1x A100 GPU |
| Repository | Hugging Face |
## What is YarnGPT2?

YarnGPT2 is a text-to-speech model designed to synthesize Nigerian-accented speech. It uses pure language modeling, with no external adapters or complex architectures, making it a streamlined solution for generating natural, culturally authentic speech in Nigerian-accented English and in Yoruba, Igbo, and Hausa.
## Implementation Details

The model was trained using PyTorch on publicly available Nigerian movies, podcasts, and open-source Nigerian-related audio data. It employs a WavTokenizer for audio processing, with audio files resampled to 24 kHz. Training ran for 5 epochs with a batch size of 4, using the AdamW optimizer and a linear learning-rate schedule with warmup.
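The model card states only the schedule family, not its hyperparameters; a minimal sketch of a linear schedule with warmup as conventionally implemented (e.g. by `transformers`' `get_linear_schedule_with_warmup`), where the warmup step count, total step count, and peak learning rate below are assumed values for illustration:

```python
def linear_schedule_with_warmup(step, warmup_steps, total_steps):
    """Learning-rate scale factor at a given optimizer step:
    ramps linearly from 0 to 1 over warmup, then decays linearly to 0."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

# Assumed numbers, not from the model card: 500 warmup steps out of 50,000 total
total_steps = 50_000
warmup_steps = 500
peak_lr = 3e-4  # assumed peak learning rate

lrs = [peak_lr * linear_schedule_with_warmup(s, warmup_steps, total_steps)
       for s in range(total_steps)]
```

The warmup phase avoids large, destabilizing updates while AdamW's moment estimates are still noisy; the linear decay then anneals the step size toward zero by the end of training.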
- Sampling rate: 24 kHz
- Multiple voice options for each supported language
- Temperature and repetition penalty controls for output customization
- Integrated with standard transformers pipeline
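The model card exposes temperature and repetition penalty as generation controls but does not document their internals; a minimal sketch of how these two knobs are conventionally applied to next-token logits (following the `transformers` convention, where a penalty > 1 shrinks positive logits of already-generated tokens and amplifies negative ones):

```python
import math

def apply_sampling_controls(logits, generated_ids, temperature=1.0,
                            repetition_penalty=1.0):
    """Adjust next-token logits: penalize tokens already generated,
    then divide by temperature (lower temperature = sharper distribution)."""
    adjusted = list(logits)
    for tok in set(generated_ids):
        # transformers-style penalty: push penalized logits toward less likely
        if adjusted[tok] > 0:
            adjusted[tok] /= repetition_penalty
        else:
            adjusted[tok] *= repetition_penalty
    return [x / temperature for x in adjusted]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# Token 0 was already generated, so a penalty > 1 lowers its probability
logits = [2.0, 1.0, 0.5]
probs = softmax(apply_sampling_controls(logits, [0],
                                        temperature=0.8,
                                        repetition_penalty=1.2))
```

In speech-token generation, the repetition penalty discourages the model from looping on the same audio codes, while temperature trades off prosodic variety against stability.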
## Core Capabilities
- Nigerian-accented English synthesis
- Native language support for Yoruba, Igbo, and Hausa
- Multiple voice options per language
- High-quality, natural-sounding speech output
- Culturally relevant pronunciation and intonation
## Frequently Asked Questions

**Q: What makes this model unique?**
YarnGPT2 stands out for its specialized focus on Nigerian languages and accents, offering a comprehensive solution for generating authentic Nigerian speech without complex architectural requirements.
**Q: What are the recommended use cases?**

The model is well suited to generating Nigerian-accented English speech for experimental purposes, content localization, and educational applications. It is not suitable for languages or accents outside its training scope.