stt_uk_citrinet_1024_gamma_0_25

Maintained By
nvidia

NVIDIA Streaming Citrinet 1024 (Ukrainian)

Property                  Value
Parameter Count           141M
License                   CC-BY-4.0
Architecture              Citrinet-CTC
Paper                     Citrinet Paper
WER (Common Voice 10.0)   5.02%
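
For readers unfamiliar with the metric, word error rate (WER) is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. A WER of 5.02% therefore means roughly five word errors per hundred reference words. The function below is a minimal sketch of that definition; the name word_error_rate and the lowercase/whitespace normalization are illustrative assumptions, not the evaluation script used for the figure above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

For a whole test set, WER is usually computed as total errors over total reference words across all utterances rather than as an average of per-utterance values.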

What is stt_uk_citrinet_1024_gamma_0_25?

This is a non-autoregressive speech recognition model for Ukrainian. Built on NVIDIA's Citrinet architecture, it was fine-tuned from a pre-trained Russian model using a Cross-Language Transfer Learning approach. The model takes 16 kHz mono-channel audio as input and outputs transcribed text in the lowercase Ukrainian alphabet.
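
Because the model expects 16 kHz mono audio, it can help to check files before transcribing them. The sketch below uses Python's standard wave module for the check and then loads the model through NeMo. The helper names is_model_ready_wav and transcribe_files are hypothetical, and the Hugging Face identifier nvidia/stt_uk_citrinet_1024_gamma_0_25 is an assumption based on the model name and maintainer listed above.

```python
import wave

EXPECTED_RATE = 16_000   # Hz, per the model description
EXPECTED_CHANNELS = 1    # mono


def is_model_ready_wav(path: str) -> bool:
    """Return True if the WAV file is 16 kHz and single-channel."""
    with wave.open(path, "rb") as wav:
        return (wav.getframerate() == EXPECTED_RATE
                and wav.getnchannels() == EXPECTED_CHANNELS)


def transcribe_files(paths, model_name="nvidia/stt_uk_citrinet_1024_gamma_0_25"):
    """Validate the inputs, then transcribe them with NeMo.

    Requires the NeMo ASR collection to be installed. The import is kept
    inside the function so the format check above works without NeMo.
    """
    import nemo.collections.asr as nemo_asr

    bad = [p for p in paths if not is_model_ready_wav(p)]
    if bad:
        raise ValueError(f"resample to 16 kHz mono first: {bad}")

    # model_name is an assumed identifier; adjust it if the model is
    # published under a different name.
    model = nemo_asr.models.EncDecCTCModelBPE.from_pretrained(model_name)
    # The return type (plain strings or hypothesis objects) depends on
    # the installed NeMo version.
    return model.transcribe(paths)
```

Files that fail the check would need to be resampled and downmixed with an audio tool of your choice before being passed to the model.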

Implementation Details

The model uses the Citrinet-1024 architecture and was trained for 1000 epochs with the NeMo toolkit. The training data was 69 hours of validated speech from Mozilla Common Voice Corpus 10.0, excluding the dev and test splits. The model employs a SentencePiece Unigram tokenizer with a vocabulary size of 1024.
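
To illustrate what a Unigram tokenizer does at inference time, here is a toy sketch: given a vocabulary of subword pieces with log-probabilities, it picks the segmentation with the highest total score using dynamic programming. The function name unigram_segment and the tiny vocabulary are invented for illustration; the real tokenizer has 1024 pieces learned by SentencePiece, and its actual pieces and scores are not shown here.

```python
import math


def unigram_segment(text, log_probs):
    """Split text into the highest-scoring sequence of known pieces."""
    n = len(text)
    best = [-math.inf] * (n + 1)  # best[i]: top score for text[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last piece
    best[0] = 0.0

    for end in range(1, n + 1):
        for start in range(end):
            piece = text[start:end]
            if piece not in log_probs:
                continue
            score = best[start] + log_probs[piece]
            if score > best[end]:
                best[end] = score
                back[end] = start

    if best[n] == -math.inf:
        raise ValueError("text cannot be segmented with this vocabulary")

    # Walk the back-pointers from the end to recover the pieces.
    pieces = []
    end = n
    while end > 0:
        start = back[end]
        pieces.append(text[start:end])
        end = start
    return pieces[::-1]


# Toy vocabulary with made-up scores. SentencePiece marks the start of a
# word with the "▁" symbol.
TOY_VOCAB = {
    "▁при": -2.0, "віт": -2.5, "▁": -3.0,
    "п": -5.0, "р": -5.0, "и": -5.0, "в": -5.0, "і": -5.0, "т": -5.0,
}

print(unigram_segment("▁привіт", TOY_VOCAB))
```

With this toy vocabulary the two longer pieces outscore any character-by-character split, which is the general reason a Unigram model prefers frequent multi-character pieces.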

  • Non-autoregressive architecture optimized for streaming
  • CTC loss/decoding implementation
  • Supports production deployment through NVIDIA Riva
  • Compatible with PyTorch framework
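
The CTC decoding mentioned in the list above can be sketched in its simplest (greedy) form: take the best-scoring token for each audio frame, merge consecutive repeats, and drop the blank symbol. This is a generic illustration of greedy CTC decoding, not NeMo's implementation; the function name ctc_greedy_decode and the toy vocabulary are assumptions, and the blank index is passed in explicitly rather than assumed.

```python
def ctc_greedy_decode(frame_scores, vocab, blank_id):
    """Greedy CTC decoding over per-frame score vectors.

    frame_scores: one list of scores per frame, with one score per
                  vocabulary entry plus one for the blank symbol.
    vocab:        list mapping token index to token string.
    blank_id:     index of the CTC blank symbol.
    """
    # Step 1: best-scoring index for every frame.
    best_ids = [max(range(len(frame)), key=frame.__getitem__)
                for frame in frame_scores]

    # Step 2: collapse consecutive repeats, then drop blanks. A blank
    # between two identical tokens keeps them as separate outputs.
    tokens = []
    previous = None
    for idx in best_ids:
        if idx != previous and idx != blank_id:
            tokens.append(vocab[idx])
        previous = idx
    return "".join(tokens)
```

Because every frame is decoded independently of earlier outputs, the model is non-autoregressive, which is what makes chunk-by-chunk streaming practical.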

Core Capabilities

  • Real-time speech transcription in Ukrainian
  • Transcribes clear, standard speech accurately (5.02% WER on Common Voice 10.0)
  • Supports streaming applications
  • Integration with NVIDIA Riva for production deployments

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its efficient implementation of Cross-Language Transfer Learning, leveraging knowledge from a Russian pre-trained model to achieve high accuracy in Ukrainian speech recognition. Its streaming capabilities and integration with NVIDIA Riva make it suitable for production environments.

Q: What are the recommended use cases?

The model is ideal for Ukrainian speech transcription tasks, particularly in applications requiring real-time processing. It's best suited for clear speech in standard Ukrainian, though performance may vary with technical terms or heavy accents.
