whisper-small-gl

mozilla-ai

A Whisper-small model finetuned on Galician speech. It reaches a 13.68% word error rate (WER), down from 40.81% for the baseline model.

Property         Value
Base Model       openai/whisper-small
Training Data    35,141 Galician audio samples
Evaluation WER   13.681%
Model Source     Hugging Face

What is whisper-small-gl?

whisper-small-gl is a speech-to-text model developed by Mozilla.ai and optimized for the Galician language. It is based on OpenAI's Whisper-small architecture and was finetuned on 35,141 Galician audio samples from version 17.0 of the Common Voice dataset.

Implementation Details

Finetuning reduced the Word Error Rate (WER) on Galician from 40.812% for the baseline Whisper-small model to 13.681%, a relative reduction of roughly 66%. Over the same comparison the loss fell from 1.506 to 0.21. The model was trained using Mozilla.ai's speech-to-text-finetune Blueprint.

  • Baseline performance: 40.812% WER, 1.506 loss
  • Finetuned performance: 13.681% WER, 0.21 loss
  • Training dataset: mozilla-foundation/common_voice_17_0
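
For readers unfamiliar with the metric, the sketch below shows how WER is conventionally defined: the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. It is illustrative only. The function names and the sample sentences are made up for this example, and the evaluation behind the figures above may have used a different library and text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, finetuned: float) -> float:
    """Fraction by which the finetuned score improves on the baseline."""
    return (baseline - finetuned) / baseline


# One substituted word out of four reference words.
print(word_error_rate("o gato come peixe", "o gato bebe peixe"))  # 0.25

# Relative WER reduction implied by the reported figures.
print(round(relative_reduction(40.812, 13.681), 3))
```

Applied to the reported numbers, `relative_reduction` gives about 0.66, meaning the finetuned model makes roughly two thirds fewer word errors than the baseline.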

Core Capabilities

  • Accurate transcription of Galician speech
  • Substantially lower error rate than the base model (13.681% vs. 40.812% WER)
  • Optimized for Galician language nuances and pronunciation
  • Suitable for production deployment in Galician-language applications
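
As a rough illustration of the transcription capability, the sketch below loads the model through the Hugging Face `transformers` speech recognition pipeline. The repository id `mozilla-ai/whisper-small-gl` is an assumption pieced together from the author and model names on this page, and `transcribe` is a hypothetical helper name. Running it requires `transformers`, a backend such as PyTorch, and `ffmpeg` for decoding audio files.

```python
# Assumed Hugging Face Hub id, inferred from the author and model name;
# check the actual repository before relying on it.
MODEL_ID = "mozilla-ai/whisper-small-gl"


def transcribe(audio_path: str) -> str:
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so the module can be loaded without transformers.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    result = asr(audio_path)
    return result["text"]


if __name__ == "__main__":
    import sys

    # Usage: python transcribe.py recording.wav
    print(transcribe(sys.argv[1]))
```

The pipeline is rebuilt on every call here for brevity. A long-running service would more likely create it once and reuse it.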

Frequently Asked Questions

Q: What makes this model unique?

The model is finetuned specifically for Galician. Its WER drops from 40.812% with the baseline Whisper-small model to 13.681%, which makes it considerably more reliable for Galician speech recognition tasks.

Q: What are the recommended use cases?

The model is ideal for applications requiring Galician speech transcription, including: automated subtitling systems, voice assistants for Galician speakers, transcription services for Galician media content, and academic or business applications requiring Galician speech-to-text capabilities.
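
For the subtitling use case, one possible approach is to ask the `transformers` pipeline for timestamps (`return_timestamps=True`) and convert the resulting chunks to SRT. The sketch below covers only the conversion step. It assumes each chunk is a dict with a `text` string and a `timestamp` pair of `(start, end)` seconds, which is the shape the pipeline returns under `"chunks"`. The helper names and the sample chunk are made up for illustration.

```python
def format_timestamp(seconds: float) -> str:
    """Render seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def chunks_to_srt(chunks: list[dict]) -> str:
    """Convert timestamped transcription chunks to SRT subtitle text."""
    blocks = []
    for index, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]
        if end is None:
            # The final chunk may lack an end time; fall back to its start.
            end = start
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{chunk['text'].strip()}"
        )
    return "\n\n".join(blocks) + "\n"


# Made-up sample chunk in the assumed pipeline output shape.
sample = [{"text": " Ola mundo", "timestamp": (0.0, 1.5)}]
print(chunks_to_srt(sample))
```

Subtitle players generally expect short cues, so a real subtitling system would probably also split or merge chunks by length before writing the file.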
