whisper-SV

SebLih

A Swedish automatic speech recognition (ASR) model based on whisper-small, fine-tuned with PyTorch on the Common Voice 11.0 dataset.

Property    Value
License     Apache 2.0
Framework   PyTorch 1.13.0
Language    Swedish

What is whisper-SV?

Whisper-SV is a Swedish speech recognition model built on OpenAI's whisper-small architecture. It has been fine-tuned on the Common Voice 11.0 dataset to improve its performance on Swedish speech. The model was trained with the Transformers library using native AMP (automatic mixed precision).

Implementation Details

The model was trained with a learning rate of 1e-05 and a total train batch size of 16. Training used the Adam optimizer with betas=(0.9, 0.999) and a linear learning rate scheduler with 500 warmup steps.

  • Gradient accumulation steps: 2
  • Training steps: 200
  • Evaluation batch size: 8
  • Seed: 42
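
The figures above can be checked with a short sketch. It assumes a single training device (so the per-device batch size is the total batch size divided by the gradient accumulation steps) and that the linear scheduler has the usual warmup-then-decay shape; neither assumption is stated on this page, and the function names are hypothetical.

```python
# Hyperparameters as listed in the model card.
LEARNING_RATE = 1e-5
TOTAL_TRAIN_BATCH_SIZE = 16
GRADIENT_ACCUMULATION_STEPS = 2
WARMUP_STEPS = 500
TRAINING_STEPS = 200


def per_device_batch_size(total, accumulation_steps, num_devices=1):
    """Batch size per forward pass; num_devices=1 is an assumption."""
    return total // (accumulation_steps * num_devices)


def linear_schedule_lr(step, base_lr=LEARNING_RATE,
                       warmup_steps=WARMUP_STEPS,
                       training_steps=TRAINING_STEPS):
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 towards base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: ramp linearly from base_lr down to 0 at training_steps.
    remaining = training_steps - step
    return base_lr * max(0.0, remaining / max(1, training_steps - warmup_steps))
```

Under these assumptions, `per_device_batch_size(16, 2)` gives 8, and `linear_schedule_lr(200)` is about 4e-06: with 200 training steps and 500 warmup steps, the schedule would still be in its warmup phase when training ends, at roughly 40% of the nominal 1e-05.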

Core Capabilities

  • Swedish speech recognition
  • Integration with the Hugging Face Transformers ASR pipeline
  • Support for TensorBoard logging
  • Inference endpoint compatibility
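
As a minimal sketch of the ASR pipeline integration listed above: the snippet assumes the checkpoint is published on the Hugging Face Hub under the ID `SebLih/whisper-SV` (inferred from the author and model name on this page, not confirmed here) and that `transformers` with a PyTorch backend is installed. The `transcribe` helper is hypothetical.

```python
# Assumed Hub ID, inferred from the author and model name on this page.
MODEL_ID = "SebLih/whisper-SV"


def transcribe(audio_path, model_id=MODEL_ID):
    """Transcribe one audio file with the Transformers ASR pipeline."""
    # Imported lazily so this module loads even if transformers is missing.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # The pipeline returns a dict whose "text" key holds the transcription.
    return asr(audio_path)["text"]
```

Calling `transcribe("sample.wav")`, where `sample.wav` is a placeholder for a local Swedish recording, would download the checkpoint on first use and return the transcription as a string. Decoding a file path may additionally require ffmpeg on the system.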

Frequently Asked Questions

Q: What makes this model unique?

The model targets Swedish specifically: it keeps the whisper-small architecture but is fine-tuned on Swedish speech data.

Q: What are the recommended use cases?

The model is suited to Swedish automatic speech recognition, such as transcription services and other applications that process Swedish audio.
