whisper-SV

Maintained by: SebLih


| Property  | Value          |
|-----------|----------------|
| License   | Apache 2.0     |
| Framework | PyTorch 1.13.0 |
| Language  | Swedish        |

What is whisper-SV?

Whisper-SV is a Swedish speech recognition model built on OpenAI's whisper-small architecture. It was fine-tuned on Swedish data from the Common Voice 11.0 dataset to improve Swedish transcription quality. The model is implemented with the Transformers library and was trained with native AMP (Automatic Mixed Precision) for faster, more memory-efficient training.

Implementation Details

The model was fine-tuned with a learning rate of 1e-05 and a total train batch size of 16. Training used the Adam optimizer with betas=(0.9, 0.999) and a linear learning rate scheduler with 500 warmup steps.

  • Gradient accumulation steps: 2
  • Training steps: 200
  • Evaluation batch size: 8
  • Seed: 42
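The hyperparameters above can be collected into one configuration sketch. The field names loosely follow transformers' Seq2SeqTrainingArguments API but are not copied from the original training script, and the per-device batch size of 8 is inferred from a total batch size of 16 with 2 gradient accumulation steps on a single device (an assumption):

```python
# Sketch of the training configuration described in this card.
# Names mirror transformers' Seq2SeqTrainingArguments; values come from the card,
# except per_device_train_batch_size, which is derived (16 total / 2 accumulation).
training_config = {
    "learning_rate": 1e-05,
    "per_device_train_batch_size": 8,   # assumed: 8 x 2 accumulation = 16 total
    "gradient_accumulation_steps": 2,
    "per_device_eval_batch_size": 8,
    "max_steps": 200,
    "warmup_steps": 500,
    "lr_scheduler_type": "linear",
    "adam_beta1": 0.9,                  # Adam betas = (0.9, 0.999)
    "adam_beta2": 0.999,
    "fp16": True,                       # native AMP training
    "seed": 42,
}

# Effective train batch size = per-device batch x accumulation steps
effective_batch = (
    training_config["per_device_train_batch_size"]
    * training_config["gradient_accumulation_steps"]
)
print(effective_batch)  # 16
```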

Core Capabilities

  • Swedish speech recognition
  • Integration with HuggingFace's ASR pipeline
  • Support for TensorBoard logging
  • Inference endpoint compatibility
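The ASR pipeline integration above can be sketched as a minimal helper. The Hub model id "SebLih/whisper-SV" is an assumption based on the maintainer and model name (verify it on the Hub before use), and transformers plus torch must be installed to actually run inference:

```python
MODEL_ID = "SebLih/whisper-SV"  # assumed Hub id (maintainer/model name)

def transcribe(audio_path: str) -> str:
    """Transcribe a Swedish audio file with the fine-tuned model."""
    # Lazy import so the sketch can be loaded without transformers installed.
    # Requires: pip install transformers torch
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    return asr(audio_path)["text"]
```

Usage: `transcribe("sample.wav")` returns the transcribed Swedish text as a string.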

Frequently Asked Questions

Q: What makes this model unique?

This model specializes in Swedish: it is fine-tuned on Swedish speech data from Common Voice 11.0 while retaining the robustness of the whisper-small architecture.

Q: What are the recommended use cases?

The model is particularly suited for Swedish automatic speech recognition tasks, transcription services, and applications requiring Swedish language audio processing.
