anime-whisper

Maintained by: litagin

Anime Whisper

Property          Value
Parameter Count   756M parameters
Model Type        Automatic Speech Recognition
License           MIT
Base Model        kotoba-tech/kotoba-whisper-v2.0

What is anime-whisper?

Anime Whisper is a Japanese speech recognition model built for anime and game voice acting. It was fine-tuned on more than 5,300 hours of anime-style voice data (about 3.7 million files) and targets the emotional, expressive speech typical of anime content, which general-purpose models tend to transcribe poorly.

Implementation Details

The model is fine-tuned from kotoba-tech/kotoba-whisper-v2.0 in two phases: first only the decoder is trained while the encoder stays frozen, then the entire model is fine-tuned. Training took approximately 11.2 days on an H100 NVL GPU.
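The two-phase schedule can be sketched as a pair of small helpers. This is a minimal illustration, not the author's training script: the function names are hypothetical, and the helpers assume only that the encoder and decoder expose a PyTorch-style `.parameters()` method whose items carry a `requires_grad` flag.

```python
def set_trainable(module, trainable):
    """Enable or disable gradient updates for every parameter of a module."""
    for param in module.parameters():
        param.requires_grad = trainable


def configure_phase(encoder, decoder, phase):
    """Apply the freezing scheme for training phase 1 or 2 (hypothetical helper)."""
    if phase == 1:
        # Phase 1: the encoder is frozen, only the decoder is trained.
        set_trainable(encoder, False)
        set_trainable(decoder, True)
    elif phase == 2:
        # Phase 2: the entire model is fine-tuned.
        set_trainable(encoder, True)
        set_trainable(decoder, True)
    else:
        raise ValueError(f"unknown phase: {phase}")


def count_trainable(*modules):
    """Count the parameters that will currently receive gradient updates."""
    return sum(1 for m in modules for p in m.parameters() if p.requires_grad)
```

With a Hugging Face Whisper checkpoint, the two sub-modules are typically reachable through `model.get_encoder()` and `model.get_decoder()`. The optimizer, learning rates, and phase lengths are not stated in this card and are left out here.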

  • Achieves a 13% Character Error Rate (CER) on anime-domain test data
  • Handles non-verbal expressions such as laughs, sighs, and stutters
  • Places punctuation according to speech rhythm
  • Maintains high accuracy on emotional and expressive speech
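For readers unfamiliar with the CER metric quoted in the list above, the following sketch computes it in the standard way: character-level edit distance divided by the reference length. The card does not say how its 13% figure was produced (for example, whether punctuation was normalized first), so this is the textbook definition rather than the author's evaluation code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character Error Rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# One wrong character in an 8-character reference gives a CER of 0.125.
print(cer("こんにちは、世界", "こんにちわ、世界"))
```

Because Japanese text has no spaces between words, character-level error is the usual choice over word error rate for this kind of model.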

Core Capabilities

  • Accurate transcription of Japanese anime-style voice acting
  • Faithful reproduction of non-verbal utterances
  • Natural punctuation placement
  • Fewer hallucinations than general-purpose models
  • Efficient processing with 756M parameters

Frequently Asked Questions

Q: What makes this model unique?

The model excels in handling anime-style speech patterns, emotional expressions, and non-verbal utterances that other models typically struggle with. It maintains high accuracy while being relatively lightweight compared to larger speech recognition models.

Q: What are the recommended use cases?

This model is ideal for transcribing anime content, visual novels, and game voice acting. It's particularly effective for content with emotional delivery and non-standard speech patterns. Do not pass an initial prompt, however: prompts can degrade its performance.
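A minimal usage sketch with the Hugging Face `transformers` pipeline follows. It rests on several assumptions that do not come from this page: the model id `litagin/anime-whisper` is inferred from the maintainer and model name, the helper names are hypothetical, and the chunk length is an illustrative value. In line with the advice above, no initial prompt (`prompt_ids`) is passed to generation.

```python
MODEL_ID = "litagin/anime-whisper"  # assumed Hugging Face model id


def build_generate_kwargs():
    """Generation settings: fix the language and deliberately set no prompt."""
    return {"language": "ja"}


def transcribe(audio_path, device="cpu"):
    """Transcribe one audio file (hypothetical helper; downloads the model on first use)."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    pipe = pipeline(
        "automatic-speech-recognition",
        model=MODEL_ID,
        device=device,
        chunk_length_s=30.0,  # illustrative value, not taken from this card
    )
    result = pipe(audio_path, generate_kwargs=build_generate_kwargs())
    return result["text"]
```

Calling `transcribe("sample.wav", device="cuda")` would run on a GPU if one is available; `sample.wav` is a placeholder file name.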
