wav2vec2-xlsr-persian-speech-emotion-recognition

Maintained By
m3hrdadfi

Property    Value
License     Apache 2.0
Author      m3hrdadfi
Downloads   47,405
Dataset     ShEMO

What is wav2vec2-xlsr-persian-speech-emotion-recognition?

This is a speech emotion recognition model for Persian (Farsi), built on the Wav2Vec 2.0 XLSR architecture. It classifies speech into six emotions: Anger, Fear, Happiness, Neutral, Sadness, and Surprise, with a reported overall accuracy of 90%.

Implementation Details

The model uses the Wav2Vec 2.0 architecture with XLSR (Cross-Lingual Speech Representations), adapted for Persian speech. It passes audio input through a feature extractor and outputs a probability for each emotion class. Reported performance is strongest for Anger (95% F1-score) and Neutral (93% F1-score).

  • Built on the PyTorch framework with Transformers integration
  • Includes a custom feature extraction pipeline
  • Supports standard audio processing libraries (torchaudio, librosa)
  • Provides probability scores for each emotion category
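The last point, per-emotion probability scores, can be illustrated with a minimal sketch. It assumes the classifier head emits one raw score (logit) per emotion and that a softmax turns those scores into probabilities. The label order, the helper names, and the example logits are hypothetical and chosen for illustration only; the real label mapping comes from the model's own configuration.

```python
import math

# Assumed label order, for illustration only; the real mapping is defined by
# the model's configuration and may differ.
EMOTIONS = ["Anger", "Fear", "Happiness", "Neutral", "Sadness", "Surprise"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def score_emotions(logits, labels=EMOTIONS):
    """Pair each label with its probability, highest probability first."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per emotion label")
    probs = softmax(logits)
    return sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)


# Hypothetical logits for a single utterance (not real model output).
example_logits = [3.2, -1.0, 0.4, 1.1, -0.3, 0.0]
ranked = score_emotions(example_logits)
for label, prob in ranked:
    print(f"{label}: {prob:.1%}")
```

Sorting by probability keeps the top prediction first while still exposing the full distribution, which is useful when two emotions score close together.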

Core Capabilities

  • Real-time emotion classification from Persian speech
  • High precision for anger detection (95%)
  • Robust neutral speech recognition (91% precision)
  • Support for multiple audio input formats
  • Easy integration with existing audio processing pipelines
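This page quotes both precision and F1-score for the same classes, so it may help to see how the two relate. The sketch below uses the standard definition of F1 as the harmonic mean of precision and recall, and solves it for recall using the Anger figures quoted here (95% precision, 95% F1). Recall is not reported on this page, so the derived value is only what those rounded figures imply. The function names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def implied_recall(precision, f1):
    """Solve the F1 formula for recall, given precision and F1."""
    denom = 2 * precision - f1
    if denom <= 0:
        raise ValueError("no valid recall exists for these values")
    return f1 * precision / denom


# Anger figures quoted on this page; inputs are rounded, so the result is approximate.
anger_recall = implied_recall(precision=0.95, f1=0.95)
print(f"Implied anger recall: {anger_recall:.2f}")
print(f"Round-trip F1: {f1_score(0.95, anger_recall):.2f}")
```

When precision and F1 are equal, the implied recall equals them as well, which is what the Anger figures suggest.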

Frequently Asked Questions

Q: What makes this model unique?

This model is tuned specifically for Persian speech emotion recognition and reports 90% overall accuracy across six emotional states. It builds on Wav2Vec 2.0 and adapts it to the characteristics of Persian speech.

Q: What are the recommended use cases?

The model is ideal for Persian speech analysis applications, including sentiment analysis systems, automated customer service evaluation, and emotional intelligence research. It's particularly effective in scenarios requiring accurate detection of anger and neutral emotional states.
