hubert-base-persian-speech-gender-recognition

m3hrdadfi

A HuBERT-based model for recognizing speaker gender in Persian speech. It reports 98% accuracy, with 98% precision and recall for both the male and female classes.

Property    Value
License     Apache 2.0
Author      m3hrdadfi
Framework   PyTorch, Transformers
Dataset     shemo

What is hubert-base-persian-speech-gender-recognition?

This is a speech classification model that recognizes speaker gender (male or female) in Persian audio. It is built on the HuBERT architecture and reports 98% overall accuracy on this binary task.

Implementation Details

The model is implemented with PyTorch and the Transformers library on top of the HuBERT architecture. Input audio must be preprocessed before inference: it is loaded with torchaudio, resampled to the sampling rate the feature extractor expects, and converted to model inputs by the Wav2Vec2FeatureExtractor. Inference runs on either CPU or GPU, and the prediction pipeline outputs a probability score for each of the male and female classes.

  • Achieves 98% precision and recall for both male and female classifications
  • Utilizes the Wav2Vec2FeatureExtractor for audio processing
  • Supports various audio input formats through torchaudio
  • Includes built-in resampling capabilities
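The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the author's code: the function names (`scores_from_logits`, `load_mono_waveform`, `predict_gender`) are hypothetical, and loading the model is left to the caller because this page does not name the classification class. The sketch assumes `model` returns a Transformers-style output with a `.logits` attribute and exposes `config.id2label`, and that `feature_extractor` is a `Wav2Vec2FeatureExtractor`.

```python
import math


def scores_from_logits(logits, id2label):
    """Convert raw class logits to {label: probability} with a stable softmax."""
    peak = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {id2label[i]: e / total for i, e in enumerate(exps)}


def load_mono_waveform(path, target_sr):
    """Load an audio file, downmix it to mono, and resample it to target_sr."""
    import torchaudio  # imported lazily so the helper above has no dependencies

    waveform, source_sr = torchaudio.load(path)  # shape: (channels, samples)
    waveform = waveform.mean(dim=0)              # average channels to get mono
    if source_sr != target_sr:
        waveform = torchaudio.functional.resample(waveform, source_sr, target_sr)
    return waveform.numpy()


def predict_gender(path, model, feature_extractor, device="cpu"):
    """Run one file through preprocessing and the model; return class scores.

    Assumption: `model` follows the Transformers convention of returning an
    object with `.logits` and carrying `config.id2label`.
    """
    import torch

    target_sr = feature_extractor.sampling_rate
    speech = load_mono_waveform(path, target_sr)
    inputs = feature_extractor(
        speech, sampling_rate=target_sr, return_tensors="pt", padding=True
    )
    inputs = {name: tensor.to(device) for name, tensor in inputs.items()}

    model = model.to(device).eval()
    with torch.no_grad():
        logits = model(**inputs).logits[0]  # first (and only) item in the batch
    return scores_from_logits(logits.tolist(), model.config.id2label)
```

Passing `device="cuda"` would move both the inputs and the model to a GPU when one is available. The softmax is kept in plain Python only so that it can be checked without installing PyTorch.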

Core Capabilities

  • Binary gender classification (Male/Female) for Persian speech
  • High-accuracy predictions with confidence scores
  • Real-time audio processing and inference
  • Robust performance across different audio qualities
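Because each prediction comes with per-class scores, a caller can decline to label low-confidence inputs. The sketch below shows one way to do that. The function name `decide`, the label strings, and the 0.8 cut-off are illustrative assumptions, not values taken from the model.

```python
def decide(scores, min_confidence=0.8):
    """Pick the highest-scoring label, or None if it falls below the cut-off.

    `scores` maps label -> probability, e.g. the output of a softmax.
    Returns a (label_or_None, confidence) tuple.
    """
    label, confidence = max(scores.items(), key=lambda item: item[1])
    if confidence < min_confidence:
        return None, confidence  # treat as "uncertain" rather than guessing
    return label, confidence
```

A suitable threshold depends on the application and would need to be tuned on held-out audio.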

Frequently Asked Questions

Q: What makes this model unique?

This model targets one task: gender recognition in Persian speech, for which it reports 98% accuracy. It combines the HuBERT architecture with training on the shemo dataset, which makes it well suited to Persian-language applications.

Q: What are the recommended use cases?

The model suits applications that need automated gender recognition from Persian speech, such as voice-based user interfaces, demographic analysis, and audio content classification. It can be used standalone or integrated as one stage in a larger speech processing pipeline.
