Voice Safety Classifier
| Property | Value |
|---|---|
| Author | Roblox |
| Base Model | WavLM base plus |
| Training Data | 2,374 hours of voice chat |
| Model URL | HuggingFace |
What is voice-safety-classifier?
The voice-safety-classifier is an audio classification model developed by Roblox for detecting and classifying toxic content in voice chat. It is built on the WavLM base plus architecture and performs multilabel classification, so a single audio clip can be flagged for several categories of policy violation at once.
Implementation Details
The model was trained on 2,374 hours of voice chat audio clips, using a synthetic data pipeline for multilabel classification. It produces outputs across six categories: Profanity, DatingAndSexting, Racist, Bullying, Other, and NoViolation. It achieves a binarized average precision of 94.48% across all toxicity classes.
- Built on the WavLM base plus architecture
- Outputs an n by 6 classification tensor, with one column per category
- Evaluated on 9,795 human-annotated samples
- Can assign several violation labels to the same clip
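As a rough illustration of how an n by 6 output could be turned into labels, the sketch below applies a per-category sigmoid and a threshold to each row. The column order (taken from the order in which the categories are listed above), the assumption that the outputs are raw logits, and the 0.5 threshold are all illustrative assumptions, not details confirmed by the model's documentation.

```python
import math

# Assumed column order: the order in which the six categories are listed above.
LABELS = ["Profanity", "DatingAndSexting", "Racist", "Bullying", "Other", "NoViolation"]


def sigmoid(x: float) -> float:
    """Squash a raw score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode(logits, threshold=0.5):
    """Map an n-by-6 list of raw scores to one list of label names per row.

    Each category is thresholded independently, so a row can yield
    zero, one, or several labels (multilabel classification).
    The threshold value is a hypothetical default.
    """
    results = []
    for row in logits:
        if len(row) != len(LABELS):
            raise ValueError(f"expected {len(LABELS)} scores per row, got {len(row)}")
        probs = [sigmoid(value) for value in row]
        results.append([label for label, p in zip(LABELS, probs) if p >= threshold])
    return results


# Made-up scores for two clips, purely for illustration.
example_logits = [
    [2.0, -3.0, -4.0, 1.0, -5.0, -2.0],
    [-4.0, -4.0, -4.0, -4.0, -4.0, 3.0],
]
predictions = decode(example_logits)
print(predictions)
```

With these made-up scores, the first clip receives two labels and the second receives only NoViolation. In practice a separate threshold per category may work better than a single shared value.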
Core Capabilities
- Profanity detection (49.95% of evaluation dataset)
- Dating and sexting identification (7.02% of evaluation dataset)
- Racist content detection (9.08% of evaluation dataset)
- Bullying recognition (12.82% of evaluation dataset)
- Clean content verification (42.73% of evaluation dataset)
- Detection of other policy violations
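The shares listed above add up to more than 100%, which is consistent with some evaluation clips carrying more than one label. The short calculation below makes that arithmetic explicit and converts each share into an approximate sample count, assuming every percentage is taken over the 9,795 evaluation samples. The counts are derived by rounding and are not reported figures. No share is listed for the Other category, so it is left out.

```python
EVAL_SAMPLES = 9795  # size of the human-annotated evaluation set

# Share of evaluation samples carrying each label, in percent (from the list above).
label_share = {
    "Profanity": 49.95,
    "DatingAndSexting": 7.02,
    "Racist": 9.08,
    "Bullying": 12.82,
    "NoViolation": 42.73,
}

# Approximate number of samples per label, rounded to the nearest integer.
approx_counts = {
    label: round(EVAL_SAMPLES * pct / 100) for label, pct in label_share.items()
}

# Sum of all shares; anything above 100 implies that labels overlap on some clips.
total_share = round(sum(label_share.values()), 2)

for label, count in approx_counts.items():
    print(f"{label}: about {count} samples")
print(f"Total share: {total_share}%")
```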
Frequently Asked Questions
Q: What makes this model unique?
The model is trained on a large-scale, manually curated dataset and reaches a binarized average precision of 94.48% across the toxicity classes. Because it performs multilabel classification, it can flag a single clip for several violation categories at once, which is useful in real voice chat where one clip may contain more than one kind of violation.
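To make the binarized average precision figure quoted in the implementation details more concrete, here is a minimal sketch of how such a metric could be computed. It assumes that each clip's per-category probabilities are collapsed into a single "toxic" score by taking the maximum over the toxicity columns, and that a clip counts as positive if it has any toxicity label. The choice of columns, the max-based collapsing rule, and the sample data are all hypothetical; the source does not specify the exact procedure.

```python
def binarize_scores(prob_rows, toxic_columns=(0, 1, 2, 3)):
    """Collapse per-category probabilities into one toxic score per clip.

    toxic_columns is an assumed set of column indices for the toxicity
    classes; taking the maximum is one possible collapsing rule.
    """
    return [max(row[i] for i in toxic_columns) for row in prob_rows]


def average_precision(y_true, scores):
    """Non-interpolated average precision.

    Clips are ranked by descending score, and the precision measured at the
    rank of each positive clip is averaged over all positive clips.
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, is_positive) in enumerate(ranked, start=1):
        if is_positive:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Made-up probabilities for four clips (six columns each) and made-up ground truth.
probs = [
    [0.9, 0.1, 0.0, 0.2, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.0, 0.9],
    [0.2, 0.1, 0.1, 0.6, 0.0, 0.3],
    [0.7, 0.0, 0.0, 0.1, 0.0, 0.4],
]
is_toxic = [1, 0, 1, 0]

toxic_scores = binarize_scores(probs)
ap = average_precision(is_toxic, toxic_scores)
print(f"binarized average precision: {ap:.4f}")
```

In this toy example the two toxic clips are ranked first and third, so the result is the mean of 1/1 and 2/3.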
Q: What are the recommended use cases?
The model is suited to real-time voice chat moderation on gaming platforms, in online communities, and in educational environments where content safety is crucial. It can be used for automatic content filtering, user protection, and policy enforcement.
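For a real-time setting, one plausible approach is to cut the incoming audio into fixed-length windows and classify each window as it completes. The sketch below shows only that windowing step on a plain list of samples. The 16 kHz sample rate, the 15-second window, and the function name are assumptions chosen for illustration, not settings documented for this model.

```python
def split_into_windows(samples, sample_rate=16000, window_seconds=15.0, hop_seconds=15.0):
    """Split a sequence of audio samples into consecutive windows.

    A hop equal to the window length gives non-overlapping windows; a smaller
    hop gives overlapping ones. The final window may be shorter than the rest.
    All default values are hypothetical.
    """
    window = int(sample_rate * window_seconds)
    hop = int(sample_rate * hop_seconds)
    if window <= 0 or hop <= 0:
        raise ValueError("window_seconds and hop_seconds must be positive")

    windows = []
    for start in range(0, len(samples), hop):
        chunk = samples[start:start + window]
        if chunk:
            windows.append(chunk)
    return windows


# 2.5 seconds of silence at 16 kHz, cut into one-second windows.
audio = [0.0] * 40000
windows = split_into_windows(audio, window_seconds=1.0, hop_seconds=1.0)
print([len(w) for w in windows])
```

Each window would then be passed to the classifier, and the resulting labels could feed whatever filtering or enforcement policy the platform defines. Shorter windows lower latency but give the model less context per decision.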