# bert-toxic-comment-classification
| Property | Value |
|---|---|
| License | AFL-3.0 |
| Language | English |
| Framework | PyTorch / Transformers |
| Downloads | 19,052 |
| Performance | 0.95 AUC |
## What is bert-toxic-comment-classification?
This model is a BERT-based classifier for identifying toxic comments in English text. Built on the bert-base-uncased architecture, it was fine-tuned on data from the Jigsaw Unintended Bias in Toxicity Classification competition on Kaggle and reaches a 0.95 AUC score on the test set.
## Implementation Details
The implementation builds on the Transformers library and integrates easily into existing pipelines, as the sketch after the list below shows. The model uses BERT's sequence classification capabilities, with a binary classification head for toxic/non-toxic prediction.
- Built on bert-base-uncased architecture
- Uses TextClassificationPipeline for inference
- Trained on 90% of the Jigsaw competition dataset
- Binary classification output (toxic/non-toxic)
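
A minimal inference sketch follows. It assumes the model is published on the Hugging Face Hub; the repo id below is a placeholder, and the exact label strings depend on the model's config, so treat both as assumptions to verify against the actual checkpoint.

```python
# Minimal inference sketch using TextClassificationPipeline.
# NOTE: the repo id is a placeholder -- substitute the actual Hub id.
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    TextClassificationPipeline,
)

model_id = "your-org/bert-toxic-comment-classification"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

classifier = TextClassificationPipeline(model=model, tokenizer=tokenizer)
print(classifier("Thanks for the thoughtful reply!"))
# e.g. [{'label': 'non-toxic', 'score': 0.99}] -- label names come from the
# model config and may differ for this checkpoint.
```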
## Core Capabilities
- Toxic comment detection in English text
- Strong discriminative performance (0.95 AUC on the test set)
- Simple integration with Transformers pipeline
- Suitable for content moderation systems
## Frequently Asked Questions
**Q: What makes this model unique?**
This model stands out for its strong performance in toxic comment detection, achieved through fine-tuning bert-base-uncased on the large-scale dataset from Kaggle's Jigsaw competition. Its more than 19,000 downloads point to adoption in real-world applications.
**Q: What are the recommended use cases?**
The model is ideal for content moderation systems, social media platforms, and any application requiring automatic detection of toxic or inappropriate comments. It can be easily integrated into existing pipelines using the Transformers library.
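
As an illustration of the moderation use case, here is a sketch of a filter built on the pipeline API. The repo id, the `toxic` label name, and the 0.5 score threshold are all assumptions to adapt to the actual checkpoint and your moderation policy.

```python
# Hypothetical moderation filter: flags comments the model scores as toxic.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="your-org/bert-toxic-comment-classification",  # placeholder repo id
)

def flag_toxic(comments, threshold=0.5):
    """Return the subset of comments scored as toxic above the threshold."""
    results = classifier(comments)
    return [
        comment
        for comment, result in zip(comments, results)
        # Label name is assumed; check the model config for the real strings.
        if result["label"] == "toxic" and result["score"] >= threshold
    ]

moderation_queue = ["Have a nice day!", "You are an idiot."]
print(flag_toxic(moderation_queue))
```

In a production moderation system, flagged comments would typically be routed to human review rather than removed outright, since a 0.95 AUC still implies a non-trivial rate of false positives at any operating threshold.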