twitter-roberta-base-offensive

cardiffnlp

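RoBERTa-base model pre-trained on roughly 58M tweets and fine-tuned for offensive language detection on the TweetEval benchmark. The 90.73% figure often quoted for it matches the 0.9073 score for the not-offensive label in the model card's sample output. That is a confidence score for one input, not a measured accuracy.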

Property	Value
Author	cardiffnlp
Base Architecture	RoBERTa-base
Training Data	58M tweets
Benchmark	TweetEval
Paper	TweetEval benchmark (Findings of EMNLP 2020)

What is twitter-roberta-base-offensive?

twitter-roberta-base-offensive is a language model for detecting offensive content in social media text. It uses the RoBERTa-base architecture, was pre-trained on about 58 million tweets, and was then fine-tuned for offensive language identification on the TweetEval benchmark.

Implementation Details

The model uses the RoBERTa-base architecture and expects lightly preprocessed input in which usernames are replaced with a generic @user token and URLs with a generic placeholder. It outputs a binary classification, offensive or not-offensive, with a confidence score for each label.

Pre-trained on a massive Twitter dataset
Fine-tuned specifically for offensive content detection
Expects usernames and URLs to be normalized before classification
Provides probability scores for classifications
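
The username and URL handling described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' exact code: the function name `preprocess` and the `http` placeholder for links are assumptions here, and only the `@user` token comes from the description above.

```python
def preprocess(text: str) -> str:
    """Normalize a tweet: mentions become '@user', links become 'http'.

    The 'http' placeholder and the whitespace-based tokenization are
    assumptions made for this sketch.
    """
    tokens = []
    for token in text.split(" "):
        if token.startswith("@") and len(token) > 1:
            token = "@user"   # mask the specific username
        elif token.startswith("http"):
            token = "http"    # mask the specific URL
        tokens.append(token)
    return " ".join(tokens)


print(preprocess("@alice check this out https://example.com/post now"))
# prints: @user check this out http now
```

Splitting on single spaces keeps the original spacing intact. A mention followed directly by punctuation (for example "@alice,") is replaced together with that punctuation, which may or may not be what a production pipeline wants.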

Core Capabilities

Binary classification of text as offensive or not-offensive
Handles social media-specific content (mentions, URLs)
Per-message inference suited to near-real-time use, depending on hardware
Confidence score for each label (for example, 0.9073 for not-offensive on the model card's sample input)
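
As a rough sketch of how the per-label scores might be produced, the code below converts two raw logits into probabilities with a softmax and ranks the labels. The helper names (`softmax`, `rank_labels`, `classify`), the label order in `LABELS`, and the Hugging Face model identifier are assumptions for illustration. `classify` also assumes that the `transformers` and `torch` packages are installed and that the model can be downloaded, so it is defined but not called here.

```python
import math

# Assumed index order of the classifier outputs; verify against the
# label mapping shipped with the model before relying on it.
LABELS = ["not-offensive", "offensive"]


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_labels(logits, labels=LABELS):
    """Return (label, probability) pairs, highest probability first."""
    probs = softmax(logits)
    return sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)


def classify(text, model_name="cardiffnlp/twitter-roberta-base-offensive"):
    """Score one already-preprocessed text (assumed model identifier)."""
    # Imported lazily so the pure-Python helpers work without these packages.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    encoded = tokenizer(text, return_tensors="pt")
    logits = model(**encoded).logits[0].detach().tolist()
    return rank_labels(logits)


# Hypothetical logits, used only to show the shape of the result.
for label, prob in rank_labels([2.0, 0.0]):
    print(f"{label}: {prob:.4f}")
```

A moderation system would typically compare the offensive probability against a threshold chosen on its own validation data rather than simply taking the top-ranked label.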

Frequently Asked Questions

Q: What makes this model unique?

The model combines domain-specific pre-training on Twitter data with task-specific fine-tuning for offensive language detection, which makes it well suited to moderating social media content.

Q: What are the recommended use cases?

The model is suited to content moderation systems, social media platforms, online community management, and research that requires offensive language detection in social media text.