twitter-roberta-base-offensive

Maintained By
cardiffnlp

Property            Value
Author              cardiffnlp
Base Architecture   RoBERTa-base
Training Data       58M tweets
Benchmark           TweetEval
Paper               TweetEval benchmark (Findings of EMNLP 2020)

What is twitter-roberta-base-offensive?

twitter-roberta-base-offensive is a language model for detecting offensive content in social media text. It is a RoBERTa-base model pre-trained on about 58 million tweets and then fine-tuned for offensive language identification on the corresponding task of the TweetEval benchmark.

Implementation Details

The model uses the RoBERTa-base architecture. Input text is expected to be preprocessed before tokenization: usernames are replaced with a generic @user placeholder, and URLs are replaced with a generic link placeholder. The classifier outputs two labels, offensive and not-offensive, each with a confidence score.

  • Pre-trained on about 58 million tweets
  • Fine-tuned for offensive language identification on TweetEval
  • Expects usernames and URLs to be normalized to placeholders
  • Returns a probability score for each label
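
The username and URL handling described above can be sketched in a few lines of plain Python. This is a minimal illustration, not necessarily the exact preprocessing used by the authors: the function name preprocess and the "http" placeholder for links are assumptions, and splitting on single spaces is a simplification.

```python
def preprocess(text: str) -> str:
    """Replace user mentions and links with generic placeholder tokens.

    Assumptions (not confirmed by this page): links become the literal
    token "http", and tokens are separated by single spaces.
    """
    tokens = []
    for token in text.split(" "):
        if token.startswith("@") and len(token) > 1:
            token = "@user"   # any mention such as @alice -> @user
        elif token.startswith("http"):
            token = "http"    # any link -> generic placeholder (assumed)
        tokens.append(token)
    return " ".join(tokens)


print(preprocess("@alice look at https://example.com now"))
```

Normalizing mentions and links this way keeps the text passed to the tokenizer close to the form seen during training, which is why it is usually applied before classification.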

Core Capabilities

  • Binary classification of text as offensive or not-offensive
  • Handles social media-specific content (mentions, URLs)
  • Real-time text analysis capability
  • Reports per-label confidence scores (for example, 90.73% for not-offensive on a benign sample input; this is a single-example score, not a benchmark accuracy)
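
To make the confidence scores concrete, the sketch below shows how two raw classifier outputs (logits) could be turned into per-label probabilities with a softmax and then ranked. The label order in LABELS, the helper names, and the example logits are assumptions for illustration only; they are not values produced by the model.

```python
import math

# Assumed label order: index 0 = not-offensive, index 1 = offensive.
LABELS = ["not-offensive", "offensive"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_labels(logits):
    """Return (label, probability) pairs, most likely label first."""
    scores = softmax(logits)
    return sorted(zip(LABELS, scores), key=lambda pair: pair[1], reverse=True)


# Hypothetical logits, chosen only to illustrate the ranking step.
for label, score in rank_labels([2.0, -0.3]):
    print(f"{label}: {score:.4f}")
```

Because there are only two labels, the two probabilities are complementary: a not-offensive score of p implies an offensive score of 1 - p.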

Frequently Asked Questions

Q: What makes this model unique?

The model is distinguished by its pre-training on Twitter data and its fine-tuning for offensive language detection, which makes it well suited to social media content moderation.

Q: What are the recommended use cases?

The model is ideal for content moderation systems, social media platforms, online community management, and research applications requiring offensive language detection in social media contexts.
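
In a moderation setting, the offensive score is typically compared against a threshold to decide what happens to a post. The following is a hypothetical sketch of that step: the function name moderation_action, the action names, and the default threshold of 0.5 are all assumptions and would need tuning against real data.

```python
def moderation_action(offensive_score: float, threshold: float = 0.5) -> str:
    """Map an offensive-label probability to a moderation decision.

    Both the threshold and the action names are illustrative assumptions.
    """
    if not 0.0 <= offensive_score <= 1.0:
        raise ValueError("offensive_score must be a probability in [0, 1]")
    if offensive_score >= threshold:
        return "flag_for_review"
    return "allow"


# A stricter threshold flags fewer posts; a looser one flags more.
print(moderation_action(0.8))
print(moderation_action(0.8, threshold=0.9))
```

Routing flagged posts to human review, rather than removing them automatically, is one way to limit the impact of false positives.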
