roberta-base-cold

Maintained By
thu-coai

Property          Value
Parameter Count   102M
Model Type        Text Classification
Architecture      RoBERTa (Chinese)
Research Paper    Link
Performance       82.75% accuracy, 82.39% macro-F1

What is roberta-base-cold?

roberta-base-cold is a Chinese language model fine-tuned to detect offensive content in text. It is based on a Chinese RoBERTa model and fine-tuned on COLDataset to classify Chinese text as offensive or non-offensive.

Implementation Details

The model is fine-tuned from the hfl/chinese-roberta-wwm-ext checkpoint and runs on PyTorch. It is loaded with the BertTokenizer and BertForSequenceClassification classes, which makes it straightforward to integrate into existing NLP pipelines.

  • Binary classification output (0 for Non-Offensive, 1 for Offensive)
  • Supports batch processing with padding
  • Takes and returns PyTorch tensors (tokenized inputs in, class logits out)
  • Uses safetensors for model storage
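
The points above can be put together in a minimal sketch. It assumes the model is published on the Hugging Face Hub as thu-coai/roberta-base-cold (inferred from the maintainer and model name on this page) and that the two logits are ordered as [Non-Offensive, Offensive], matching the label IDs listed above. The names ID2LABEL, labels_from_logits, and classify are illustrative helpers, not part of the model's own API.

```python
from typing import List, Sequence

# Label mapping as stated on this page: 0 = Non-Offensive, 1 = Offensive.
ID2LABEL = {0: "Non-Offensive", 1: "Offensive"}


def labels_from_logits(logits: Sequence[Sequence[float]]) -> List[str]:
    """Pick the higher-scoring class for each row of logits."""
    labels = []
    for row in logits:
        best = max(range(len(row)), key=lambda i: row[i])  # argmax over classes
        labels.append(ID2LABEL[best])
    return labels


def classify(texts: Sequence[str],
             model_name: str = "thu-coai/roberta-base-cold") -> List[str]:
    """Classify a batch of Chinese texts (hypothetical helper).

    model_name is an assumed Hub ID. The heavy imports are kept local so
    labels_from_logits stays usable without torch or transformers installed.
    """
    import torch
    from transformers import BertForSequenceClassification, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(model_name)
    model = BertForSequenceClassification.from_pretrained(model_name)
    model.eval()  # disable dropout for inference

    # padding=True pads every text in the batch to the longest one.
    inputs = tokenizer(list(texts), padding=True, truncation=True,
                       return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (batch_size, 2)
    return labels_from_logits(logits.tolist())
```

Calling classify(["今天天气真好"]) downloads the weights on first use and returns one label string per input text.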

Core Capabilities

  • Chinese text classification for offensive content
  • Reported 82.75% accuracy and 82.39% macro-F1 in detecting offensive language
  • Efficient processing with transformer architecture
  • Support for batch inference
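
For larger inputs, batch inference is usually run in fixed-size chunks so that memory use stays bounded. The sketch below shows one way to do that. The classify_batch callable is a hypothetical stand-in for whatever function runs the model on one batch and returns one label ID per text, and the default batch size of 32 is an arbitrary assumption rather than a documented recommendation.

```python
from typing import Callable, Iterator, List, Sequence


def chunks(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def classify_in_batches(
    texts: Sequence[str],
    classify_batch: Callable[[Sequence[str]], List[int]],
    batch_size: int = 32,  # assumed default; tune to available memory
) -> List[int]:
    """Run classify_batch over texts chunk by chunk, preserving input order."""
    labels: List[int] = []
    for batch in chunks(texts, batch_size):
        labels.extend(classify_batch(batch))
    return labels
```

Because the model call is passed in as a function, the chunking logic can be checked with a simple stub before the real model is wired in.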

Frequently Asked Questions

Q: What makes this model unique?

This model specializes in Chinese offensive language detection, with a reported 82.75% accuracy and 82.39% macro-F1, which makes it useful for content moderation and social media analysis in Chinese-language contexts.

Q: What are the recommended use cases?

The model is ideal for content moderation systems, social media platforms, and research applications requiring Chinese text analysis for offensive content. It can be used for automated content filtering, research in online behavior, and social media monitoring.
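
For automated content filtering, one possible approach is to convert the model's two logits into an offensive-class probability and flag text above a chosen threshold. The sketch below assumes the logits are ordered as [Non-Offensive, Offensive]. The 0.5 threshold and the "flag"/"allow" decision names are illustrative assumptions, not values documented for this model.

```python
import math
from typing import Sequence


def offensive_probability(logits: Sequence[float]) -> float:
    """Softmax probability of the Offensive class from two logits."""
    non_offensive, offensive = logits
    shift = max(non_offensive, offensive)  # subtract the max for numerical stability
    e_non = math.exp(non_offensive - shift)
    e_off = math.exp(offensive - shift)
    return e_off / (e_non + e_off)


def moderate(logits: Sequence[float], threshold: float = 0.5) -> str:
    """Return "flag" when the offensive probability reaches the threshold."""
    return "flag" if offensive_probability(logits) >= threshold else "allow"
```

Raising the threshold flags fewer texts and trades recall for precision, so the value is best chosen against a labeled sample from the target platform.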
