hate_speech_nl

IMSyPP

Dutch hate speech classifier built on BERTje and trained on 20,000 social media posts. It assigns text to one of four classes: acceptable, inappropriate, offensive, or violent.

Property	Value
Author	IMSyPP
Base Model	BERTje
Training Data	20,000 social media posts
Model URL	Hugging Face

What is hate_speech_nl?

hate_speech_nl is a natural language processing model for detecting and classifying hate speech in Dutch social media content. It is built on BERTje, a Dutch-language BERT model, and was trained on 20,000 social media posts from platforms including YouTube, Twitter, and Facebook.

Implementation Details

The model uses the BERTje tokenizer for text preprocessing and assigns each input to one of four classes. It was evaluated on an independent test set of 2,000 posts to check performance across different types of social media content.

Built on the pre-trained BERTje architecture
Trained on social media data from multiple platforms
Uses the BERTje tokenizer for Dutch text
Implements a four-class classification system
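The following is a minimal sketch of how the pieces listed above might fit together using the Hugging Face transformers library. The Hub identifier IMSyPP/hate_speech_nl, the 512-token truncation limit, and the helper names load_classifier and predict_logits are assumptions made for illustration; this card does not confirm them.

```python
from typing import List

# Assumption: the checkpoint is published on the Hugging Face Hub under this ID.
MODEL_ID = "IMSyPP/hate_speech_nl"

# Assumption: BERT-base style models such as BERTje accept at most 512 tokens.
MAX_TOKENS = 512


def load_classifier(model_id: str = MODEL_ID):
    """Load the tokenizer and the sequence classification model.

    Requires the `transformers` and `torch` packages; the first call
    downloads the weights, so it needs network access.
    """
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()  # disable dropout for inference
    return tokenizer, model


def predict_logits(texts: List[str], tokenizer, model) -> List[List[float]]:
    """Tokenize a batch of Dutch texts and return one list of logits per text."""
    import torch

    batch = tokenizer(
        texts,
        padding=True,
        truncation=True,
        max_length=MAX_TOKENS,
        return_tensors="pt",
    )
    with torch.no_grad():  # no gradients needed for inference
        logits = model(**batch).logits
    return logits.tolist()
```

A call would look like `tokenizer, model = load_classifier()` followed by `predict_logits(["Wat een mooie dag."], tokenizer, model)`. If the checkpoint has the four-class head described here, each inner list holds four scores.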

Core Capabilities

Classification of Dutch text into four distinct categories
Real-time analysis of social media content
Graded severity levels rather than a single hate/non-hate label
Classification scale: acceptable (0), inappropriate (1), offensive (2), violent (3)
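The classification scale above can be turned into a small decoding step. This sketch assumes the model emits four raw scores (logits) in the index order listed, which the card implies but does not state outright; the names LABELS, softmax, and decode are illustrative.

```python
import math
from typing import List, Sequence, Tuple

# Index order taken from the classification scale: 0 through 3.
LABELS = ("acceptable", "inappropriate", "offensive", "violent")


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits: Sequence[float]) -> Tuple[int, str, float]:
    """Return (class index, label name, probability) for the top class."""
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} scores, got {len(logits)}")
    probs = softmax(logits)
    index = max(range(len(probs)), key=probs.__getitem__)
    return index, LABELS[index], probs[index]
```

Keeping the probability alongside the label lets downstream code treat low-confidence predictions differently from confident ones.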

Frequently Asked Questions

Q: What makes this model unique?

The model targets Dutch-language content and distinguishes four levels of severity, which makes it useful for content moderation in Dutch-speaking regions. Because it was trained on posts from several social media platforms, it is not limited to the conventions of a single platform.

Q: What are the recommended use cases?

The model is suited to social media platforms, content moderators, and researchers working with Dutch-language content. It can be used for automated content filtering, research on online hate speech, and social media monitoring.
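As one illustration of automated content filtering, the sketch below maps a predicted class and its confidence to a moderation action. The action names, the confidence threshold of 0.7, and the policy itself are hypothetical choices made for this example; they are not part of the model or recommended by its authors.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Class indices as given in the classification scale."""
    ACCEPTABLE = 0
    INAPPROPRIATE = 1
    OFFENSIVE = 2
    VIOLENT = 3


# Hypothetical policy: which action each non-acceptable class triggers.
ACTIONS = {
    Severity.INAPPROPRIATE: "flag",
    Severity.OFFENSIVE: "hide",
    Severity.VIOLENT: "remove",
}


def moderate(class_id: int, confidence: float, min_confidence: float = 0.7) -> str:
    """Pick an action for one post.

    Acceptable posts are always allowed. Other predictions below the
    (assumed) confidence threshold are routed to a human reviewer
    instead of being acted on automatically.
    """
    severity = Severity(class_id)  # raises ValueError for unknown indices
    if severity is Severity.ACCEPTABLE:
        return "allow"
    if confidence < min_confidence:
        return "review"
    return ACTIONS[severity]
```

Routing uncertain predictions to a person rather than acting on them automatically is a common design choice in moderation pipelines, since misclassifications in either direction carry a cost.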