visobert

uitnlp

ViSoBERT - State-of-the-art Vietnamese social media language model based on the XLM-R architecture, intended for tasks such as sentiment analysis and hate speech detection.

Property        Value
Paper           EMNLP 2023
Architecture    XLM-R based
Task Type       Fill-Mask, Social Media Text Processing
Language        Vietnamese

What is ViSoBERT?

ViSoBERT is the first monolingual masked language model designed specifically for Vietnamese social media text. Published at EMNLP 2023, it follows the XLM-R architecture and is pre-trained on a large-scale corpus of Vietnamese social media text.

Implementation Details

The model is loaded through the Hugging Face transformers library and requires SentencePiece for tokenization. It runs on PyTorch and follows the XLM-R architecture, with pre-training focused on Vietnamese.

  • Pre-trained on diverse Vietnamese social media texts
  • Requires few dependencies (transformers and SentencePiece, on top of PyTorch)
  • Supports both CPU and GPU inference
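
The loading steps described above can be sketched roughly as follows. This is a minimal sketch, not the authors' reference code: it assumes the checkpoint is published on the Hugging Face Hub under the id `uitnlp/visobert` (built from the author and model names on this page) and that `transformers`, `sentencepiece`, and `torch` are installed. The helper names `pick_device`, `load_encoder`, and `embed` are hypothetical.

```python
MODEL_ID = "uitnlp/visobert"  # assumed Hub id, built from the names on this page


def pick_device(cuda_available: bool) -> str:
    """Return "cuda" when a GPU is available, otherwise "cpu"."""
    return "cuda" if cuda_available else "cpu"


def load_encoder(model_id: str = MODEL_ID):
    """Load the tokenizer and encoder, and move the encoder to a device."""
    # Imported lazily so the heavy dependencies are only needed when called.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)  # needs sentencepiece
    device = pick_device(torch.cuda.is_available())
    model = AutoModel.from_pretrained(model_id).to(device).eval()
    return tokenizer, model, device


def embed(text: str, tokenizer, model, device: str):
    """Return the hidden state at the first token position for one text."""
    import torch

    inputs = tokenizer(text, return_tensors="pt", truncation=True).to(device)
    with torch.no_grad():  # inference only, no gradients needed
        hidden = model(**inputs).last_hidden_state  # (1, tokens, hidden_size)
    return hidden[0, 0].cpu()
```

Calling `load_encoder()` downloads the weights on first use, so it needs network access. After that, `embed("some Vietnamese post", *load_encoder())` should return a single vector that can feed a downstream classifier.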

Core Capabilities

  • Emotion Recognition
  • Hate Speech Detection
  • Sentiment Analysis
  • Spam Reviews Detection
  • Hate Speech Spans Detection
  • Fill-Mask Task Support
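
Most of the capabilities listed above are classification tasks, which usually means attaching a task head to the pre-trained encoder and fine-tuning it on labeled data. The sketch below illustrates that pattern for sentiment analysis. It assumes the Hub id `uitnlp/visobert`; the three-way `LABELS` set and the helper names are hypothetical, and the classification head is freshly initialized, so its predictions carry no meaning until the model has been fine-tuned.

```python
import math

MODEL_ID = "uitnlp/visobert"  # assumed Hub id, built from the names on this page
LABELS = ["negative", "neutral", "positive"]  # hypothetical label set


def softmax(scores):
    """Turn raw logits into probabilities (shifted for numerical stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_prediction(scores, labels=LABELS):
    """Map one row of logits to a (label, probability) pair."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


def build_classifier(model_id: str = MODEL_ID, labels=LABELS):
    """Load the encoder with a new, untrained sequence-classification head."""
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_id,
        num_labels=len(labels),
        id2label=dict(enumerate(labels)),
        label2id={name: i for i, name in enumerate(labels)},
    )
    return tokenizer, model.eval()


def classify(text: str, tokenizer, model, labels=LABELS):
    """Run one text through the classifier and decode the top label."""
    import torch

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return decode_prediction(logits, labels)
```

The same structure would apply to emotion recognition, hate speech detection, or spam review detection by swapping the label set. Hate speech span detection is a token-level task and would need a token-classification head instead.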

Frequently Asked Questions

Q: What makes this model unique?

ViSoBERT is the first monolingual masked language model (MLM) built specifically for Vietnamese social media text. It outperforms earlier monolingual models, multilingual models, and multilingual social media models on a range of downstream tasks.

Q: What are the recommended use cases?

The model is particularly suited for Vietnamese social media text analysis tasks including sentiment analysis, hate speech detection, emotion recognition, and spam detection. It's designed specifically for processing informal and social media content.
