bertweet-large

vinai

BERTweet-large is a RoBERTa-based language model pre-trained on 850M English Tweets (16B tokens), optimized for Twitter-specific NLP tasks.

Property    Value
License     MIT
Author      VINAI
Downloads   15,530
Paper       View Paper

What is bertweet-large?

BERTweet-large is a language model designed for processing English Tweets. Built upon the RoBERTa architecture, it belongs to the BERTweet family, which its authors describe as the first public large-scale language model pre-trained for English Tweets. The pre-training corpus comprises 850M English Tweets with 16B word tokens (approximately 80GB of data): 845M general Tweets from 2012-2019 and 5M COVID-19 related Tweets.
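As a quick sanity check on the corpus figures above, the arithmetic below derives the averages they imply. It uses only the rounded numbers quoted in this section and treats "80GB" as decimal gigabytes (an assumption), so the results are approximate.

```python
# Rounded corpus figures quoted in the section above.
GENERAL_TWEETS = 845_000_000
COVID_TWEETS = 5_000_000
TOTAL_TOKENS = 16_000_000_000
CORPUS_BYTES = 80 * 10**9  # "approximately 80GB", read as decimal gigabytes (assumption)

total_tweets = GENERAL_TWEETS + COVID_TWEETS
tokens_per_tweet = TOTAL_TOKENS / total_tweets
bytes_per_tweet = CORPUS_BYTES / total_tweets
covid_share = COVID_TWEETS / total_tweets

print(f"total tweets: {total_tweets:,}")
print(f"average tokens per tweet: {tokens_per_tweet:.1f}")
print(f"average bytes per tweet: {bytes_per_tweet:.0f}")
print(f"COVID-19 share of corpus: {covid_share:.2%}")
```

The two subsets sum to the stated 850M Tweets, which works out to roughly 19 word tokens and 94 bytes per Tweet, with the COVID-19 subset making up well under 1% of the corpus.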

Implementation Details

The model follows RoBERTa's pre-training methodology but is trained on Tweets, so it is adapted to Twitter's informal language, such as hashtags, user mentions, and abbreviations. It has been reported to perform well on Twitter-specific NLP tasks, including part-of-speech tagging, named entity recognition, and sentiment analysis.

  • Pre-trained on 850M English Tweets (16B tokens)
  • Based on RoBERTa architecture
  • Includes COVID-19 specific data
  • Optimized for Twitter's linguistic patterns
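A minimal sketch of using the model as a feature extractor, assuming the Hugging Face `transformers` library and PyTorch are installed and that the checkpoint is published under the Hub identifier `vinai/bertweet-large` (an assumption; this page gives only the model and author names). The helper name `embed_tweets` and the `max_length` value are illustrative choices, not part of the model's documentation.

```python
MODEL_ID = "vinai/bertweet-large"  # assumed Hub identifier, not stated on this page


def embed_tweets(tweets, model_id=MODEL_ID, max_length=128):
    """Return one vector per tweet, taken from the first token's final hidden state."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()  # disable dropout for deterministic inference

    batch = tokenizer(
        list(tweets),
        padding=True,
        truncation=True,
        max_length=max_length,  # illustrative cap; tweets are short
        return_tensors="pt",
    )
    with torch.no_grad():
        outputs = model(**batch)
    # last_hidden_state has shape (batch, sequence_length, hidden_size).
    return outputs.last_hidden_state[:, 0, :]
```

Calling `embed_tweets(["some tweet text"])` downloads the weights on first use and returns a tensor with one row per input Tweet. Taking the first token's vector is only one pooling choice; mean pooling over tokens is a common alternative.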

Core Capabilities

  • Part-of-speech tagging
  • Named Entity Recognition (NER)
  • Sentiment Analysis
  • Irony Detection
  • Text Classification for Tweets
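The tasks listed above fall into two fine-tuning shapes: token-level labeling (part-of-speech tagging, NER) and whole-Tweet classification (sentiment, irony, general text classification). The sketch below shows one way to pick a head accordingly, assuming the Hugging Face `transformers` Auto classes. The `TASK_HEADS` table, the helper names, and the `vinai/bertweet-large` identifier are illustrative assumptions, and the head returned by `build_classifier` is randomly initialized until it is fine-tuned on labeled data.

```python
# Map each task to the granularity of its predictions (hypothetical task names).
TASK_HEADS = {
    "pos_tagging": "token",
    "ner": "token",
    "sentiment": "sequence",
    "irony": "sequence",
    "text_classification": "sequence",
}


def head_for_task(task):
    """Return "token" or "sequence" for a known task name."""
    try:
        return TASK_HEADS[task]
    except KeyError:
        raise ValueError(f"unknown task: {task!r}") from None


def build_classifier(task, num_labels, model_id="vinai/bertweet-large"):
    """Load the pre-trained encoder with a fresh, untrained head for `task`."""
    # Imported lazily so the task table can be used without transformers installed.
    from transformers import (
        AutoModelForSequenceClassification,
        AutoModelForTokenClassification,
    )

    if head_for_task(task) == "token":
        # One label per token, e.g. a POS tag or an entity tag.
        return AutoModelForTokenClassification.from_pretrained(
            model_id, num_labels=num_labels
        )
    # One label per tweet, e.g. positive/negative or ironic/not ironic.
    return AutoModelForSequenceClassification.from_pretrained(
        model_id, num_labels=num_labels
    )
```

For example, `build_classifier("irony", num_labels=2)` would return a sequence classifier ready for fine-tuning on a binary irony dataset.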

Frequently Asked Questions

Q: What makes this model unique?

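BERTweet-large is pre-trained specifically on English Tweets rather than on general-domain text, and the BERTweet family is described by its authors as the first public large-scale language model of this kind. This makes it particularly effective for Twitter-specific NLP tasks. Its training data combines general Tweets from 2012-2019 with COVID-19 related Tweets, so it covers both everyday Twitter language and pandemic-era vocabulary.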

Q: What are the recommended use cases?

The model is well suited to Twitter-specific tasks, including sentiment analysis, named entity recognition, part-of-speech tagging, and irony detection. It is most useful where the input is informal social media text whose hashtags, mentions, and abbreviations are poorly handled by models trained on general-domain text.
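Raw Tweets usually need light normalization before they reach the model. The sketch below replaces user mentions and links with the placeholder tokens `@USER` and `HTTPURL`, the convention the BERTweet authors describe for their pre-training data. The regular expressions and the `normalize_tweet` helper are simplified assumptions rather than the authors' exact pipeline, which also handles emoji and tokenization.

```python
import re

# Simplified patterns (assumptions): a mention is "@" plus word characters that
# is not preceded by a word character, so e-mail addresses are left alone.
MENTION_RE = re.compile(r"(?<!\w)@\w+")
URL_RE = re.compile(r"(?:https?://|www\.)\S+")


def normalize_tweet(text):
    """Replace links and user mentions with placeholder tokens."""
    text = URL_RE.sub("HTTPURL", text)    # links first, since URLs may contain "@"
    text = MENTION_RE.sub("@USER", text)  # then user mentions
    return " ".join(text.split())         # collapse runs of whitespace


print(normalize_tweet("@jack  check this out https://t.co/abc123 cc @nasa"))
# prints: @USER check this out HTTPURL cc @USER
```

Applying the same normalization at fine-tuning and inference time keeps the input distribution consistent between the two.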
