EmTract-DistilBERT Emotion Detection Model
Property | Value |
---|---|
Author | vamossyd |
License | MIT |
Architecture | DistilBERT-base-uncased |
Paper | Available at SSRN 3975884 |
What is emtract-distilbert-base-uncased-emotion?
EmTract is an emotion detection model fine-tuned on approximately 250,000 texts. It assigns each text to one of seven emotion categories: neutral, happy, sad, anger, disgust, surprise, and fear. It was additionally trained on 10,000 hand-tagged messages from StockTwits, which adapts it to emotion analysis of financial social media content.
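The classification described above can be sketched with the Hugging Face `transformers` pipeline. This is a minimal sketch under assumptions, not an official usage snippet: the Hub id `MODEL_ID` is inferred from the author and model name on this card, `top_emotion` and `classify` are hypothetical helper names, and the label strings the model returns may not match the category names listed here.

```python
from typing import Dict, List

# Assumed Hub id, built from the author and model name given on this card.
MODEL_ID = "vamossyd/emtract-distilbert-base-uncased-emotion"


def top_emotion(scores: List[Dict[str, object]]) -> str:
    """Return the label with the highest score from one pipeline result."""
    return max(scores, key=lambda item: item["score"])["label"]


def classify(texts: List[str]) -> List[str]:
    """Return the top emotion label per text (downloads the model on first use)."""
    # Imported lazily so top_emotion can be used without transformers installed.
    from transformers import pipeline

    # top_k=None asks the pipeline for a score for every label.
    classifier = pipeline("text-classification", model=MODEL_ID, top_k=None)
    # Truncate inputs to 64 tokens, the sequence length this card gives for training.
    results = classifier(texts, truncation=True, max_length=64)
    return [top_emotion(scores) for scores in results]
```

Calling `classify(["some StockTwits-style message"])` would return one label per input text. Keeping all scores (rather than only the top label) makes it possible to inspect how confident the model is across the seven categories.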
Implementation Details
The model uses the DistilBERT architecture and was trained with a sequence length of 64 tokens, a learning rate of 2e-5, a batch size of 128, and 8 epochs. Training proceeded in two phases: initial training on the Unify Emotion Datasets, followed by fine-tuning on StockTwits data.
- Optimized for financial social media content analysis
- Trained on a combined dataset of roughly 250K general texts and 10K hand-tagged StockTwits messages
- Supports seven distinct emotion categories
- Built on the efficient DistilBERT architecture
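To make the training parameters concrete, here is a minimal sketch that collects them in one place and derives the number of optimizer steps they imply. The `TrainingConfig` class and its methods are hypothetical names used for illustration. The step counts assume no gradient accumulation and that the final partial batch is kept. The card does not say whether both training phases used the same settings, so the figures are rough.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters as stated on this card (hypothetical container)."""

    max_seq_length: int = 64
    learning_rate: float = 2e-5
    batch_size: int = 128
    num_epochs: int = 8

    def steps_per_epoch(self, num_examples: int) -> int:
        """Optimizer steps per epoch, counting the final partial batch."""
        return math.ceil(num_examples / self.batch_size)

    def total_steps(self, num_examples: int) -> int:
        """Optimizer steps over all epochs."""
        return self.steps_per_epoch(num_examples) * self.num_epochs


config = TrainingConfig()
# The dataset size is approximate ("approximately 250,000 texts").
general_steps = config.total_steps(250_000)
```

In a Hugging Face `Trainer` setup these values would typically map to the `learning_rate`, `per_device_train_batch_size`, and `num_train_epochs` arguments of `TrainingArguments`, with the sequence length applied as `max_length` at tokenization time.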
Core Capabilities
- Emotion classification across seven categories
- Specialized analysis of financial social media content
- Efficient processing with the DistilBERT architecture
- Evaluated with accuracy, precision, recall, and F1-score
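The metrics named above can be computed as in the following sketch. It assumes macro averaging (each class weighted equally) and scores a class with no predictions or no true examples as 0. The card does not state which averaging was used, so treat both as illustrative choices. The `evaluate` function is a hypothetical helper, and `LABELS` copies the category names from this card, which may differ from the model's actual label strings.

```python
from typing import Dict, Sequence

# Category names as listed on this card.
LABELS = ["neutral", "happy", "sad", "anger", "disgust", "surprise", "fear"]


def evaluate(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    labels: Sequence[str] = LABELS,
) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall, and F1."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Undefined ratios (zero denominator) are scored as 0.0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```

For example, `evaluate(["happy", "happy", "sad", "fear"], ["happy", "sad", "sad", "fear"], labels=["happy", "sad", "fear"])` scores a toy set of four made-up predictions over three classes. Macro averaging is a reasonable default when some emotions are much rarer than others, since it keeps rare classes from being drowned out by frequent ones.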
Frequently Asked Questions
Q: What makes this model unique?
Its fine-tuning on hand-tagged StockTwits messages specializes it for financial social media content, which makes it suited to analyzing emotions in financial discussions and market sentiment.
Q: What are the recommended use cases?
The model is well suited to analyzing emotional content in financial social media posts, to market sentiment analysis, and to research relating social media emotions to financial markets, for example around IPO returns and earnings announcements.