BERT Base Uncased SST-2 (Knowledge Distilled)
| Property | Value |
|---|---|
| Author | Yoshitomo Matsubara |
| Publication | EMNLP 2023 Workshop (NLP-OSS) |
| GLUE Score | 78.9 |
| Framework | torchdistill with Hugging Face |
What is bert-base-uncased-sst2_from_bert-large-uncased-sst2?
This model is a fine-tuned version of BERT-base-uncased optimized for the SST-2 (Stanford Sentiment Treebank) sentiment analysis task. What makes it unique is its training approach: it is trained with knowledge distillation, using a fine-tuned BERT-large-uncased model as the teacher.
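As a quick illustration, the checkpoint can be used like any Hugging Face sequence-classification model. This is a minimal sketch; the repository id below is assumed from the model name above and the label names depend on the checkpoint's config, so verify both on the Hub before relying on them.

```python
# Minimal sketch: load the distilled student as an ordinary text classifier.
# The repository id is an assumption based on the model name; verify it on the Hub.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="yoshitomo-matsubara/bert-base-uncased-sst2_from_bert-large-uncased-sst2",
)

# Returns a list of dicts with a label and a confidence score;
# which label id maps to "positive" is defined by the checkpoint's config.
print(classifier("A charming and often surprisingly funny film."))
```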
Implementation Details
The implementation leverages the torchdistill framework integrated with Hugging Face libraries, demonstrating a coding-free, configuration-driven approach to knowledge distillation for NLP tasks. Training was conducted on Google Colab, keeping the setup accessible and reproducible. A simplified version of the distillation objective is sketched after the list below.
- Uses knowledge distillation from BERT-large (teacher) to BERT-base (student)
- Fine-tuned specifically for sentiment analysis on the SST-2 dataset
- Implements a reproducible training pipeline using torchdistill
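To make the teacher-student setup concrete, here is a minimal sketch of a standard distillation objective: a KL-divergence term between temperature-softened teacher and student logits combined with ordinary cross-entropy on the SST-2 labels. This illustrates the general technique only; it is not torchdistill's actual implementation or configuration, and the temperature and weighting values are placeholder assumptions.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Generic KD objective: softened teacher/student KL term + hard-label CE.

    `temperature` and `alpha` are illustrative placeholders, not the values
    used to train this checkpoint.
    """
    # Soft targets: KL divergence between temperature-scaled distributions.
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    kd_term = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * (temperature ** 2)

    # Hard targets: ordinary cross-entropy against the SST-2 labels.
    ce_term = F.cross_entropy(student_logits, labels)

    return alpha * kd_term + (1.0 - alpha) * ce_term
```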
Core Capabilities
- Sentiment analysis on SST-2 dataset
- Efficient performance through knowledge distillation
- Balanced trade-off between model size and accuracy
- Achieves a GLUE score of 78.9
Frequently Asked Questions
Q: What makes this model unique?
This model demonstrates how knowledge distillation can be effectively used to transfer knowledge from a larger BERT model to a smaller one while maintaining good performance. The implementation is notable for its use of torchdistill and its coding-free approach.
Q: What are the recommended use cases?
The model is designed for sentiment analysis tasks, particularly those similar to the SST-2 dataset. It is well suited to applications that need efficient sentiment classification while maintaining reasonable accuracy.
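For applications that classify many texts at once, batched inference with the tokenizer and model loaded directly is a reasonable pattern. The sketch below assumes the same Hub repository id as in the earlier example; the meaning of the predicted label ids depends on the checkpoint's config.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed repository id; verify on the Hugging Face Hub before use.
MODEL_ID = "yoshitomo-matsubara/bert-base-uncased-sst2_from_bert-large-uncased-sst2"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

texts = [
    "The plot is thin, but the performances are wonderful.",
    "A tedious, joyless two hours.",
]

# Batched inference: tokenize together, run one forward pass, take the argmax.
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
predictions = logits.argmax(dim=-1).tolist()

# Which id corresponds to positive/negative is defined by the checkpoint's config.
print(predictions)
```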