BERT Base Uncased SST-2 (Knowledge Distilled)
| Property | Value |
|---|---|
| Author | Yoshitomo Matsubara |
| Publication | EMNLP 2023 Workshop (NLP-OSS) |
| GLUE Score | 78.9 |
| Framework | torchdistill with Hugging Face |
What is bert-base-uncased-sst2_from_bert-large-uncased-sst2?
This model is a fine-tuned version of BERT-base-uncased optimized for the SST-2 (Stanford Sentiment Treebank) sentiment analysis task. What makes it unique is its training approach: it is trained with knowledge distillation, using a fine-tuned BERT-large-uncased model as the teacher.
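As a quick illustration, the checkpoint can be used like any Hugging Face sequence-classification model. This is a minimal sketch; the repository id below is assumed from the model name above and the label names depend on the checkpoint's config, so verify both on the Hub before relying on them.

```python
# Minimal sketch: load the distilled student as an ordinary text classifier.
# The repository id is an assumption based on the model name; verify it on the Hub.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="yoshitomo-matsubara/bert-base-uncased-sst2_from_bert-large-uncased-sst2",
)

# Returns a list of dicts with a label and a confidence score;
# which label id maps to "positive" is defined by the checkpoint's config.
print(classifier("A charming and often surprisingly funny film."))
```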
Implementation Details
The implementation leverages the torchdistill framework integrated with Hugging Face libraries, demonstrating a coding-free, configuration-driven approach to knowledge distillation for NLP tasks. Training was conducted on Google Colab, keeping the setup accessible and reproducible. A simplified version of the distillation objective is sketched after the list below.
- Uses knowledge distillation from BERT-large (teacher) to BERT-base (student)
- Fine-tuned specifically for sentiment analysis on the SST-2 dataset
- Implements a reproducible training pipeline using torchdistill
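To make the teacher-student setup concrete, here is a minimal sketch of a standard distillation objective: a KL-divergence term between temperature-softened teacher and student logits combined with ordinary cross-entropy on the SST-2 labels. This illustrates the general technique only; it is not torchdistill's actual implementation or configuration, and the temperature and weighting values are placeholder assumptions.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Generic KD objective: softened teacher/student KL term + hard-label CE.

    `temperature` and `alpha` are illustrative placeholders, not the values
    used to train this checkpoint.
    """
    # Soft targets: KL divergence between temperature-scaled distributions.
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    kd_term = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * (temperature ** 2)

    # Hard targets: ordinary cross-entropy against the SST-2 labels.
    ce_term = F.cross_entropy(student_logits, labels)

    return alpha * kd_term + (1.0 - alpha) * ce_term
```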
Core Capabilities
- Sentiment analysis on SST-2 dataset
- Efficient performance through knowledge distillation
- Balanced trade-off between model size and accuracy
- Achieves a GLUE score of 78.9
Frequently Asked Questions
Q: What makes this model unique?
This model demonstrates how knowledge distillation can be effectively used to transfer knowledge from a larger BERT model to a smaller one while maintaining good performance. The implementation is notable for its use of torchdistill and its coding-free approach.
Q: What are the recommended use cases?
The model is designed for sentiment analysis tasks, particularly those similar to the SST-2 dataset. It is well suited to applications that need efficient sentiment classification while maintaining reasonable accuracy.
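For applications that classify many texts at once, batched inference with the tokenizer and model loaded directly is a reasonable pattern. The sketch below assumes the same Hub repository id as in the earlier example; the meaning of the predicted label ids depends on the checkpoint's config.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed repository id; verify on the Hugging Face Hub before use.
MODEL_ID = "yoshitomo-matsubara/bert-base-uncased-sst2_from_bert-large-uncased-sst2"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

texts = [
    "The plot is thin, but the performances are wonderful.",
    "A tedious, joyless two hours.",
]

# Batched inference: tokenize together, run one forward pass, take the argmax.
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
predictions = logits.argmax(dim=-1).tolist()

# Which id corresponds to positive/negative is defined by the checkpoint's config.
print(predictions)
```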