SciBERT-NLI
| Property | Value |
|---|---|
| Training Time | ~4 hours on NVIDIA Tesla P100 |
| Base Model | allenai/scibert_scivocab_cased |
| STS Benchmark Score | 74.50 |
| Max Sequence Length | 128 |
What is scibert-nli?
SciBERT-NLI is a specialized language model for scientific text analysis, built on the SciBERT architecture and fine-tuned on the SNLI and MultiNLI datasets. It uses the scivocab wordpiece vocabulary and was trained with average pooling and a softmax loss to produce high-quality sentence embeddings optimized for scientific content.
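A minimal usage sketch with the sentence-transformers library is shown below; the Hub id gsarti/scibert-nli is an assumption, not stated on this card, and should be adjusted to wherever your copy of the model is hosted.

```python
# Hedged usage sketch: the model id "gsarti/scibert-nli" is an assumption;
# swap in your own checkpoint path or Hub id if it differs.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("gsarti/scibert-nli")

sentences = [
    "The protein folds into a stable tertiary structure at physiological pH.",
    "Transformer-based encoders can be fine-tuned for scientific sentence embeddings.",
]

# encode() returns one fixed-size vector per sentence (average-pooled token embeddings).
embeddings = model.encode(sentences)
print(embeddings.shape)  # e.g. (2, 768) for a BERT-base-sized encoder
```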
Implementation Details
The model was trained with a batch size of 64, 20,000 training steps, and 1,450 warmup steps. It lowercases input text and caps sequences at 128 tokens. Training completed in approximately 4 hours on a single NVIDIA Tesla P100 GPU; a configuration sketch follows the list below.
- Implements average pooling strategy
- Uses softmax loss function
- Maintains original scivocab vocabulary
- Optimized for scientific text processing
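The sketch below shows how such a configuration could be reproduced with the sentence-transformers training API. The dataset loading is abbreviated, and the exact script used to train the released model may differ; the base model Hub id (allenai/scibert_scivocab_cased) is the underscore form used by the allenai repositories.

```python
# Hedged training-configuration sketch based on the parameters reported above.
# SNLI/MultiNLI loading is abbreviated to a placeholder example.
from sentence_transformers import SentenceTransformer, models, losses, InputExample
from torch.utils.data import DataLoader

# Wrap SciBERT with average pooling, mirroring the reported setup.
word_embedding = models.Transformer(
    "allenai/scibert_scivocab_cased",
    max_seq_length=128,      # reported maximum sequence length
    do_lower_case=True,      # lowercase text processing, per the description
)
pooling = models.Pooling(
    word_embedding.get_word_embedding_dimension(),
    pooling_mode_mean_tokens=True,   # average pooling strategy
)
model = SentenceTransformer(modules=[word_embedding, pooling])

# NLI premise/hypothesis pairs with 3-way labels (entailment/neutral/contradiction).
train_examples = [
    InputExample(texts=["A premise sentence.", "A hypothesis sentence."], label=0),
    # ... SNLI + MultiNLI examples go here ...
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=64)

# Softmax (classification) loss over the concatenated sentence embeddings.
train_loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=3,
)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    steps_per_epoch=20_000,  # reported number of training steps
    warmup_steps=1_450,      # reported warmup steps
)
```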
Core Capabilities
- Scientific sentence embedding generation
- Natural language inference tasks
- Scientific paper similarity analysis
- Competitive performance on the STS benchmark (score of 74.50; an evaluation sketch follows below)
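As a rough illustration of how a figure like this could be computed, the sketch below runs the sentence-transformers similarity evaluator over the STS benchmark test split. The model id, file path, and tab-separated column layout are assumptions, not taken from this card.

```python
# Hedged evaluation sketch: "gsarti/scibert-nli" and the STS benchmark file
# location/format are assumptions; adjust to your local setup.
import csv
from sentence_transformers import SentenceTransformer, InputExample
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("gsarti/scibert-nli")

# Each row of the standard sts-test.csv holds a gold score in [0, 5] and two sentences.
examples = []
with open("stsbenchmark/sts-test.csv", newline="", encoding="utf-8") as f:
    for row in csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE):
        examples.append(
            InputExample(texts=[row[5], row[6]], label=float(row[4]) / 5.0)
        )

evaluator = EmbeddingSimilarityEvaluator.from_input_examples(examples, name="sts-test")
# Correlation between embedding similarities and gold scores
# (multiply by 100 to compare with the 74.50 figure reported above).
print(evaluator(model))
```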
Frequently Asked Questions
Q: What makes this model unique?
This model combines SciBERT's scientific domain expertise with NLI training, making it particularly effective for scientific text understanding and comparison. Its STS benchmark score of 74.50 approaches that of general-purpose BERT-based sentence embedding models while retaining scientific domain specialization.
Q: What are the recommended use cases?
The model is ideal for scientific paper retrieval, document similarity analysis, and scientific text classification tasks. It has been successfully implemented in applications like the Covid Papers Browser for research paper analysis.
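A hedged retrieval-style example along these lines is sketched below; the model id and the abstracts are placeholders rather than content from this card.

```python
# Hedged paper-retrieval sketch: rank candidate abstracts against a query
# by cosine similarity of their SciBERT-NLI embeddings.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("gsarti/scibert-nli")  # assumed Hub id

query = "Efficacy of mRNA vaccines against emerging coronavirus variants"
abstracts = [
    "We evaluate neutralizing antibody titres elicited by mRNA vaccination ...",
    "A graph neural network approach to protein structure prediction ...",
    "Booster doses restore protection against antigenically drifted variants ...",
]

query_emb = model.encode(query, convert_to_tensor=True)
abstract_embs = model.encode(abstracts, convert_to_tensor=True)

# Cosine similarity between the query and every abstract, highest first.
scores = util.cos_sim(query_emb, abstract_embs)[0]
for abstract, score in sorted(zip(abstracts, scores.tolist()), key=lambda x: -x[1]):
    print(f"{score:.3f}  {abstract[:60]}")
```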