SciBERT-NLI
| Property | Value |
|---|---|
| Training Time | ~4 hours on NVIDIA Tesla P100 |
| Base Model | allenai/scibert_scivocab_cased |
| STS Benchmark Score | 74.50 |
| Max Sequence Length | 128 |
What is scibert-nli?
SciBERT-NLI is a specialized language model for scientific text analysis, built on the SciBERT architecture and fine-tuned on the SNLI and MultiNLI datasets. It uses the scivocab wordpiece vocabulary and was trained with average pooling and a softmax loss to produce high-quality sentence embeddings optimized for scientific content.
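A minimal usage sketch with the sentence-transformers library is shown below; the Hub id gsarti/scibert-nli is an assumption, not stated on this card, and should be adjusted to wherever your copy of the model is hosted.

```python
# Hedged usage sketch: the model id "gsarti/scibert-nli" is an assumption;
# swap in your own checkpoint path or Hub id if it differs.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("gsarti/scibert-nli")

sentences = [
    "The protein folds into a stable tertiary structure at physiological pH.",
    "Transformer-based encoders can be fine-tuned for scientific sentence embeddings.",
]

# encode() returns one fixed-size vector per sentence (average-pooled token embeddings).
embeddings = model.encode(sentences)
print(embeddings.shape)  # e.g. (2, 768) for a BERT-base-sized encoder
```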
Implementation Details
The model was trained with a batch size of 64, 20,000 training steps, and 1,450 warmup steps. It lowercases input text and caps sequences at 128 tokens. Training completed in approximately 4 hours on a single NVIDIA Tesla P100 GPU; a configuration sketch follows the list below.
- Implements average pooling strategy
- Uses softmax loss function
- Maintains original scivocab vocabulary
- Optimized for scientific text processing
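The sketch below shows how such a configuration could be reproduced with the sentence-transformers training API. The dataset loading is abbreviated, and the exact script used to train the released model may differ; the base model Hub id (allenai/scibert_scivocab_cased) is the underscore form used by the allenai repositories.

```python
# Hedged training-configuration sketch based on the parameters reported above.
# SNLI/MultiNLI loading is abbreviated to a placeholder example.
from sentence_transformers import SentenceTransformer, models, losses, InputExample
from torch.utils.data import DataLoader

# Wrap SciBERT with average pooling, mirroring the reported setup.
word_embedding = models.Transformer(
    "allenai/scibert_scivocab_cased",
    max_seq_length=128,      # reported maximum sequence length
    do_lower_case=True,      # lowercase text processing, per the description
)
pooling = models.Pooling(
    word_embedding.get_word_embedding_dimension(),
    pooling_mode_mean_tokens=True,   # average pooling strategy
)
model = SentenceTransformer(modules=[word_embedding, pooling])

# NLI premise/hypothesis pairs with 3-way labels (entailment/neutral/contradiction).
train_examples = [
    InputExample(texts=["A premise sentence.", "A hypothesis sentence."], label=0),
    # ... SNLI + MultiNLI examples go here ...
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=64)

# Softmax (classification) loss over the concatenated sentence embeddings.
train_loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=3,
)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    steps_per_epoch=20_000,  # reported number of training steps
    warmup_steps=1_450,      # reported warmup steps
)
```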
Core Capabilities
- Scientific sentence embedding generation
- Natural language inference tasks
- Scientific paper similarity analysis
- Competitive performance on the STS benchmark (score of 74.50; an evaluation sketch follows below)
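As a rough illustration of how a figure like this could be computed, the sketch below runs the sentence-transformers similarity evaluator over the STS benchmark test split. The model id, file path, and tab-separated column layout are assumptions, not taken from this card.

```python
# Hedged evaluation sketch: "gsarti/scibert-nli" and the STS benchmark file
# location/format are assumptions; adjust to your local setup.
import csv
from sentence_transformers import SentenceTransformer, InputExample
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("gsarti/scibert-nli")

# Each row of the standard sts-test.csv holds a gold score in [0, 5] and two sentences.
examples = []
with open("stsbenchmark/sts-test.csv", newline="", encoding="utf-8") as f:
    for row in csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE):
        examples.append(
            InputExample(texts=[row[5], row[6]], label=float(row[4]) / 5.0)
        )

evaluator = EmbeddingSimilarityEvaluator.from_input_examples(examples, name="sts-test")
# Correlation between embedding similarities and gold scores
# (multiply by 100 to compare with the 74.50 figure reported above).
print(evaluator(model))
```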
Frequently Asked Questions
Q: What makes this model unique?
This model combines SciBERT's scientific domain expertise with NLI training, making it particularly effective for scientific text understanding and comparison. Its STS benchmark score of 74.50 approaches that of general-purpose BERT-based sentence embedding models while retaining scientific domain specialization.
Q: What are the recommended use cases?
The model is ideal for scientific paper retrieval, document similarity analysis, and scientific text classification tasks. It has been successfully implemented in applications like the Covid Papers Browser for research paper analysis.
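A hedged retrieval-style example along these lines is sketched below; the model id and the abstracts are placeholders rather than content from this card.

```python
# Hedged paper-retrieval sketch: rank candidate abstracts against a query
# by cosine similarity of their SciBERT-NLI embeddings.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("gsarti/scibert-nli")  # assumed Hub id

query = "Efficacy of mRNA vaccines against emerging coronavirus variants"
abstracts = [
    "We evaluate neutralizing antibody titres elicited by mRNA vaccination ...",
    "A graph neural network approach to protein structure prediction ...",
    "Booster doses restore protection against antigenically drifted variants ...",
]

query_emb = model.encode(query, convert_to_tensor=True)
abstract_embs = model.encode(abstracts, convert_to_tensor=True)

# Cosine similarity between the query and every abstract, highest first.
scores = util.cos_sim(query_emb, abstract_embs)[0]
for abstract, score in sorted(zip(abstracts, scores.tolist()), key=lambda x: -x[1]):
    print(f"{score:.3f}  {abstract[:60]}")
```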