SPECTER2
| Property | Value |
|---|---|
| Model Type | BERT-base-uncased + adapters |
| License | Apache 2.0 |
| Paper | SciRepEval: A Multi-Format Benchmark for Scientific Document Representations |
| Authors | Amanpreet Singh, Mike D'Arcy, Arman Cohan, Doug Downey, Sergey Feldman |
What is SPECTER2?
SPECTER2 is a scientific document embedding model and the successor to the original SPECTER. It generates task-specific embeddings for scientific papers from their title and abstract, using adapter modules for different tasks such as classification, regression, proximity, and ad-hoc search.
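A minimal usage sketch, assuming the Hugging Face `adapters` package and the publicly hosted `allenai/specter2_base` checkpoint with its proximity adapter (the example papers are placeholders; verify checkpoint names against the current release):

```python
from transformers import AutoTokenizer
from adapters import AutoAdapterModel

# Load the base model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("allenai/specter2_base")
model = AutoAdapterModel.from_pretrained("allenai/specter2_base")

# Attach and activate the proximity adapter (used for similarity/retrieval)
model.load_adapter("allenai/specter2", source="hf", load_as="proximity", set_active=True)

papers = [
    {"title": "BERT", "abstract": "We introduce a new language representation model..."},
    {"title": "Attention is all you need", "abstract": "The dominant sequence transduction models..."},
]

# Concatenate title and abstract, separated by the tokenizer's [SEP] token
text_batch = [p["title"] + tokenizer.sep_token + (p.get("abstract") or "") for p in papers]

inputs = tokenizer(text_batch, padding=True, truncation=True,
                   return_tensors="pt", max_length=512)
output = model(**inputs)

# The final hidden state of the [CLS] token is the document embedding
embeddings = output.last_hidden_state[:, 0, :]
```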
Implementation Details
The model is built on the BERT-base-uncased architecture and trained on over 6M triplets of scientific paper citations. It employs a two-stage training process: the base model is first trained on citation triplets, and task-specific adapters are then trained on the SciRepEval training tasks. A simplified sketch of the triplet objective follows the list below.
- Base training: batch size 1024, learning rate 2e-5, 2 epochs
- Adapter training: batch size 256, learning rate 1e-4, 6 epochs
- Maximum input length: 512 tokens
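The base stage optimizes a citation triplet loss: a paper's embedding is pulled toward a paper it cites and pushed away from an uncited one. A simplified PyTorch sketch, where the `encode` helper, the input texts, and the margin value are illustrative assumptions rather than the authors' exact setup:

```python
import torch.nn.functional as F

def triplet_step(model, tokenizer, query, positive, negative, margin=1.0):
    """One illustrative step on a citation triplet: `query` cites
    `positive`; `negative` is a paper it does not cite."""
    def encode(texts):
        inputs = tokenizer(texts, padding=True, truncation=True,
                           return_tensors="pt", max_length=512)
        # [CLS] embedding as the paper representation
        return model(**inputs).last_hidden_state[:, 0, :]

    q, p, n = encode([query]), encode([positive]), encode([negative])
    # Hinge loss on L2 distances: cited papers should sit closer
    # to the query than uncited ones by at least `margin`
    return F.triplet_margin_loss(q, p, n, margin=margin)
```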
Core Capabilities
- Paper similarity and citation prediction
- Document classification and regression tasks
- Ad-hoc query search
- Proximity-based paper retrieval
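Each capability maps to its own adapter on top of the shared base encoder. A sketch of loading and switching between them, assuming the per-task adapter repositories published alongside the base model (names taken from the Hub; verify against the current release):

```python
from adapters import AutoAdapterModel

model = AutoAdapterModel.from_pretrained("allenai/specter2_base")

# Assumed adapter checkpoints, one per task family
task_adapters = {
    "proximity": "allenai/specter2",                      # similarity / citation prediction
    "classification": "allenai/specter2_classification",  # document classification
    "regression": "allenai/specter2_regression",          # regression tasks
    "adhoc_query": "allenai/specter2_adhoc_query",        # ad-hoc search queries
}

for name, repo in task_adapters.items():
    model.load_adapter(repo, source="hf", load_as=name)

# Activate whichever adapter matches the task at hand
model.set_active_adapters("classification")
```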
Frequently Asked Questions
Q: What makes this model unique?
SPECTER2's distinguishing feature is its adapter-based architecture, which allows task-specific optimization while preserving a strong base model trained on extensive citation data. It achieves state-of-the-art performance on a range of scientific document understanding tasks.
Q: What are the recommended use cases?
The model is recommended for tasks such as academic paper similarity search, citation prediction, paper classification, and scientific literature retrieval. Choose the adapter that matches the task; for similarity search, see the sketch below.
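For similarity search with the proximity adapter, nearest neighbors can be ranked by cosine similarity. A minimal sketch building on the embedding code above, where `query_emb` and `corpus_embs` are hypothetical precomputed embeddings:

```python
import torch
import torch.nn.functional as F

def top_k_similar(query_emb: torch.Tensor, corpus_embs: torch.Tensor, k: int = 5):
    """Rank corpus papers by cosine similarity to a query paper.
    query_emb: (1, 768); corpus_embs: (N, 768)."""
    sims = F.cosine_similarity(query_emb, corpus_embs, dim=-1)  # shape (N,)
    scores, indices = torch.topk(sims, k=min(k, corpus_embs.size(0)))
    return list(zip(indices.tolist(), scores.tolist()))
```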