SPECTER2
| Property | Value |
|---|---|
| Model Type | BERT-base-uncased + adapters |
| License | Apache 2.0 |
| Paper | SciRepEval: A Multi-Format Benchmark for Scientific Document Representations |
| Authors | Amanpreet Singh, Mike D'Arcy, Arman Cohan, Doug Downey, Sergey Feldman |
What is SPECTER2?
SPECTER2 is a scientific document embedding model and the successor to the original SPECTER. It generates task-specific embeddings for scientific papers from their title and abstract, using adapter modules for different tasks such as classification, regression, proximity, and ad-hoc search.
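A minimal usage sketch, assuming the Hugging Face `adapters` package and the publicly hosted `allenai/specter2_base` checkpoint with its proximity adapter (the example papers are placeholders; verify checkpoint names against the current release):

```python
from transformers import AutoTokenizer
from adapters import AutoAdapterModel

# Load the base model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("allenai/specter2_base")
model = AutoAdapterModel.from_pretrained("allenai/specter2_base")

# Attach and activate the proximity adapter (used for similarity/retrieval)
model.load_adapter("allenai/specter2", source="hf", load_as="proximity", set_active=True)

papers = [
    {"title": "BERT", "abstract": "We introduce a new language representation model..."},
    {"title": "Attention is all you need", "abstract": "The dominant sequence transduction models..."},
]

# Concatenate title and abstract, separated by the tokenizer's [SEP] token
text_batch = [p["title"] + tokenizer.sep_token + (p.get("abstract") or "") for p in papers]

inputs = tokenizer(text_batch, padding=True, truncation=True,
                   return_tensors="pt", max_length=512)
output = model(**inputs)

# The final hidden state of the [CLS] token is the document embedding
embeddings = output.last_hidden_state[:, 0, :]
```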
Implementation Details
The model is built on the BERT-base-uncased architecture and trained on over 6M triplets of scientific paper citations. It employs a two-stage training process: the base model is first trained on citation triplets, and task-specific adapters are then trained on the SciRepEval training tasks. A simplified sketch of the triplet objective follows the list below.
- Base training: batch size 1024, learning rate 2e-5, 2 epochs
- Adapter training: batch size 256, learning rate 1e-4, 6 epochs
- Maximum input length: 512 tokens
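The base stage optimizes a citation triplet loss: a paper's embedding is pulled toward a paper it cites and pushed away from an uncited one. A simplified PyTorch sketch, where the `encode` helper, the input texts, and the margin value are illustrative assumptions rather than the authors' exact setup:

```python
import torch.nn.functional as F

def triplet_step(model, tokenizer, query, positive, negative, margin=1.0):
    """One illustrative step on a citation triplet: `query` cites
    `positive`; `negative` is a paper it does not cite."""
    def encode(texts):
        inputs = tokenizer(texts, padding=True, truncation=True,
                           return_tensors="pt", max_length=512)
        # [CLS] embedding as the paper representation
        return model(**inputs).last_hidden_state[:, 0, :]

    q, p, n = encode([query]), encode([positive]), encode([negative])
    # Hinge loss on L2 distances: cited papers should sit closer
    # to the query than uncited ones by at least `margin`
    return F.triplet_margin_loss(q, p, n, margin=margin)
```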
Core Capabilities
- Paper similarity and citation prediction
- Document classification and regression tasks
- Ad-hoc query search
- Proximity-based paper retrieval
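Each capability maps to its own adapter on top of the shared base encoder. A sketch of loading and switching between them, assuming the per-task adapter repositories published alongside the base model (names taken from the Hub; verify against the current release):

```python
from adapters import AutoAdapterModel

model = AutoAdapterModel.from_pretrained("allenai/specter2_base")

# Assumed adapter checkpoints, one per task family
task_adapters = {
    "proximity": "allenai/specter2",                      # similarity / citation prediction
    "classification": "allenai/specter2_classification",  # document classification
    "regression": "allenai/specter2_regression",          # regression tasks
    "adhoc_query": "allenai/specter2_adhoc_query",        # ad-hoc search queries
}

for name, repo in task_adapters.items():
    model.load_adapter(repo, source="hf", load_as=name)

# Activate whichever adapter matches the task at hand
model.set_active_adapters("classification")
```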
Frequently Asked Questions
Q: What makes this model unique?
SPECTER2's distinguishing feature is its adapter-based architecture, which allows task-specific optimization while preserving a strong base model trained on extensive citation data. It achieves state-of-the-art performance on a range of scientific document understanding tasks.
Q: What are the recommended use cases?
The model is recommended for tasks such as academic paper similarity search, citation prediction, paper classification, and scientific literature retrieval. Choose the adapter that matches the task; for similarity search, see the sketch below.
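For similarity search with the proximity adapter, nearest neighbors can be ranked by cosine similarity. A minimal sketch building on the embedding code above, where `query_emb` and `corpus_embs` are hypothetical precomputed embeddings:

```python
import torch
import torch.nn.functional as F

def top_k_similar(query_emb: torch.Tensor, corpus_embs: torch.Tensor, k: int = 5):
    """Rank corpus papers by cosine similarity to a query paper.
    query_emb: (1, 768); corpus_embs: (N, 768)."""
    sims = F.cosine_similarity(query_emb, corpus_embs, dim=-1)  # shape (N,)
    scores, indices = torch.topk(sims, k=min(k, corpus_embs.size(0)))
    return list(zip(indices.tolist(), scores.tolist()))
```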