SPECTER2 Aug2023 Refresh Base
| Property | Value |
|---|---|
| Model Type | BERT-base-uncased with adapters |
| License | Apache 2.0 |
| Base Model | AllenAI SciBERT |
| Paper | SciRepEval Paper |
What is specter2_aug2023refresh_base?
SPECTER2 is a scientific document embedding model and the successor to SPECTER. This base model is the foundation on which the task-specific adapters are built, and it was trained on more than 6M citation triplets of scientific papers. It produces an embedding for a paper from its title and abstract.
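As a minimal sketch of the title-and-abstract input, the helper below joins the two fields into one string before tokenization. It assumes a BERT-style `[SEP]` separator between title and abstract. `format_paper` and its parameters are hypothetical names, and in practice the separator would normally be read from the tokenizer rather than hard-coded.

```python
def format_paper(title, abstract=None, sep_token="[SEP]"):
    """Join a paper's title and abstract into a single input string.

    sep_token is an assumption (BERT-style separator). A missing abstract
    is treated as an empty string so title-only papers still work.
    """
    title = (title or "").strip()
    abstract = (abstract or "").strip()
    return f"{title}{sep_token}{abstract}"


# Toy records for illustration only.
papers = [
    {"title": "BERT", "abstract": "We introduce a new language representation model."},
    {"title": "Attention Is All You Need", "abstract": None},
]
batch = [format_paper(p["title"], p.get("abstract")) for p in papers]
```

The resulting `batch` is a list of strings that could be passed to a tokenizer with truncation enabled.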
Implementation Details
The model uses a BERT-based architecture with adapter support and is trained in two stages: first as a base model on citation triplets, then with task-specific adapters for different scientific document tasks. Training used a batch size of 1024 for the base model and 256 for the adapters, with fp16 precision.
- Trained on extensive citation datasets with over 6M triplets
- Supports multiple task formats through adapters: Classification, Regression, Proximity, and Adhoc Search
- Implements efficient training with warmup steps and specific learning rates
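To illustrate what training on citation triplets can look like, here is a minimal sketch of a triplet margin loss over (query, cited, uncited) embeddings. The text above does not state the loss function, so the Euclidean distance, the margin of 1.0, and the function names are assumptions made for illustration, not the authors' confirmed setup.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_margin_loss(query, positive, negative, margin=1.0):
    """Penalize triplets where the positive is not closer than the
    negative by at least `margin` (margin value is an assumption)."""
    gap = l2_distance(query, positive) - l2_distance(query, negative)
    return max(0.0, gap + margin)


# Toy 2-D embeddings: the cited paper sits near the query, the uncited one far away.
query = [0.0, 0.0]
cited = [0.0, 1.0]
uncited = [3.0, 4.0]

easy = triplet_margin_loss(query, cited, uncited)  # well-separated triplet
hard = triplet_margin_loss(query, uncited, cited)  # roles swapped, so it is penalized
```

A well-separated triplet contributes zero loss, while a triplet whose positive is farther away than its negative contributes a loss that grows with the violation.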
Core Capabilities
- Generate task-specific embeddings for scientific documents when paired with the matching adapter
- Embed a paper from its combined title and abstract
- Support multiple downstream tasks through adapter modules
- Reported state-of-the-art performance on citation recommendation tasks
Frequently Asked Questions
Q: What makes this model unique?
SPECTER2 uses an adapter-based architecture: a single shared base model is combined with small task-specific adapters, so each task can be optimized without retraining the base. It is reported to achieve state-of-the-art performance on the SciRepEval benchmark and on MDCR citation recommendation tasks.
Q: What are the recommended use cases?
The model is suited to scientific document embedding tasks, including paper classification, regression, proximity-based tasks such as link prediction, and ad-hoc search. Each task type has its own adapter, which can be loaded on top of the base model as needed.
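For proximity and ad-hoc search use cases, embeddings are typically compared with a similarity measure and the candidates are ranked. The sketch below uses cosine similarity over toy 3-D vectors. The metric, the function names, and the vectors are all assumptions for illustration; real embeddings would come from the model and have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query_vec, candidates):
    """Return (paper_id, score) pairs sorted from most to least similar."""
    scored = [(pid, cosine_similarity(query_vec, vec)) for pid, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy stand-ins for paper embeddings.
query_vec = [1.0, 0.0, 0.0]
candidates = {
    "paper_a": [0.9, 0.1, 0.0],
    "paper_b": [0.0, 1.0, 0.0],
    "paper_c": [-1.0, 0.0, 0.0],
}
ranking = rank_candidates(query_vec, candidates)
```

For large collections, the same ranking step would usually be delegated to an approximate nearest-neighbor index rather than a full scan.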