aspire-contextualsentence-multim-compsci

allenai

BERT-based multi-vector model for fine-grained computer science paper similarity using contextual sentence embeddings and Wasserstein Distance matching

Property            Value
Author              Allen AI
Paper               Multi-Vector Models with Textual Guidance for Fine-Grained Scientific Document Similarity
Training Data       1.2M computer science paper pairs
Performance (MAP)   41.24 on CSFCube

What is aspire-contextualsentence-multim-compsci?

This is a BERT-based model for fine-grained similarity matching between computer science papers. It represents each document with multiple contextual sentence vectors, one per sentence, obtained by averaging the token representations within that sentence. Because the tokens are encoded together with the rest of the text, each sentence vector retains context from the surrounding sentences. Document similarity is measured with the Wasserstein distance between the two sets of sentence vectors, which also yields an alignment between the sentences of the two documents.
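The token-averaging step described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function name `sentence_vectors`, the `[start, end)` span format, and the toy 2-dimensional embeddings are assumptions made for illustration. In practice the token vectors would come from the BERT encoder.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def sentence_vectors(
    token_vecs: Sequence[Sequence[float]],
    sentence_spans: Sequence[Tuple[int, int]],
) -> List[Vector]:
    """Average the token vectors inside each [start, end) sentence span.

    Hypothetical helper: token_vecs stands in for contextual token
    embeddings produced by encoding the whole title and abstract at once.
    """
    vectors: List[Vector] = []
    for start, end in sentence_spans:
        if not 0 <= start < end <= len(token_vecs):
            raise ValueError(f"invalid sentence span: ({start}, {end})")
        span = token_vecs[start:end]
        dim = len(span[0])
        # Mean over the tokens of this sentence, one dimension at a time.
        vectors.append([sum(tok[d] for tok in span) / len(span) for d in range(dim)])
    return vectors


# Toy example: 5 tokens with 2-d embeddings, split into two sentences.
tokens = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0], [2.0, 2.0], [4.0, 0.0]]
spans = [(0, 2), (2, 5)]
doc_vectors = sentence_vectors(tokens, spans)
print(doc_vectors)  # one vector per sentence
```

The result is one vector per sentence, so a document with many sentences is represented by many vectors rather than a single pooled embedding.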

Implementation Details

The model takes a paper's title and abstract as input and encodes them into multiple vectors. It was trained with the Adam optimizer at a learning rate of 2e-5, using 1000 warm-up steps followed by linear decay. Training used 1.2 million co-cited paper pairs, with negative examples drawn by in-batch sampling.
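The learning-rate schedule mentioned above (warm-up, then linear decay) can be sketched as a plain function. The peak rate of 2e-5 and the 1000 warm-up steps come from the text; the total number of training steps is not stated, so `total_steps=10000` below is a placeholder assumption, and decaying all the way to zero is also assumed.

```python
def learning_rate(
    step: int,
    peak_lr: float = 2e-5,       # from the text
    warmup_steps: int = 1000,    # from the text
    total_steps: int = 10000,    # assumption: not given in the source
) -> float:
    """Linear warm-up to peak_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay linearly over the remaining steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


for s in (0, 500, 1000, 5500, 10000):
    print(s, learning_rate(s))
```

Under these assumptions the rate peaks at step 1000 and reaches zero at the final step.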

  • Contextual sentence vector generation through token averaging
  • Cross-sentence context preserved during encoding
  • Wasserstein Distance-based similarity matching
  • Sparse alignment learning between document sentences
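To make the Wasserstein distance-based matching listed above concrete, here is a small pure-Python sketch that compares two sets of sentence vectors. Several choices are assumptions not confirmed by the source: uniform weight on every sentence, Euclidean distance as the cost, and an entropy-regularized Sinkhorn iteration as an approximate solver. The function names are hypothetical, and this simple version is not numerically stabilized for large costs.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def sinkhorn_plan(cost: Matrix, reg: float = 0.05, n_iters: int = 200) -> Matrix:
    """Approximate transport plan between two uniformly weighted point sets."""
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n  # assumption: each sentence carries equal mass
    b = [1.0 / m] * m
    kernel = [[math.exp(-c / reg) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def wasserstein_distance(
    doc_a: Sequence[Sequence[float]], doc_b: Sequence[Sequence[float]], reg: float = 0.05
) -> Tuple[float, Matrix]:
    """Return (approximate distance, sentence alignment plan)."""
    cost = [[euclidean(x, y) for y in doc_b] for x in doc_a]
    plan = sinkhorn_plan(cost, reg)
    dist = sum(plan[i][j] * cost[i][j] for i in range(len(doc_a)) for j in range(len(doc_b)))
    return dist, plan


# Toy documents: same two sentence vectors, listed in opposite order.
doc_a = [[1.0, 0.0], [0.0, 1.0]]
doc_b = [[0.0, 1.0], [1.0, 0.0]]
dist, plan = wasserstein_distance(doc_a, doc_b)
print(dist)
print(plan)
```

In this toy case almost all of the mass in the plan sits on the two matching sentence pairs, which is the kind of sparse sentence-to-sentence alignment the list above refers to.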

Core Capabilities

  • Fine-grained document similarity assessment
  • Aspect-conditional document retrieval
  • Multiple sentence-to-sentence similarity matching
  • Specialized for computer science domain
  • Support for document and sentence classification (with fine-tuning)

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is that it produces multiple vectors per document while keeping each sentence vector aware of its surrounding context. This allows finer-grained similarity matching than single-vector approaches, which compress a whole paper into one embedding. On the CSFCube benchmark it is reported to outperform baseline models such as SPECTER and all-mpnet-base-v2.

Q: What are the recommended use cases?

The model is suited to tasks that require fine-grained similarity matching between computer science papers, especially when specific aspects or sentences need to be matched. Typical applications are research paper retrieval, citation recommendation, and academic search within the computer science domain.
