aspire-contextualsentence-multim-compsci

Maintained By
allenai

Property            Value
Author              Allen AI
Paper               Multi-Vector Models with Textual Guidance for Fine-Grained Scientific Document Similarity
Training Data       1.2M computer science paper pairs
Performance (MAP)   41.24 on CSFCube

What is aspire-contextualsentence-multim-compsci?

This is a BERT-based model for fine-grained similarity matching between computer science papers. It represents each document with multiple contextual sentence vectors, one per sentence, obtained by averaging the token representations of that sentence after the encoder has processed the whole input, so each sentence vector retains context from the rest of the document. Document similarity is measured with the Wasserstein distance between the two sets of sentence vectors, which also yields an alignment between the sentences of the two documents.
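The pooling step described above can be sketched in a few lines. This is a minimal illustration, not the released implementation: the function name `sentence_vectors`, the toy token vectors, and the sentence spans are all assumptions, and a real pipeline would take the token vectors from the encoder's final layer.

```python
from typing import List, Sequence, Tuple


def sentence_vectors(
    token_vecs: Sequence[Sequence[float]],
    sentence_spans: Sequence[Tuple[int, int]],
) -> List[List[float]]:
    """Mean-pool token vectors over each [start, end) sentence span."""
    pooled = []
    for start, end in sentence_spans:
        if end <= start:
            raise ValueError("sentence span must contain at least one token")
        span = token_vecs[start:end]
        dim = len(span[0])
        # Average each dimension over the tokens of this sentence.
        pooled.append([sum(tok[d] for tok in span) / len(span) for d in range(dim)])
    return pooled


# Toy stand-in for encoder output: 5 tokens with 2 dimensions each (hypothetical).
tokens = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0], [2.0, 2.0], [4.0, 0.0]]
# Sentence 1 covers tokens 0-1, sentence 2 covers tokens 2-4 (hypothetical spans).
spans = [(0, 2), (2, 5)]
vecs = sentence_vectors(tokens, spans)
print(vecs)  # one vector per sentence
```

Because the token vectors are produced after the encoder has seen the full title and abstract, the averaged sentence vectors are contextual even though the pooling itself is a plain mean.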

Implementation Details

The model encodes each paper's title and abstract into multiple vectors, one per sentence. It was trained with the Adam optimizer at a learning rate of 2e-5, using 1000 warm-up steps followed by linear decay. Training used 1.2 million co-cited paper pairs, with negative examples drawn by in-batch sampling.
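The warm-up and decay schedule can be written as a small function of the step count. The sketch below uses the stated peak rate of 2e-5 and 1000 warm-up steps; the total number of training steps is not given in this description, so `total_steps=10000` is purely a placeholder assumption, as is the function name.

```python
def learning_rate(step: int,
                  peak_lr: float = 2e-5,
                  warmup_steps: int = 1000,
                  total_steps: int = 10000) -> float:
    """Linear warm-up to peak_lr, then linear decay to zero at total_steps.

    total_steps is an assumed value chosen only for illustration.
    """
    if step < warmup_steps:
        # Ramp up from 0 to peak_lr over the warm-up phase.
        return peak_lr * step / warmup_steps
    # Decay linearly from peak_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


for s in (0, 500, 1000, 5500, 10000):
    print(s, learning_rate(s))
```

The rate rises to its peak at step 1000 and then falls in a straight line, reaching zero at the assumed final step.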

  • Contextual sentence vector generation through token averaging
  • Cross-attention preservation during encoding
  • Wasserstein Distance-based similarity matching
  • Sparse alignment learning between document sentences
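To make the distance and alignment steps concrete, here is a rough sketch of a Wasserstein-style distance between two sets of sentence vectors. It substitutes an entropy-regularized Sinkhorn approximation with uniform sentence weights and Euclidean costs; these choices, the function names, and the toy vectors are assumptions for illustration and are not taken from the released code.

```python
import math
from typing import List, Sequence

Vec = Sequence[float]


def pairwise_l2(xs: Sequence[Vec], ys: Sequence[Vec]) -> List[List[float]]:
    """Euclidean distance between every sentence in xs and every sentence in ys."""
    return [[math.dist(x, y) for y in ys] for x in xs]


def sinkhorn_plan(cost: List[List[float]], reg: float = 0.05,
                  n_iters: int = 200) -> List[List[float]]:
    """Approximate transport plan with uniform mass on each sentence."""
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n
    b = [1.0 / m] * m
    kernel = [[math.exp(-c / reg) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def wasserstein_distance(doc_a: Sequence[Vec], doc_b: Sequence[Vec],
                         reg: float = 0.05) -> float:
    """Transport cost under the approximate plan; smaller means more similar."""
    cost = pairwise_l2(doc_a, doc_b)
    plan = sinkhorn_plan(cost, reg)
    return sum(plan[i][j] * cost[i][j]
               for i in range(len(cost)) for j in range(len(cost[0])))


# Toy sentence vectors for two hypothetical documents.
doc_a = [[0.0, 0.0], [1.0, 0.0]]
doc_b = [[0.0, 3.0], [1.0, 3.0]]
print(wasserstein_distance(doc_a, doc_b))
print(sinkhorn_plan(pairwise_l2(doc_a, doc_b)))  # large entries mark aligned sentences
```

The transport plan doubles as a soft alignment: most of the mass sits on the sentence pairs that are closest to each other. With a small `reg` and large costs the exponentials can underflow, so a log-domain variant or an exact solver would be a safer choice outside toy examples.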

Core Capabilities

  • Fine-grained document similarity assessment
  • Aspect-conditional document retrieval
  • Multiple sentence-to-sentence similarity matching
  • Specialized for computer science domain
  • Support for document and sentence classification (with fine-tuning)
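Aspect-conditional retrieval can be pictured as ranking candidates using only the query sentences a user selects. The sketch below is a deliberate simplification: it scores each candidate by the distance from each selected query sentence to its nearest candidate sentence, rather than by a Wasserstein distance, and every name and vector in it is hypothetical.

```python
import math
from typing import Dict, List, Sequence

Vec = Sequence[float]


def aspect_score(query_sents: Sequence[Vec], aspect_idx: Sequence[int],
                 cand_sents: Sequence[Vec]) -> float:
    """Mean distance from each selected query sentence to its nearest candidate sentence."""
    selected = [query_sents[i] for i in aspect_idx]
    return sum(min(math.dist(q, c) for c in cand_sents) for q in selected) / len(selected)


def rank_candidates(query_sents: Sequence[Vec], aspect_idx: Sequence[int],
                    candidates: Dict[str, Sequence[Vec]]) -> List[str]:
    """Return candidate names ordered from best (smallest score) to worst."""
    scored = [(aspect_score(query_sents, aspect_idx, sents), name)
              for name, sents in candidates.items()]
    return [name for _, name in sorted(scored)]


# Toy sentence vectors: the query has two sentences, each standing for one aspect.
query = [[0.0, 0.0], [5.0, 5.0]]
candidates = {
    "paper_a": [[0.0, 0.1], [9.0, 9.0]],
    "paper_b": [[5.0, 5.2], [2.0, 2.0]],
}
print(rank_candidates(query, [0], candidates))  # rank by the first query sentence
print(rank_candidates(query, [1], candidates))  # rank by the second query sentence
```

The same query produces different rankings depending on which sentence is selected, which is the behaviour that distinguishes aspect-conditional retrieval from whole-document similarity.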

Frequently Asked Questions

Q: What makes this model unique?

The model represents each document with multiple vectors while preserving contextual relationships between sentences, which allows more nuanced similarity matching than traditional single-vector approaches. It is reported to outperform baseline models such as SPECTER and all-mpnet-base-v2 on the CSFCube benchmark, where it reaches a MAP of 41.24.
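For readers unfamiliar with the MAP metric used on CSFCube, the following sketch shows how mean average precision is typically computed from ranked relevance labels. It assumes binary labels and that every relevant document appears somewhere in the ranked list; the function names and toy rankings are illustrative and do not reproduce the benchmark's evaluation script.

```python
from typing import List, Sequence


def average_precision(ranked_relevance: Sequence[int]) -> float:
    """Average of precision@k taken at each rank k where a relevant item appears."""
    hits = 0
    total = 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(runs: Sequence[Sequence[int]]) -> float:
    """Mean of the per-query average precision values."""
    return sum(average_precision(r) for r in runs) / len(runs)


# Two hypothetical queries: 1 marks a relevant result, 0 an irrelevant one.
runs: List[List[int]] = [[1, 0, 1], [0, 1]]
print(mean_average_precision(runs))
```

Benchmark results such as the 41.24 above are usually this value expressed as a percentage.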

Q: What are the recommended use cases?

The model is ideal for tasks requiring fine-grained similarity matching between computer science papers, especially when specific aspects or sentences need to be matched. It's particularly useful for research paper retrieval, citation recommendation, and academic search applications within the computer science domain.
