keyphrase-generation-t5-small-inspec

Maintained by: ml6team

Property           Value
Base Model         T5-small
Training Dataset   Inspec (2000 scientific papers)
Learning Rate      5e-5
Training Epochs    50

What is keyphrase-generation-t5-small-inspec?

This is a keyphrase generation model based on the T5-small architecture and fine-tuned on the Inspec dataset. It is designed to generate keyphrases for scientific papers, particularly in the domains of Computers, Control, and Information Technology. The model can produce both present keyphrases (phrases that appear explicitly in the text) and absent keyphrases (phrases that are conceptually relevant but do not appear verbatim in the text).
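The present/absent distinction can be illustrated with a minimal sketch. It assumes a simple case-insensitive, punctuation-insensitive match on whole words; published evaluations often also apply stemming, which is omitted here. The function names `normalize` and `split_present_absent` are hypothetical and not part of the model or any library.

```python
import re


def normalize(text: str) -> str:
    """Lowercase the text and keep only alphanumeric tokens, joined by single spaces."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def split_present_absent(document: str, keyphrases: list) -> tuple:
    """Split keyphrases into those found verbatim in the document and those not found.

    Assumption: a keyphrase counts as "present" if its normalized tokens occur
    as a contiguous whole-word sequence in the normalized document.
    """
    padded_doc = f" {normalize(document)} "
    present, absent = [], []
    for phrase in keyphrases:
        norm = normalize(phrase)
        if norm and f" {norm} " in padded_doc:
            present.append(phrase)
        else:
            absent.append(phrase)
    return present, absent


abstract = "Keyphrase extraction selects important phrases from a document."
present, absent = split_present_absent(abstract, ["keyphrase extraction", "text mining"])
print("present:", present)
print("absent:", absent)
```

Padding the document with spaces keeps a short phrase such as "extract" from matching inside a longer word such as "extraction".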

Implementation Details

The model takes a text-to-text generation approach: it reads the input text and outputs a semicolon-separated list of keyphrases. It uses the T5-small architecture fine-tuned on scientific abstracts and achieves F1@M scores of 0.32 for extractive (present) keyphrases and 0.07 for abstractive (absent) keyphrases.
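To make the F1@M figures concrete, here is a minimal sketch of how such a score can be computed for one document, assuming F1@M means the F1 score over all keyphrases the model generates (M is the number of predictions) and assuming exact matching after lowercasing. The function name `f1_at_m` is hypothetical, and the reported scores may have been computed with additional normalization such as stemming.

```python
def f1_at_m(predicted: list, gold: list) -> float:
    """F1 between all predicted keyphrases and the gold keyphrases for one document.

    Assumption: phrases match when they are equal after stripping whitespace
    and lowercasing; duplicates are counted once.
    """
    pred = {p.strip().lower() for p in predicted if p.strip()}
    ref = {g.strip().lower() for g in gold if g.strip()}
    correct = len(pred & ref)
    if correct == 0:
        return 0.0
    precision = correct / len(pred)
    recall = correct / len(ref)
    return 2 * precision * recall / (precision + recall)


predicted = ["neural networks", "keyphrase generation", "transformers", "inspec"]
gold = ["keyphrase generation", "neural networks", "scientific abstracts"]
# 2 of 4 predictions are correct (precision 0.5); 2 of 3 gold phrases are found (recall 2/3).
print(round(f1_at_m(predicted, gold), 3))
```

A corpus-level score would typically average this value over all test documents.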

  • Custom Text2TextGenerationPipeline implementation
  • Semicolon-delimited keyphrase output format
  • Optimized for scientific paper abstracts
  • Early stopping with patience of 1 epoch
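The semicolon-delimited output format can be turned into a clean list with a small post-processing step. The sketch below assumes the generated text is a single string such as the one a text-to-text pipeline would return; `parse_keyphrases` is a hypothetical helper, not the model's actual pipeline code.

```python
def parse_keyphrases(generated_text: str, sep: str = ";") -> list:
    """Split generated text on the separator, trim whitespace, and drop
    empty entries and case-insensitive duplicates while keeping order."""
    seen = set()
    keyphrases = []
    for part in generated_text.split(sep):
        phrase = part.strip()
        key = phrase.lower()
        if phrase and key not in seen:
            seen.add(key)
            keyphrases.append(phrase)
    return keyphrases


raw_output = "machine learning; Keyphrase Generation ;machine learning;;"
print(parse_keyphrases(raw_output))
```

Deduplicating here is a design choice: generative models can repeat a phrase, and a repeated phrase adds nothing to the keyphrase list.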

Core Capabilities

  • Automatic keyphrase extraction from scientific texts
  • Generation of both present and absent keyphrases
  • Efficient processing of academic abstracts
  • Semantic understanding of scientific content
  • Batch processing capability for multiple documents

Frequently Asked Questions

Q: What makes this model unique?

This model specializes in keyphrase generation for scientific papers. Unlike traditional statistical methods, it uses a deep learning model that captures semantic dependencies and context in the text. Because it generates its output rather than only selecting spans from the input, it can produce both present and absent keyphrases.

Q: What are the recommended use cases?

The model is best suited to scientific papers, particularly abstracts in the Computer Science and Information Technology domains. Because it was fine-tuned on academic abstracts, it is not recommended for general-domain text or non-English documents.
