triviaqa-t5-base

Maintained by: deep-learning-analytics

Property            Value
Model Type          T5-base Question Answering
Training Dataset    TriviaQA
Exact Match Score   17%
Subset Match Score  24.5%

What is triviaqa-t5-base?

Triviaqa-t5-base is a question-answering model built on the T5-base architecture and fine-tuned for closed-book trivia question answering. Developed by deep-learning-analytics, it answers trivia questions without access to external context, relying solely on the knowledge stored in its parameters during pre-training and fine-tuning.

Implementation Details

The model was trained for 135 epochs with a batch size of 32 and a learning rate of 1e-3. Input questions are limited to a maximum of 25 tokens, and generated answers to 10 tokens. The base model was pre-trained on the Colossal Clean Crawled Corpus (C4), which is derived from Common Crawl, before being fine-tuned on TriviaQA.

  • Built on T5-base architecture
  • Trained for 135 epochs
  • Batch size: 32
  • Learning rate: 1e-3
  • Maximum input length: 25 tokens
  • Maximum output length: 10 tokens
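The length limits above can be applied at inference time as in the following minimal sketch. It assumes the checkpoint is published on the Hugging Face Hub under the id `deep-learning-analytics/triviaqa-t5-base` and that the raw question is passed without a task prefix; both are assumptions, and the helper names (`load_model`, `answer_question`) are hypothetical.

```python
# Assumed Hub id, combined from the maintainer and model name given above.
MODEL_ID = "deep-learning-analytics/triviaqa-t5-base"
MAX_INPUT_TOKENS = 25   # maximum input length stated in this card
MAX_OUTPUT_TOKENS = 10  # maximum output length stated in this card


def load_model(model_id=MODEL_ID):
    """Load the tokenizer and model (downloads weights on first use)."""
    # Imported here so the rest of the module works without transformers.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)
    model.eval()
    return model, tokenizer


def answer_question(question, model, tokenizer):
    """Generate a short closed-book answer for a single question."""
    # Questions longer than 25 tokens are truncated.
    inputs = tokenizer(
        question,
        max_length=MAX_INPUT_TOKENS,
        truncation=True,
        return_tensors="pt",
    )
    # max_length caps the length of the generated answer sequence.
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=MAX_OUTPUT_TOKENS,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    model, tokenizer = load_model()
    print(answer_question("Who directed the movie Jaws?", model, tokenizer))
```

Because no context passage is supplied, the answer depends entirely on what the model memorized during training, so outputs should be treated as best guesses.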

Core Capabilities

  • Closed-book question answering on trivia topics
  • Direct answer generation without external context
  • Short inputs (up to 25 tokens) and short outputs (up to 10 tokens) keep processing lightweight
  • Achieves 17% Exact Match and 24.5% Subset Match scores
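The two scores above can be illustrated with a small sketch. The card does not define either metric precisely, so this makes two assumptions: answers are normalized SQuAD-style (lowercased, punctuation and articles removed), and a "subset match" means the prediction's tokens are a subset of a reference answer's tokens or vice versa. The function names are hypothetical.

```python
import re
import string


def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, references):
    """True if the normalized prediction equals any normalized reference."""
    pred = normalize(prediction)
    return any(pred == normalize(ref) for ref in references)


def subset_match(prediction, references):
    """Assumed definition: token sets are nested in either direction."""
    pred_tokens = set(normalize(prediction).split())
    if not pred_tokens:
        return False
    for ref in references:
        ref_tokens = set(normalize(ref).split())
        if ref_tokens and (pred_tokens <= ref_tokens or ref_tokens <= pred_tokens):
            return True
    return False


def score(predictions, references_per_question, metric):
    """Percentage of questions for which the metric returns True."""
    hits = sum(
        metric(pred, refs)
        for pred, refs in zip(predictions, references_per_question)
    )
    return 100.0 * hits / len(predictions)
```

Under these assumptions, a prediction of "Spielberg" against the reference "Steven Spielberg" counts as a subset match but not an exact match, which is consistent with the subset score being the higher of the two.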

Frequently Asked Questions

Q: What makes this model unique?

What sets this model apart is that it answers trivia questions closed-book, without retrieving or reading external documents. This suits quick Q&A applications where context retrieval isn't feasible, at the cost of accuracy: the model reaches 17% Exact Match on TriviaQA.

Q: What are the recommended use cases?

The model is best suited for trivia applications, educational tools, and interactive question-answering systems where immediate responses are needed without the overhead of context processing.
