Triviaqa-t5-base
| Property | Value |
|---|---|
| Model Type | T5-base Question Answering |
| Training Dataset | TriviaQA |
| Exact Match Score | 17% |
| Subset Match Score | 24.5% |
What is triviaqa-t5-base?
Triviaqa-t5-base is a question-answering model built on the T5-base architecture and fine-tuned for closed-book trivia question answering. Developed by deep-learning-analytics, it answers trivia questions without access to external context, relying solely on the knowledge encoded in its parameters during training.
Implementation Details
The model was fine-tuned for 135 epochs with a batch size of 32 and a learning rate of 1e-3. It accepts input questions of up to 25 tokens and generates answers of at most 10 tokens. The base T5 model was pre-trained on C4 (the Colossal Clean Crawled Corpus, which is derived from Common Crawl) before being fine-tuned on TriviaQA.
- Built on T5-base architecture
- Trained for 135 epochs
- Batch size: 32
- Learning rate: 1e-3
- Maximum input length: 25 tokens
- Maximum output length: 10 tokens
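The length limits above map directly onto tokenizer and generation settings. The following is a minimal sketch of closed-book inference, assuming the checkpoint is published on the Hugging Face Hub as `deep-learning-analytics/triviaqa-t5-base` and that the `transformers` and `torch` packages are installed. The helper names (`clean_question`, `load`, `answer`) and the beam-search setting are illustrative assumptions, not taken from the original training code.

```python
MODEL_ID = "deep-learning-analytics/triviaqa-t5-base"  # assumed Hub identifier
MAX_INPUT_TOKENS = 25   # maximum input length stated above
MAX_OUTPUT_TOKENS = 10  # maximum output length stated above


def clean_question(question: str) -> str:
    """Collapse newlines and repeated spaces so the question is one line."""
    return " ".join(question.split())


def load(model_id: str = MODEL_ID):
    """Load tokenizer and model. Imports are deferred because they are heavy."""
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)
    model.eval()
    return tokenizer, model


def answer(question: str, tokenizer, model) -> str:
    """Generate a short answer for one question, with no supporting context."""
    import torch

    inputs = tokenizer(
        clean_question(question),
        max_length=MAX_INPUT_TOKENS,
        truncation=True,  # questions longer than the limit are cut off
        return_tensors="pt",
    )
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_length=MAX_OUTPUT_TOKENS,
            num_beams=2,  # assumption: a small beam, not a documented setting
            early_stopping=True,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example usage (downloads the checkpoint on first call):
#   tokenizer, model = load()
#   print(answer("Who directed the movie Jaws?", tokenizer, model))
```

Because the input contains only the question, no retrieval step or context passage is needed. Whether an answer is correct depends entirely on what the model memorized.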
Core Capabilities
- Closed-book question answering on trivia topics
- Direct answer generation without external context
- Lightweight inference, since inputs are capped at 25 tokens and outputs at 10 tokens
- Achieves 17% Exact Match and 24.5% Subset Match scores
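The two scores are not defined in this summary, so the sketch below relies on common conventions as assumptions: Exact Match compares SQuAD-style normalized strings, and Subset Match is taken to mean that every token of the prediction appears in a reference answer. The function names are hypothetical, and the original evaluation script may differ.

```python
import re
import string
from typing import Callable, Sequence


def normalize(text: str) -> str:
    """Lowercase, strip punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, references: Sequence[str]) -> bool:
    """True if the normalized prediction equals any normalized reference."""
    pred = normalize(prediction)
    return any(pred == normalize(ref) for ref in references)


def subset_match(prediction: str, references: Sequence[str]) -> bool:
    """True if all prediction tokens occur in some reference (assumed definition)."""
    pred_tokens = set(normalize(prediction).split())
    if not pred_tokens:
        return False  # an empty prediction should not count as a match
    return any(pred_tokens <= set(normalize(ref).split()) for ref in references)


def score(
    predictions: Sequence[str],
    references: Sequence[Sequence[str]],
    metric: Callable[[str, Sequence[str]], bool],
) -> float:
    """Percentage of examples for which `metric` holds."""
    hits = sum(metric(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)
```

Under these definitions every exact match is also a subset match, which is consistent with the Subset Match score being the higher of the two. For instance, a prediction of "Spielberg" against the reference "Steven Spielberg" counts toward Subset Match but not Exact Match.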
Frequently Asked Questions
Q: What makes this model unique?
The model answers trivia questions in a closed-book setting: it does not consult external documents at inference time. This makes it a fit for quick Q&A applications where context retrieval isn't feasible.
Q: What are the recommended use cases?
The model is best suited for trivia applications, educational tools, and interactive question-answering systems where immediate responses are needed without the overhead of context processing.