# SummLlama3-8B
| Property | Value |
|---|---|
| Parameter Count | 8.03B |
| Model Type | Summarization |
| Base Model | Meta-Llama-3-8B-Instruct |
| Tensor Type | BF16 |
| Research Paper | Link |
## What is SummLlama3-8B?
SummLlama3-8B is a specialized summarization model fine-tuned with Direct Preference Optimization (DPO) on more than 100K summarization feedback examples. Built on Llama3-8B-Instruct, it outperforms much larger models such as Llama3-70B-Instruct and GPT-4o on human-preferred summary quality while retaining faster inference.
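A minimal inference sketch is shown below. The Hugging Face repository id, prompt wording, and generation settings are illustrative assumptions rather than an official usage recipe.

```python
# Minimal inference sketch; the repo id and prompt are assumptions,
# not an official recipe from the model authors.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DISLab/SummLlama3-8B"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

document = "..."  # the text to summarize
messages = [
    {"role": "user",
     "content": f"Please summarize the following text:\n\n{document}"}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=False)
summary = tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(summary)
```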
## Implementation Details
The model is trained across seven distinct domains: four non-dialogue domains (News, Lifestyle, Report, Medical) and three dialogue domains (Daily Life, Interview, Meeting). Instead of expensive human feedback, it relies on high-quality, multi-dimensional feedback generated by large language models.
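To make the feedback pipeline concrete, the sketch below turns two judge-scored candidate summaries into a DPO preference record. The scoring dimensions follow the paper's faithfulness, completeness, and conciseness axes, but the dataclass, 0-1 scale, and aggregation are illustrative assumptions, not the authors' exact pipeline.

```python
# Illustrative sketch: converting multi-dimensional LLM feedback into a
# DPO preference pair. The 0-1 scale and unweighted mean are assumptions.
from dataclasses import dataclass


@dataclass
class Feedback:
    faithfulness: float   # 0-1: no hallucinated or distorted facts
    completeness: float   # 0-1: all key information is covered
    conciseness: float    # 0-1: no redundant or off-topic content

    def overall(self) -> float:
        # Unweighted mean; the real pipeline may weight or threshold dimensions.
        return (self.faithfulness + self.completeness + self.conciseness) / 3


def to_preference_pair(prompt: str,
                       summary_a: str, score_a: Feedback,
                       summary_b: str, score_b: Feedback) -> dict:
    """Pick the higher-scoring summary as 'chosen', the other as 'rejected'."""
    if score_a.overall() >= score_b.overall():
        chosen, rejected = summary_a, summary_b
    else:
        chosen, rejected = summary_b, summary_a
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}
```

Pairs produced this way feed directly into the DPO step summarized in the list below.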
- Training methodology: Direct Preference Optimization (DPO)
- Feedback dataset: 100K+ summarization examples
- Architecture: Based on Llama3-8B-Instruct with specialized training
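For orientation, here is a schematic DPO fine-tuning loop using the trl library. The dataset file, hyperparameters, and output paths are placeholders (preference records with prompt/chosen/rejected fields are assumed), not the authors' actual training configuration.

```python
# Schematic DPO fine-tuning sketch with trl; paths and hyperparameters
# are placeholders, not the authors' setup.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "meta-llama/Meta-Llama-3-8B-Instruct"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Preference pairs: each record holds a prompt plus a chosen (preferred)
# and rejected (dispreferred) summary, e.g. ranked by LLM feedback scores.
train_dataset = load_dataset("json", data_files="feedback_pairs.json", split="train")

config = DPOConfig(output_dir="summllama3-dpo", beta=0.1,
                   per_device_train_batch_size=1)
trainer = DPOTrainer(
    model=model,                 # a frozen reference copy is created internally
    args=config,
    train_dataset=train_dataset,
    processing_class=tokenizer,  # named `tokenizer=` in older trl versions
)
trainer.train()
```

DPO then raises the likelihood of the chosen summaries relative to the rejected ones against the frozen reference model, steering generation toward human-preferred outputs.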
## Core Capabilities
- Faithfulness: 0.980 (human evaluation) - summaries accurately represent information from the source
- Completeness: 0.697 (human evaluation) - summaries capture the key information
- Conciseness: 0.959 (human evaluation) - summaries stay focused and succinct
- Superior performance across multiple domains and text formats
- Faster inference compared to larger models
## Frequently Asked Questions
**Q: What makes this model unique?**
SummLlama3-8B stands out for its ability to generate human-preferred summaries while being significantly smaller than competitors. It achieves this through specialized training on multi-dimensional feedback and performs exceptionally well across various domains.
**Q: What are the recommended use cases?**
The model excels at summarizing both dialogue and non-dialogue content across seven domains. It's particularly effective for summarizing news articles, medical texts, interviews, meetings, and lifestyle content while maintaining high standards of faithfulness and conciseness.