CapybaraHermes-2.5-Mistral-7B
| Property | Value |
|---|---|
| Parameter Count | 7.24B |
| Base Model | OpenHermes-2.5-Mistral-7B |
| License | Apache 2.0 |
| Training Type | DPO Fine-tuning |
What is CapybaraHermes-2.5-Mistral-7B?
CapybaraHermes-2.5-Mistral-7B is a preference-tuned version of OpenHermes-2.5-Mistral-7B, trained with Direct Preference Optimization (DPO) on preference data generated with Argilla's distilabel framework. The tuning specifically targets multi-turn conversational quality, and the model shows measurable gains over its base model on multi-turn benchmarks such as MT-Bench.
Implementation Details
The model was preference-tuned with LoRA and TRL for three epochs on Argilla's dpo-mix-7k dataset. It excels particularly in multi-turn conversations, showing a notable improvement in MT-Bench second-turn scores compared to its base model; a sketch of this training setup follows.
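As a rough illustration of that setup (not the authors' actual training script), the sketch below combines TRL's DPOTrainer with a PEFT LoRA configuration. The hyperparameters, LoRA target modules, and learning rate are assumptions; only the base model, dataset, LoRA/TRL stack, and three epochs come from the description above.

```python
# Illustrative DPO fine-tuning sketch only: hyperparameters and LoRA settings
# are assumed, not taken from the authors' training script.
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "teknium/OpenHermes-2.5-Mistral-7B"  # base model per the card
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Assumes a recent TRL version that accepts conversational
# (chosen, rejected) preference pairs, as in argilla/dpo-mix-7k.
dataset = load_dataset("argilla/dpo-mix-7k", split="train")

# LoRA: train small low-rank adapters instead of the full 7B weights.
peft_config = LoraConfig(
    r=16,                         # assumed rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)

training_args = DPOConfig(
    output_dir="capybarahermes-dpo",
    num_train_epochs=3,               # three epochs, as stated above
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=5e-5,               # assumed; not reported in the card
    beta=0.1,                         # DPO temperature; assumed default
)

trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```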
- 85.45% accuracy on HellaSwag (10-shot)
- 63.13% accuracy on MMLU (5-shot)
- 59.29% accuracy on GSM8K (5-shot)
- 7.90 average score on MT-Bench
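Few-shot scores in this style are typically produced with EleutherAI's lm-evaluation-harness. The snippet below is a minimal sketch of such a run, assuming the model is published on the Hugging Face Hub as argilla/CapybaraHermes-2.5-Mistral-7B; the task name and few-shot count mirror the HellaSwag entry above.

```python
# Minimal sketch of reproducing a few-shot score with lm-evaluation-harness.
# The repo id "argilla/CapybaraHermes-2.5-Mistral-7B" is an assumption here.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=argilla/CapybaraHermes-2.5-Mistral-7B,dtype=bfloat16",
    tasks=["hellaswag"],  # one task at a time so num_fewshot matches the report
    num_fewshot=10,       # HellaSwag was reported 10-shot
    batch_size=8,
)
print(results["results"]["hellaswag"])
```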
Core Capabilities
- Enhanced multi-turn conversation handling
- Strong performance in reasoning tasks (ARC Challenge: 65.78%)
- Improved truthfulness (TruthfulQA: 56.91%)
- Robust common sense reasoning (Winogrande: 78.3%)
Frequently Asked Questions
Q: What makes this model unique?
The model's distinctive feature is its enhanced multi-turn conversation capabilities, achieved through preference tuning with the capybara-dpo dataset. It shows particular strength in maintaining context and coherence across multiple conversation turns.
Q: What are the recommended use cases?
This model is particularly well-suited for applications requiring extended dialogue interactions, complex reasoning tasks, and scenarios where maintaining conversation context is crucial. It performs well in both academic and practical applications, as evidenced by its strong benchmark scores.
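To make the multi-turn usage concrete, here is a minimal inference sketch using the transformers chat template. The Hub repo id argilla/CapybaraHermes-2.5-Mistral-7B is assumed, and the sampling settings are illustrative defaults rather than recommendations from the card.

```python
# Minimal multi-turn chat sketch; repo id and sampling settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "argilla/CapybaraHermes-2.5-Mistral-7B"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, device_map="auto"
)

# A short history where the second question depends on the first answer,
# which is exactly the multi-turn setting the model was tuned for.
messages = [
    {"role": "user", "content": "Give me two ideas for a weekend coding project."},
    {"role": "assistant", "content": "1) A CLI to-do app. 2) A tiny HTTP file server."},
    {"role": "user", "content": "Expand on the second idea: what would I need?"},
]

# The tokenizer's chat template renders the history in the model's expected format.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```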