CapybaraHermes-2.5-Mistral-7B

Property	Value
Parameter Count	7.24B
Base Model	OpenHermes-2.5-Mistral-7B
License	Apache 2.0
Training Type	DPO Fine-tuning

What is CapybaraHermes-2.5-Mistral-7B?

CapybaraHermes-2.5-Mistral-7B is an advanced language model that builds upon the OpenHermes-2.5-Mistral-7B architecture, enhanced through preference tuning using the innovative Distilabel technology. This model represents a significant advancement in multi-turn conversational capabilities, demonstrated by its impressive performance on various benchmarks.

Implementation Details

The model has been preference-tuned using LoRA and TRL for 3 epochs on argilla's dpo-mix-7k dataset. It excels particularly in multi-turn conversations, showing notable improvements in MTBench Second Turn scores compared to its base model.

Achieves 85.45% accuracy on HellaSwag (10-Shot)
63.13% accuracy on MMLU (5-Shot)
59.29% accuracy on GSM8k (5-shot)
Demonstrates strong performance in MTBench with an average score of 7.903125

Core Capabilities

Enhanced multi-turn conversation handling
Strong performance in reasoning tasks (ARC Challenge: 65.78%)
Improved truthfulness (TruthfulQA: 56.91%)
Robust common sense reasoning (Winogrande: 78.3%)

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its enhanced multi-turn conversation capabilities, achieved through preference tuning with the capybara-dpo dataset. It shows particular strength in maintaining context and coherence across multiple conversation turns.

Q: What are the recommended use cases?

This model is particularly well-suited for applications requiring extended dialogue interactions, complex reasoning tasks, and scenarios where maintaining conversation context is crucial. It performs well in both academic and practical applications, as evidenced by its strong benchmark scores.