CapybaraHermes-2.5-Mistral-7B-GPTQ
| Property | Value |
|---|---|
| Base Model | OpenHermes-2.5-Mistral-7B |
| Parameter Count | 7B |
| License | Apache 2.0 |
| Quantization | GPTQ (4-bit and 8-bit variants) |
What is CapybaraHermes-2.5-Mistral-7B-GPTQ?
CapybaraHermes is a GPTQ-quantized release of a preference-tuned language model based on OpenHermes-2.5-Mistral-7B. The model was fine-tuned using Direct Preference Optimization (DPO) on Argilla's dpo-mix-7k dataset, resulting in improved multi-turn conversation capabilities and strong benchmark performance.
Implementation Details
The model is available in multiple GPTQ quantization variants, including 4-bit and 8-bit versions with different group sizes. The main branch contains a 4-bit quantization with group size 128 and Act Order enabled, which balances VRAM efficiency against output quality.
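As a minimal loading sketch, the snippet below selects a quantization branch via the `revision` argument in Hugging Face Transformers (which can load GPTQ checkpoints when the optimum and auto-gptq packages are installed). The repo id `TheBloke/CapybaraHermes-2.5-Mistral-7B-GPTQ` and the non-main branch name are assumptions based on common naming for these releases.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face repo id for this release.
model_id = "TheBloke/CapybaraHermes-2.5-Mistral-7B-GPTQ"

# revision selects the quantization branch; "main" is the 4-bit,
# group size 128, Act Order variant described above.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    revision="main",    # e.g. "gptq-8bit-128g-actorder_True" for 8-bit (assumed branch name)
    device_map="auto",  # place layers on available GPU(s) automatically
)
tokenizer = AutoTokenizer.from_pretrained(model_id, revision="main")
```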
- Multiple quantization options (4-bit and 8-bit, with varying group sizes)
- Supports both Linux and Windows platforms
- Compatible with popular frameworks such as text-generation-webui and ExLlama
- Uses the ChatML prompt format (see the example after this list)
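For reference, ChatML wraps each turn in `<|im_start|>`/`<|im_end|>` tokens. A prompt for a single user turn looks like this (the message contents are illustrative):

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
What is GPTQ quantization?<|im_end|>
<|im_start|>assistant
```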
Core Capabilities
- Strong MTBench performance, with an average score of 7.903125
- Improved multi-turn conversations compared to the base model
- 43.8% on AGIEval and 73.35% on the GPT4All benchmark suite
- Efficient VRAM usage across the available quantization options
Frequently Asked Questions
Q: What makes this model unique?
The model stands out for its improved multi-turn conversation capabilities, achieved through preference tuning with DPO. It outperforms both its base model and Mistral-7B-Instruct-v0.2 in MTBench second-turn scores.
Q: What are the recommended use cases?
The model is well-suited for conversational AI applications, particularly those requiring multi-turn interactions. It's optimized for deployment on consumer hardware through various quantization options while maintaining strong performance.
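As an illustrative sketch of a multi-turn exchange, the following continues from the loading snippet above and assumes the repo's tokenizer config defines a ChatML chat template, as is typical for OpenHermes-family releases; the message contents are placeholders.

```python
# Continues from the loading snippet above; message contents are placeholders.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize GPTQ quantization in one sentence."},
    {"role": "assistant", "content": "GPTQ compresses model weights to low bit widths with minimal accuracy loss."},
    {"role": "user", "content": "How does group size affect it?"},  # second turn
]

# Render the conversation with the ChatML template and append the
# assistant header so the model produces the next reply.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```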