# CapybaraHermes-2.5-Mistral-7B-GGUF
| Property | Value |
|---|---|
| Parameter Count | 7.24B |
| Model Type | Mistral Architecture |
| License | Apache 2.0 |
| Language | English |
| Author | Argilla (Quantized by TheBloke) |
## What is CapybaraHermes-2.5-Mistral-7B-GGUF?
CapybaraHermes is a language model based on OpenHermes-2.5-Mistral-7B, fine-tuned using Direct Preference Optimization (DPO) on high-quality preference datasets. This GGUF version, quantized by TheBloke, offers several compression levels for efficient deployment while preserving most of the original model's performance.
## Implementation Details
The model comes in multiple quantization formats ranging from 2-bit to 8-bit precision, offering different trade-offs between file size and output quality. The recommended Q4_K_M variant strikes a balance at a 4.37GB file size.
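As a rough sanity check, the file sizes quoted in this card can be approximated from the parameter count and each format's effective bits per weight. A minimal sketch, where the bits-per-weight figures are approximations (k-quants mix block scales and metadata, so exact sizes differ slightly):

```python
def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate GGUF file size in GB from parameter count and
    effective bits per weight (ignores metadata and embedding overhead)."""
    return n_params * bits_per_weight / 8 / 1e9

N_PARAMS = 7.24e9  # CapybaraHermes-2.5-Mistral-7B

# Approximate effective bits per weight per format (assumed values):
for name, bpw in [("Q2_K", 3.0), ("Q4_K_M", 4.85), ("Q8_0", 8.5)]:
    print(f"{name}: ~{gguf_size_gb(N_PARAMS, bpw):.2f} GB")
```

The estimates land within a few hundredths of a gigabyte of the published sizes (2.72GB, 4.37GB, 7.70GB), which is a quick way to predict whether a given quant fits in available RAM or VRAM.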
- Implements ChatML prompt format for consistent dialogue handling
- Supports context lengths up to 32K tokens
- Multiple quantization options from Q2_K (2.72GB) to Q8_0 (7.70GB)
- GPU layer offloading capability for improved performance
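Because the model expects ChatML, prompts must be wrapped in `<|im_start|>`/`<|im_end|>` delimiters with an open assistant turn at the end. A minimal formatter sketch (the helper name and message shape are illustrative, not part of any official API):

```python
def to_chatml(messages):
    """Render a list of {'role', 'content'} dicts in ChatML format,
    ending with an open assistant turn for the model to complete."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize GGUF in one sentence."},
])
print(prompt)
```

The resulting string can then be passed as the prompt to a llama.cpp-based runtime, which handles the quantized weights and GPU layer offloading.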
## Core Capabilities
- Strong performance in multi-turn conversations (MT-Bench average: 7.903125)
- Enhanced truthfulness compared to the base model (TruthfulQA: 57.07)
- Improved performance on general knowledge tasks (AGIEval: 43.8)
- Excels in follow-up interactions (MT-Bench second turn: 7.5625)
## Frequently Asked Questions
**Q: What makes this model unique?**
The model stands out for its exceptional performance in multi-turn conversations and its versatility in deployment options through various quantization levels. It's particularly notable for improving upon the base model's capabilities in follow-up interactions.
**Q: What are the recommended use cases?**
This model is well-suited for chatbots, conversational AI applications, and general text generation tasks. The various quantization options make it adaptable for different hardware configurations, from resource-constrained environments to high-performance systems.