# OpenHermes-2.5-Mistral-7B-16k-GGUF
| Property | Value |
|---|---|
| Parameter Count | 7.24B |
| Context Length | 16k tokens |
| License | Apache 2.0 |
| Base Architecture | Mistral-7B |
## What is OpenHermes-2.5-Mistral-7B-16k-GGUF?

OpenHermes 2.5 is a fine-tuned version of the Mistral-7B model with an extended 16k context window, optimized for both code and general-purpose tasks. This GGUF release, quantized by TheBloke, offers a range of compression levels for efficient local deployment while preserving most of the original model's performance.
## Implementation Details
The model utilizes the ChatML format for interactions and supports system prompts for consistent behavior across chat sessions. It was trained on 1,000,000 entries of primarily GPT-4 generated data and carefully curated open datasets.
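The ChatML turn structure mentioned above can be sketched as a small prompt builder. The system and user strings here are illustrative placeholders, not part of the model card:

```python
def build_chatml_prompt(system: str, messages: list[tuple[str, str]]) -> str:
    """Assemble a ChatML prompt: each turn is wrapped in
    <|im_start|>{role}\n{content}<|im_end|> markers, and the prompt
    ends with an open assistant turn for the model to complete."""
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    for role, content in messages:
        parts.append(f"<|im_start|>{role}\n{content}<|im_end|>")
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

# Hypothetical example conversation:
prompt = build_chatml_prompt(
    "You are a helpful coding assistant.",
    [("user", "Write a function that reverses a string.")],
)
print(prompt)
```

Keeping the same system prompt across sessions is what gives the consistent behavior the card describes, since the system turn is always the first block in the prompt.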
- Multiple quantization options from 2-bit to 8-bit (Q2_K to Q8_0)
- Extended context length of 16k tokens
- Supports GPU acceleration with layer offloading
- Compatible with llama.cpp and various UI interfaces
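As a rough illustration of what the quantization options mean for file size, the weight payload can be estimated as bits per weight times parameter count. The bit widths below are nominal approximations (K-quants mix precisions internally, and real GGUF files add metadata), so actual file sizes differ:

```python
PARAMS = 7.24e9  # parameter count from the table above

def approx_gguf_size_gb(bits_per_weight: float, params: float = PARAMS) -> float:
    """Back-of-envelope size of the quantized weights in gigabytes:
    params * bits / (8 bits per byte) / 1e9 bytes per GB."""
    return params * bits_per_weight / 8 / 1e9

# Approximate effective bits per weight for a few quant levels:
for name, bits in [("Q2_K", 2.6), ("Q4_K_M", 4.8), ("Q8_0", 8.5)]:
    print(f"{name}: ~{approx_gguf_size_gb(bits):.1f} GB")
```

This is why the 2-bit variants fit on modest hardware while Q8_0 stays close to full-precision quality at several times the size.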
## Core Capabilities
- Strong code generation with 50.7% HumanEval Pass@1
- Enhanced performance on TruthfulQA (53.04%)
- Improved AGIEval scores (43.07%)
- Robust GPT4All benchmark performance (73.12%)
## Frequently Asked Questions

**Q: What makes this model unique?**
The model combines extended context length, efficient quantization options, and balanced performance across both code and general tasks. It shows significant improvements over its predecessors while maintaining a relatively small parameter count.
**Q: What are the recommended use cases?**
The model excels at code generation, general question-answering, and complex reasoning. It is particularly suitable for applications that need long-context understanding, such as technical discussions over extended documents or codebases.