OpenHermes-2.5-Mistral-7B-16k-GGUF

Maintained By
TheBloke

Property            Value
Parameter Count     7.24B
Context Length      16,000 tokens
License             Apache 2.0
Base Architecture   Mistral-7B

What is OpenHermes-2.5-Mistral-7B-16k-GGUF?

OpenHermes 2.5 is a fine-tuned version of the Mistral-7B model with an extended context length, optimized for both code and general tasks. This GGUF release, quantized by TheBloke, offers a range of compression levels for efficient deployment while preserving most of the original model's quality.

Implementation Details

The model uses the ChatML format for interactions and supports system prompts for consistent behavior across chat sessions. It was trained on 1,000,000 entries of primarily GPT-4-generated data alongside carefully curated open datasets.

  • Multiple quantization options from 2-bit to 8-bit (Q2_K to Q8_0)
  • Extended context length of 16k tokens
  • Supports GPU acceleration with layer offloading
  • Compatible with llama.cpp and various UI interfaces
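Since the model expects ChatML, prompts must wrap the system and user turns in ChatML delimiters before being passed to a llama.cpp-compatible runtime. The sketch below shows one way to assemble such a prompt string; the helper name and the example system/user text are illustrative, not taken from the model card.

```python
# Minimal sketch of ChatML prompt assembly for this model.
# build_chatml_prompt is a hypothetical helper, not part of any library.

def build_chatml_prompt(system: str, user: str) -> str:
    """Wrap a system prompt and a user message in ChatML delimiters,
    leaving the assistant turn open for the model to complete."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are a helpful assistant.",
    "Write a Python function that reverses a string.",
)
print(prompt)
```

The resulting string can be passed as the prompt to llama.cpp or a compatible UI; the open `<|im_start|>assistant` turn signals the model to generate the assistant's reply.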

Core Capabilities

  • Strong code generation with 50.7% HumanEval Pass@1
  • Enhanced performance on TruthfulQA (53.04%)
  • Improved AGIEval scores (43.07%)
  • Robust GPT4All benchmark performance (73.12%)

Frequently Asked Questions

Q: What makes this model unique?

The model combines extended context length, efficient quantization options, and balanced performance across both code and general tasks. It shows significant improvements over its predecessors while maintaining a relatively small parameter count.

Q: What are the recommended use cases?

The model excels in code generation, general question-answering, and complex reasoning tasks. It's particularly suitable for applications requiring extended context understanding and technical discussions.
