OpenHermes-2-Mistral-7B

teknium

OpenHermes-2-Mistral-7B is a state-of-the-art Mistral-7B fine-tune trained on 900k GPT-4 entries, using ChatML format and achieving superior benchmark performance.

Property	Value
Base Model	Mistral-7B-v0.1
License	Apache 2.0
Training Data	900,000 GPT-4 generated entries
Format	ChatML

What is OpenHermes-2-Mistral-7B?

OpenHermes-2-Mistral-7B is a sophisticated fine-tuned language model built on the Mistral-7B architecture. Developed by Teknium, it represents a significant advancement in conversational AI, trained on an extensive dataset of 900,000 entries primarily generated by GPT-4. The model employs the ChatML format, enabling structured multi-turn dialogues with system-level instruction capabilities.

Implementation Details

The model utilizes extensive filtering of public datasets and converts all formats to ShareGPT, which is then transformed to use ChatML. This implementation allows for OpenAI endpoint compatibility and familiar interaction patterns for those experienced with ChatGPT API.

Implements ChatML prompt format for structured dialogue
Supports system-level instructions for consistent behavior
Compatible with OpenAI endpoint specifications
Available in multiple quantized versions (GPTQ, GGUF, AWQ)

Core Capabilities

Outperforms previous Nous & Hermes models in benchmarks
Achieves 72.68% on GPT4All benchmark
Demonstrates strong performance in reasoning tasks
Excels in multi-turn conversations with context retention
Supports role-playing and creative writing scenarios

Frequently Asked Questions

Q: What makes this model unique?

OpenHermes-2 stands out due to its implementation of ChatML format, extensive training on GPT-4 generated data, and superior benchmark performance compared to other Mistral-7B variants. It shows particular strength in maintaining context and following system-level instructions.

Q: What are the recommended use cases?

The model excels in conversational AI applications, programming assistance, creative writing, role-playing scenarios, and complex reasoning tasks. It's particularly well-suited for applications requiring structured dialogue and consistent personality across interactions.