Hermes-2-Theta-Llama-3-8B-GGUF
| Property | Value |
|---|---|
| Parameter Count | 8.03B |
| Model Type | GGUF Quantized Language Model |
| Architecture | Merged Llama-3 + Hermes 2 Pro |
| Author | NousResearch |
| Primary Use | Instruction Following & Function Calling |
What is Hermes-2-Theta-Llama-3-8B-GGUF?
Hermes-2-Theta is an experimental merged model combining NousResearch's Hermes 2 Pro with Meta's Llama-3 Instruct model. This GGUF build enables efficient local deployment with reduced memory requirements while preserving strong benchmark performance.
Implementation Details
The model uses the ChatML format for structured dialogue and supports advanced features such as function calling and JSON-mode outputs. With 4-bit quantization it can run in as little as 5 GB of VRAM.
- Achieves 72.59 average score on GPT4All benchmarks
- MT-Bench average score of 8.19
- Supports both regular chat and specialized function calling modes
- Supports Flash Attention 2 for improved inference performance
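The 5 GB figure above is consistent with a back-of-envelope estimate. As a sketch (the 4.5 bits-per-weight average is an assumption for block-quantized 4-bit GGUF schemes such as Q4_K_M, which store per-block scales alongside the weights):

```python
# Rough VRAM estimate for the quantized weights of an 8.03B-parameter model.
# BITS_PER_WEIGHT is an assumed average for a 4-bit GGUF scheme including
# block scale/metadata overhead -- not an exact figure from the spec.
PARAMS = 8.03e9
BITS_PER_WEIGHT = 4.5

weights_gb = PARAMS * BITS_PER_WEIGHT / 8 / 1e9
print(f"quantized weights: ~{weights_gb:.1f} GB")  # ~4.5 GB
```

The remaining headroom up to ~5 GB goes to the KV cache and runtime buffers, which grow with context length.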
Core Capabilities
- Advanced instruction following with ChatML format
- Structured function calling with JSON responses
- High-performance reasoning and task completion
- Efficient memory usage through GGUF quantization
- Multi-turn dialogue capabilities
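The ChatML format the capabilities above rely on can be sketched as a simple prompt builder. This is a minimal illustration of the `<|im_start|>`/`<|im_end|>` turn structure, not the model's full chat template; the role names (`system`, `user`, `assistant`) follow the standard ChatML convention:

```python
# Minimal ChatML prompt builder (sketch of the turn structure only).
def build_chatml(messages):
    """messages: list of (role, content) tuples."""
    parts = [f"<|im_start|>{role}\n{content}<|im_end|>" for role, content in messages]
    parts.append("<|im_start|>assistant")  # open an assistant turn to cue generation
    return "\n".join(parts)

prompt = build_chatml([
    ("system", "You are a helpful assistant."),
    ("user", "What is GGUF quantization?"),
])
print(prompt)
```

For multi-turn dialogue, prior assistant replies are appended as additional `("assistant", ...)` turns before re-prompting.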
Frequently Asked Questions
Q: What makes this model unique?
The model uniquely combines the strengths of Hermes 2 Pro and Llama-3 through an innovative merging process, followed by additional RLHF training. It offers exceptional performance in both general conversation and specialized tasks like function calling.
Q: What are the recommended use cases?
The model excels in instruction following, structured outputs via JSON mode, function calling applications, and general conversational tasks. It's particularly suitable for applications requiring both strong reasoning and efficient resource usage.
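For function-calling applications, the model's structured output must be parsed back into a call your application can dispatch. The sketch below assumes the `<tool_call>` tag convention used by Hermes-family models; verify the exact format against the model's actual chat template before relying on it, and note that `get_weather` is a hypothetical tool name:

```python
import json
import re

# Extract a structured function call from model output (sketch).
# Assumes the Hermes-style <tool_call>{...}</tool_call> wrapping.
def parse_tool_call(text):
    match = re.search(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", text, re.DOTALL)
    return json.loads(match.group(1)) if match else None

# Hypothetical model output for illustration.
sample = '<tool_call>{"name": "get_weather", "arguments": {"city": "Paris"}}</tool_call>'
call = parse_tool_call(sample)
print(call["name"], call["arguments"])  # get_weather {'city': 'Paris'}
```

A robust integration would validate the parsed arguments against the tool's JSON schema before executing the call.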