Nous-Hermes-13b

NousResearch

A 13B-parameter LLaMA model fine-tuned on more than 300,000 instructions. Its developers report performance that rivals GPT-3.5-turbo, with long responses and a low hallucination rate.

Property	Value
Base Model	LLaMA 13B
License	GPL
Training Data	300,000+ instructions
Training Duration	50+ hours
Hardware Used	8x A100 80GB DGX

What is Nous-Hermes-13b?

Nous-Hermes-13b is a language model developed collaboratively by Nous Research, Teknium, and Karan4D. It is built on the LLaMA 13B architecture and fine-tuned on more than 300,000 curated instructions. The model is reported to perform comparably to GPT-3.5-turbo across various benchmarks.

Implementation Details

The model was fine-tuned largely on synthetic GPT-4 outputs drawn from diverse sources, including GPTeacher, roleplay datasets, code instruct datasets, and various specialized instruction sets. Training used a sequence length of 2,000 tokens and ran for more than 50 hours on an 8x A100 80GB DGX system.

Trained primarily on GPT-4 generated content
Incorporates specialized datasets for biology, physics, chemistry, and mathematics
Uses Alpaca prompt format for interaction
Available in FP16 format, with GGML and GPTQ 4-bit quantizations planned
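
Because the model expects Alpaca-style prompts, a request has to be wrapped in section headers before it is sent for generation. The following is a minimal sketch of such a wrapper, not the authors' own tooling. The function name `build_alpaca_prompt` and its `context` parameter are illustrative assumptions, and the exact whitespace (as well as the short preamble sentence that some Alpaca variants prepend) should be checked against the official model card.

```python
from typing import Optional


def build_alpaca_prompt(instruction: str, context: Optional[str] = None) -> str:
    """Return an Alpaca-style prompt; `context` fills the optional Input section.

    Assumption: headers-only layout without a preamble sentence.
    """
    parts = [f"### Instruction:\n{instruction.strip()}"]
    if context and context.strip():
        # The Input section is only included when extra context is supplied.
        parts.append(f"### Input:\n{context.strip()}")
    # The prompt ends with an empty Response header; the model continues from here.
    parts.append("### Response:\n")
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_alpaca_prompt("Summarize the text.",
                              "LLaMA is a family of language models."))
```

The returned string can be passed to whichever inference backend hosts the FP16 or quantized weights; the text the model produces after the final header is its answer.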

Core Capabilities

Strong results on the ARC-c, ARC-e, and HellaSwag benchmarks
Long-form response generation with low hallucination rates
Robust instruction following capabilities
Advanced code generation and scientific reasoning

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for generating long, coherent responses while keeping its hallucination rate low. It runs without OpenAI's censorship mechanisms and has achieved top rankings in multiple benchmark categories.

Q: What are the recommended use cases?

The model is suitable for a wide range of applications, including creative text generation, complex instruction following, code generation, and scientific reasoning tasks. It can be deployed in chatbots, Discord bots, and other text generation applications.
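
For chatbot-style deployments, the bot typically wraps each user message in a prompt, calls the model, and strips everything except the reply. The sketch below illustrates that flow under stated assumptions: `answer`, `extract_reply`, and `echo_backend` are hypothetical names, the headers-only Alpaca layout is assumed, and `echo_backend` is a stand-in that returns canned text instead of calling a real model.

```python
from typing import Callable

RESPONSE_HEADER = "### Response:"


def extract_reply(generated: str) -> str:
    """Keep the text after the last Response header, cut at any new Instruction header."""
    reply = generated.rsplit(RESPONSE_HEADER, 1)[-1]
    # Models sometimes run on and start a new turn; drop anything from that point.
    reply = reply.split("### Instruction:", 1)[0]
    return reply.strip()


def answer(message: str, generate: Callable[[str], str]) -> str:
    """Wrap a user message in an Alpaca-style prompt and return the cleaned reply."""
    prompt = f"### Instruction:\n{message.strip()}\n\n{RESPONSE_HEADER}\n"
    return extract_reply(generate(prompt))


def echo_backend(prompt: str) -> str:
    """Stand-in for a real model call (assumption): echoes the prompt plus canned text."""
    return prompt + "Hello from the stub.\n\n### Instruction:\nignored"


if __name__ == "__main__":
    print(answer("Say hello.", echo_backend))
```

Passing the backend as a callable keeps the bot logic independent of how the model is served, so the stub can be swapped for a real inference call without changing `answer`.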