Hermes-3-Llama-3.1-70B-FP8

Hermes-3-Llama-3.1-70B-FP8

NousResearch

Hermes-3 70B FP8 is a highly capable LLM using ChatML format, featuring function calling, JSON mode & advanced reasoning. Built on Llama 3.1.

PropertyValue
Parameter Count70.6B
Model TypeFP8 Quantized Language Model
Base ModelMeta-Llama-3.1-70B
LicenseLlama3
PaperTechnical Report

What is Hermes-3-Llama-3.1-70B-FP8?

Hermes-3-Llama-3.1-70B-FP8 is a NeuralMagic FP8 quantized version of the flagship Hermes 3 language model, specifically optimized for use with vLLM. This model represents the latest iteration in the Hermes series, featuring advanced capabilities in reasoning, roleplaying, and multi-turn conversations.

Implementation Details

The model uses ChatML as its prompt format, enabling structured multi-turn dialogue and system-level instructions. It supports both function calling and JSON mode for structured outputs, making it highly versatile for various applications.

  • Quantization: FP8 (E4M3) format for optimal performance
  • Architecture: Based on Llama 3.1 70B foundation model
  • Framework Compatibility: Optimized for vLLM deployment

Core Capabilities

  • Advanced agentic capabilities and improved reasoning
  • Enhanced roleplaying and multi-turn conversation handling
  • Powerful function calling with structured output capabilities
  • Long context coherence and improved code generation
  • User-aligned responses with strong steering capabilities

Frequently Asked Questions

Q: What makes this model unique?

This model stands out due to its FP8 quantization while maintaining high performance, competitive with Llama-3.1 Instruct models. It offers advanced function calling capabilities and structured output formats, making it particularly suitable for practical applications requiring precise control and structured responses.

Q: What are the recommended use cases?

The model excels in scenarios requiring structured dialogue, function calling, JSON outputs, and complex reasoning tasks. It's particularly well-suited for applications needing multi-turn conversations, roleplaying scenarios, and tasks requiring detailed technical responses or code generation.

Socials
PromptLayer
Company
All services online
Location IconPromptLayer is located in the heart of New York City
PromptLayer © 2026