hallucination_evaluation_model

Maintained By
vectara

Hallucination Evaluation Model (HHEM-2.1-Open)

Property       Value
Parameters     110M
License        Apache 2.0
Base Model     google/flan-t5-base
Paper          RAGTruth Paper
Tensor Type    F32

What is hallucination_evaluation_model?

HHEM-2.1-Open is a model developed by Vectara for detecting hallucinations in large language model (LLM) outputs, particularly in Retrieval-Augmented Generation (RAG) applications. It evaluates the factual consistency of generated content against source material, producing a score between 0 and 1 for each premise-hypothesis pair: scores close to 1 indicate factual consistency, while scores close to 0 indicate likely hallucination.
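For concreteness, here is a minimal scoring sketch following the loading pattern shown on the model's Hugging Face page; the example pairs are illustrative, not taken from this page:

```python
from transformers import AutoModelForSequenceClassification

# Each pair is (premise = source text, hypothesis = generated text).
pairs = [
    ("The capital of France is Paris.", "Paris is the capital of France."),
    ("The capital of France is Paris.", "The capital of France is Berlin."),
]

# trust_remote_code=True loads the model's custom scoring head and its
# predict() helper, which accepts raw (premise, hypothesis) string pairs.
model = AutoModelForSequenceClassification.from_pretrained(
    "vectara/hallucination_evaluation_model", trust_remote_code=True
)

scores = model.predict(pairs)  # one consistency score in [0, 1] per pair
print(scores)
```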

Implementation Details

Built on the FLAN-T5 architecture, HHEM-2.1-Open processes pairs of premise and hypothesis texts to determine factual consistency. The model requires less than 600 MB of RAM and processes a 2k-token input in roughly 1.5 seconds on standard CPU hardware (a quick way to check this on your own machine is sketched after the list below).

  • Unlimited context length capability
  • Outperforms GPT-3.5-Turbo and GPT-4 on hallucination detection benchmarks
  • Efficient resource utilization for production deployment
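A small timing run like the one below can sanity-check those latency figures on your own hardware. The premise text and its length are invented for illustration, and real numbers will depend on your CPU:

```python
import time
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "vectara/hallucination_evaluation_model", trust_remote_code=True
)

# A long, repetitive premise standing in for a source document on the
# order of 2k tokens (purely illustrative; use your own documents).
premise = "The quarterly report covers revenue, costs, and headcount. " * 170
hypothesis = "The report discusses revenue and staffing levels."

start = time.perf_counter()
score = model.predict([(premise, hypothesis)])
elapsed = time.perf_counter() - start
print(f"score={float(score[0]):.3f}, latency={elapsed:.2f}s")
```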

Core Capabilities

  • Binary classification of hallucinated vs. consistent content (a thresholding sketch follows this list)
  • Asymmetric evaluation of premise-hypothesis pairs
  • High performance on RAGTruth-Summ (64.42% balanced accuracy) and RAGTruth-QA (74.28% balanced accuracy)
  • Efficient processing of long-form content
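The model's output is a continuous score, so the binary hallucinated-vs-consistent decision above requires choosing a cutoff. A 0.5 threshold is a common convention, but it is an assumption of this sketch rather than something this page specifies; tune it on your own validation data:

```python
def label_outputs(scores, threshold=0.5):
    """Map HHEM consistency scores to binary labels.

    The 0.5 cutoff is an assumed default, not an official recommendation;
    adjust it to trade precision against recall for your application.
    """
    return ["consistent" if float(s) >= threshold else "hallucinated" for s in scores]

# Usage (with `scores` from model.predict(pairs) as in the earlier sketch):
# labels = label_outputs(scores)
```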

Frequently Asked Questions

Q: What makes this model unique?

HHEM-2.1-Open stands out for handling inputs of effectively unlimited length, compared with its predecessor's 512-token limit, while outperforming much larger models such as GPT-4 on hallucination detection tasks.

Q: What are the recommended use cases?

The model is particularly suited to RAG applications where the factual consistency of LLM-generated summaries or answers must be verified against source documents. It is also a good fit for production environments where both resource efficiency and high accuracy matter.
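In a RAG pipeline, one straightforward pattern is to treat the retrieved passages as the premise and the LLM's answer as the hypothesis. Concatenating the passages into a single premise is an assumption of this sketch, not something this page prescribes; scoring the answer against each passage separately and aggregating is an equally valid choice:

```python
def check_rag_answer(model, retrieved_passages, llm_answer):
    """Score an LLM answer against its retrieved sources with HHEM.

    `model` is the HHEM model loaded earlier via
    AutoModelForSequenceClassification.from_pretrained(..., trust_remote_code=True).
    """
    premise = "\n\n".join(retrieved_passages)  # assumed convention: join all sources
    score = model.predict([(premise, llm_answer)])
    return float(score[0])  # closer to 1.0 means the answer is grounded in the sources
```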
