Hallucination Evaluation Model (HHEM-2.1-Open)
| Property | Value |
|---|---|
| Parameters | 110M |
| License | Apache 2.0 |
| Base Model | google/flan-t5-base |
| Paper | RAGTruth Paper |
| Tensor Type | F32 |
What is hallucination_evaluation_model?
HHEM-2.1-Open is a model designed by Vectara for detecting hallucinations in large language model (LLM) outputs, particularly in Retrieval-Augmented Generation (RAG) applications. It evaluates the factual consistency between generated content and source materials, producing a score between 0 and 1, where scores closer to 0 indicate hallucination and scores closer to 1 indicate factual consistency.
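The premise-hypothesis scoring above can be sketched as a small batch helper. The scorer is injected as a callable so the example is self-contained and runs without downloading the model; the `predict` call mentioned in the comment reflects the Hugging Face card's custom loading path, but treat it as an assumption rather than a guaranteed API.

```python
from typing import Callable, List, Tuple

# A pair is (premise, hypothesis): the source text first, the generated text second.
Pair = Tuple[str, str]

def score_pairs(pairs: List[Pair], scorer: Callable[[List[Pair]], List[float]]) -> List[float]:
    """Run a batch of premise-hypothesis pairs through a scorer that returns
    factual-consistency scores in [0, 1] (higher = more consistent)."""
    return scorer(pairs)

# In production the scorer would wrap the real model, e.g. (assumption, per the
# model card) AutoModelForSequenceClassification.from_pretrained(
#     "vectara/hallucination_evaluation_model", trust_remote_code=True).predict.
# Here a trivial stand-in scorer keeps the sketch runnable offline.
def dummy_scorer(pairs: List[Pair]) -> List[float]:
    return [0.9 if hyp in prem else 0.2 for prem, hyp in pairs]

pairs = [
    ("Paris is the capital of France.", "Paris is the capital of France."),
    ("Paris is the capital of France.", "Berlin is the capital of France."),
]
print(score_pairs(pairs, dummy_scorer))  # [0.9, 0.2]
```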
Implementation Details
Built on the FLAN-T5 architecture, HHEM-2.1-Open processes pairs of premise and hypothesis texts to determine factual consistency. The model requires less than 600 MB of RAM and can score a 2k-token input in approximately 1.5 seconds on standard CPU hardware.
- Unlimited context length capability
- Outperforms GPT-3.5-Turbo and GPT-4 on benchmark datasets
- Efficient resource utilization for production deployment
Core Capabilities
- Binary classification of hallucinated vs. consistent content
- Asymmetric evaluation of premise-hypothesis pairs (the source text must be the premise, the generated text the hypothesis; swapping them changes the score)
- High performance on RAGTruth-Summ (64.42% balanced accuracy) and RAGTruth-QA (74.28% balanced accuracy)
- Efficient processing of long-form content
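The binary classification listed above can be derived from the consistency score with a simple cutoff. A minimal sketch, assuming a hand-picked 0.5 threshold (the model card does not mandate a specific value, so tune it for your application):

```python
from typing import List

def label_scores(scores: List[float], threshold: float = 0.5) -> List[str]:
    """Map consistency scores in [0, 1] to binary labels.
    The 0.5 default threshold is an illustrative assumption, not a documented default."""
    return ["consistent" if s >= threshold else "hallucinated" for s in scores]

print(label_scores([0.92, 0.31, 0.55]))  # ['consistent', 'hallucinated', 'consistent']
```

Raising the threshold trades recall of consistent outputs for stricter hallucination filtering, which is often the right trade-off when flagged outputs are routed to a human reviewer.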
Frequently Asked Questions
Q: What makes this model unique?
HHEM-2.1-Open stands out for its ability to process inputs of unlimited length, unlike its predecessor's 512-token limit, while outperforming much larger models such as GPT-4 on hallucination detection benchmarks.
Q: What are the recommended use cases?
The model is particularly suited for RAG applications where verifying the factual consistency of LLM-generated summaries against source documents is crucial. It is well suited to production environments that demand both resource efficiency and high accuracy.