lettucedect-large-modernbert-en-v1

Maintained By
KRLabsOrg

Property          Value
Organization      KRLabsOrg
Architecture      ModernBERT Large
Task              Hallucination Detection
Context Length    8192 tokens
Language          English
Paper             arXiv:2502.17125

What is lettucedect-large-modernbert-en-v1?

LettuceDetect is a transformer-based model designed for hallucination detection in Retrieval-Augmented Generation (RAG) applications. Built on the ModernBERT Large architecture, it analyzes context-answer pairs to identify parts of a generated answer that the context does not support. It reports an example-level F1 score of 79.22%, which is higher than prompt-based approaches including GPT-4 and competitive with state-of-the-art models.
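For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The sketch below computes it for answer-level labels, where `True` means "this answer contains a hallucination". The function name and the toy labels are invented for illustration and are not taken from the model's evaluation data.

```python
def f1_score(gold, predicted):
    """Binary F1, where True means 'the answer contains a hallucination'."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)        # hallucination correctly flagged
    fp = sum(1 for g, p in pairs if not g and p)    # clean answer wrongly flagged
    fn = sum(1 for g, p in pairs if g and not p)    # hallucination missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy labels, invented for illustration only.
gold = [True, True, False, True, False]
predicted = [True, False, False, True, True]
score = f1_score(gold, predicted)
print(f"F1 = {score:.4f}")
```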

Implementation Details

The model leverages ModernBERT's extended context support of up to 8192 tokens, making it particularly effective for processing lengthy documents. It performs token-level classification to identify spans of text that aren't supported by the provided context, offering granular hallucination detection capabilities.

  • Token-level classification architecture for precise hallucination detection
  • Extended context support up to 8192 tokens
  • Trained on the RAGTruth dataset
  • Outperforms prompt-based methods and many encoder-based models
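The step from token-level labels to hallucinated spans can be sketched as follows. This is a minimal illustration, not the library's implementation: `TokenPrediction` and `merge_tokens_into_spans` are hypothetical names, whitespace-separated words stand in for the model's subword tokens, and taking the maximum token probability as the span confidence is an assumption.

```python
import re
from dataclasses import dataclass


@dataclass
class TokenPrediction:
    start: int    # character offset where the token begins in the answer
    end: int      # character offset one past the token's last character
    label: int    # 1 = not supported by the context, 0 = supported
    prob: float   # predicted probability of label 1


def merge_tokens_into_spans(answer, tokens):
    """Merge runs of consecutive unsupported tokens into character spans."""
    runs = []
    current = None  # [start, end, list of token probabilities]
    for tok in tokens:
        if tok.label == 1:
            if current is None:
                current = [tok.start, tok.end, [tok.prob]]
            else:
                current[1] = tok.end
                current[2].append(tok.prob)
        elif current is not None:
            runs.append(current)
            current = None
    if current is not None:
        runs.append(current)
    # Assumption: span confidence is the highest token probability in the run.
    return [
        {"start": s, "end": e, "text": answer[s:e], "confidence": max(probs)}
        for s, e, probs in runs
    ]


# Toy input: whitespace "tokens" with invented probabilities.
answer = "The capital of France is Paris. The population of France is 69 million."
flagged = {11: 0.91, 12: 0.97}  # word indices marked as unsupported
tokens = [
    TokenPrediction(m.start(), m.end(), int(i in flagged), flagged.get(i, 0.02))
    for i, m in enumerate(re.finditer(r"\S+", answer))
]
spans = merge_tokens_into_spans(answer, tokens)
print(spans)
```

Working at character offsets rather than token indices keeps the output independent of the tokenizer, so the spans can be mapped straight back onto the answer text.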

Core Capabilities

  • Span-level hallucination detection with confidence scores
  • Processing of extensive context-answer pairs
  • Integration-ready for RAG applications
  • Support for multiple context documents
  • Python API for easy implementation
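In a RAG pipeline, span-level output with confidence scores is typically post-processed before it reaches a user. The sketch below assumes each span is a dictionary with `start`, `end`, and `confidence` keys; that format, the `highlight_spans` helper, and the 0.5 threshold are assumptions made for illustration rather than the documented API.

```python
def highlight_spans(answer, spans, min_confidence=0.5,
                    open_mark="[[", close_mark="]]"):
    """Wrap sufficiently confident hallucinated spans in visible markers."""
    kept = sorted(
        (s for s in spans if s["confidence"] >= min_confidence),
        key=lambda s: s["start"],
    )
    parts = []
    cursor = 0
    for span in kept:
        if span["start"] < cursor:
            continue  # skip spans that overlap one already emitted
        parts.append(answer[cursor:span["start"]])
        parts.append(open_mark + answer[span["start"]:span["end"]] + close_mark)
        cursor = span["end"]
    parts.append(answer[cursor:])
    return "".join(parts)


# Toy spans with invented confidences; offsets are computed from the text.
answer = "Paris is the capital of France. Its population is 69 million."
claim = "69 million"
spans = [
    {"start": 0, "end": 5, "confidence": 0.20},
    {"start": answer.index(claim),
     "end": answer.index(claim) + len(claim),
     "confidence": 0.98},
]
marked = highlight_spans(answer, spans)
print(marked)
```

Raising `min_confidence` trades recall for precision: fewer spans are flagged, but those that remain are more likely to be genuine hallucinations.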

Frequently Asked Questions

Q: What makes this model unique?

Its long-context support (up to 8192 tokens) combined with token-level classification lets it pinpoint unsupported spans even in lengthy documents. It scores higher than many existing solutions while keeping inference times practical.

Q: What are the recommended use cases?

The model is suited to RAG applications where generated content must be checked against its source documents. It is particularly useful in enterprise settings, content generation verification, and any scenario where detecting hallucinations is critical to information accuracy.
