Bespoke-MiniCheck-7B
| Property | Value |
|---|---|
| Parameter Count | 7.74B |
| Model Type | Text Classification |
| Base Model | internlm2_5-7b-chat |
| License | CC BY-NC 4.0 |
| Paper | MiniCheck Paper |
What is Bespoke-MiniCheck-7B?
Bespoke-MiniCheck-7B is a state-of-the-art fact-checking model developed by Bespoke Labs. It's designed to evaluate whether a given sentence is supported by a source document, returning a binary classification (0 or 1). The model represents a significant advancement in efficient fact-checking, capable of processing over 500 documents per minute on standard hardware.
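The binary interface described above can be sketched as follows. This is a minimal illustration, not the model's actual inference code: `support_label` and its default threshold are hypothetical names used here to show how a support probability maps to the 0/1 verdict.

```python
def support_label(prob_supported: float, threshold: float = 0.5) -> int:
    """Map a support probability to a MiniCheck-style binary verdict:
    1 = the sentence is supported by the source document, 0 = unsupported.
    The 0.5 threshold is an illustrative default, not a documented value."""
    return 1 if prob_supported >= threshold else 0
```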
Implementation Details
The model is fine-tuned on a carefully curated dataset of 35,000 examples, combining 21K ANLI examples with 14K synthetically generated samples. The synthetic portion was produced with Meta's Llama-3.1-405B-Instruct model and underwent rigorous quality control through Bespoke Labs' proprietary curation techniques.
- Built on internlm2_5-7b-chat architecture
- Supports automatic prefix caching for improved performance
- Handles documents up to 32K tokens
- Uses vLLM for accelerated inference
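Long-document handling via chunking can be sketched as below. This is a simplified, assumed implementation: it splits on words rather than real tokens, the chunk size and overlap are illustrative, and `score_chunk` stands in for a call to the actual model. A claim is treated as supported if any chunk supports it (max-aggregation), which is one common convention, not necessarily the model's exact strategy.

```python
def chunk_words(text: str, max_words: int = 600, overlap: int = 50) -> list:
    """Split text into overlapping word-based chunks (a crude token proxy)."""
    words = text.split()
    chunks, start = [], 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
        start += max_words - overlap
    return chunks

def score_long_document(document: str, claim: str, score_chunk,
                        max_words: int = 600) -> float:
    """Score a claim against each chunk and take the maximum support
    probability, so evidence anywhere in the document counts."""
    probs = [score_chunk(chunk, claim)
             for chunk in chunk_words(document, max_words)]
    return max(probs)
```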
Core Capabilities
- Binary classification of claim-document pairs
- High-throughput processing (500+ docs/minute)
- Support for batch processing
- Efficient handling of long documents through chunking
- State-of-the-art accuracy on the LLM-AggreFact benchmark
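Batch processing typically takes parallel lists of documents and claims and returns one verdict per pair. The sketch below shows that shape with a stand-in `score_fn`; the real library's API and return values may differ.

```python
from typing import Callable, List, Tuple

def score_pairs(
    docs: List[str],
    claims: List[str],
    score_fn: Callable[[str, str], float],
    threshold: float = 0.5,
) -> Tuple[List[int], List[float]]:
    """Score (document, claim) pairs in a batch and return binary labels
    plus the raw support probabilities, one entry per pair."""
    if len(docs) != len(claims):
        raise ValueError("docs and claims must be the same length")
    probs = [score_fn(d, c) for d, c in zip(docs, claims)]
    labels = [int(p >= threshold) for p in probs]
    return labels, probs
```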
Frequently Asked Questions
Q: What makes this model unique?
The model achieves SOTA performance despite its relatively small size, thanks to high-quality data curation and efficient architecture. It combines speed with accuracy, making it practical for real-world applications.
Q: What are the recommended use cases?
The model is ideal for fact-checking applications, content verification systems, and automated document validation. It's particularly useful when dealing with large-scale document verification needs where both accuracy and speed are crucial.
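A content-verification workflow like the one described above usually splits the text under review into sentences and checks each one against the source. The sketch below assumes a naive period-based sentence split and a stand-in `score_fn`; both are illustrative simplifications.

```python
from typing import Callable, List, Tuple

def verify_summary(
    document: str,
    summary: str,
    score_fn: Callable[[str, str], float],
    threshold: float = 0.5,
) -> List[Tuple[str, bool]]:
    """Check each summary sentence against the source document and
    return (sentence, supported) pairs. The period-based split is a
    naive placeholder for a real sentence tokenizer."""
    sentences = [s.strip() for s in summary.split(".") if s.strip()]
    return [(s, score_fn(document, s) >= threshold) for s in sentences]
```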