Bespoke-MiniCheck-7B
| Property | Value |
|---|---|
| Parameter Count | 7.74B |
| Model Type | Text Classification |
| Base Model | internlm2_5-7b-chat |
| License | CC BY-NC 4.0 |
| Paper | MiniCheck Paper |
What is Bespoke-MiniCheck-7B?
Bespoke-MiniCheck-7B is a state-of-the-art fact-checking model developed by Bespoke Labs. It's designed to evaluate whether a given sentence is supported by a source document, returning a binary classification (0 or 1). The model represents a significant advancement in efficient fact-checking, capable of processing over 500 documents per minute on standard hardware.
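The binary interface described above can be sketched as follows. This is a minimal illustration, not the model's actual inference code: `support_label` and its default threshold are hypothetical names used here to show how a support probability maps to the 0/1 verdict.

```python
def support_label(prob_supported: float, threshold: float = 0.5) -> int:
    """Map a support probability to a MiniCheck-style binary verdict:
    1 = the sentence is supported by the source document, 0 = unsupported.
    The 0.5 threshold is an illustrative default, not a documented value."""
    return 1 if prob_supported >= threshold else 0
```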
Implementation Details
The model is fine-tuned on a carefully curated dataset of 35,000 examples, combining 21K ANLI examples with 14K synthetically generated samples. The synthetic portion was produced with Meta's Llama-3.1-405B-Instruct model and underwent rigorous quality control through Bespoke Labs' proprietary curation techniques.
- Built on internlm2_5-7b-chat architecture
- Supports automatic prefix caching for improved performance
- Handles documents up to 32K tokens
- Uses vLLM for accelerated inference
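Long-document handling via chunking can be sketched as below. This is a simplified, assumed implementation: it splits on words rather than real tokens, the chunk size and overlap are illustrative, and `score_chunk` stands in for a call to the actual model. A claim is treated as supported if any chunk supports it (max-aggregation), which is one common convention, not necessarily the model's exact strategy.

```python
def chunk_words(text: str, max_words: int = 600, overlap: int = 50) -> list:
    """Split text into overlapping word-based chunks (a crude token proxy)."""
    words = text.split()
    chunks, start = [], 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
        start += max_words - overlap
    return chunks

def score_long_document(document: str, claim: str, score_chunk,
                        max_words: int = 600) -> float:
    """Score a claim against each chunk and take the maximum support
    probability, so evidence anywhere in the document counts."""
    probs = [score_chunk(chunk, claim)
             for chunk in chunk_words(document, max_words)]
    return max(probs)
```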
Core Capabilities
- Binary classification of claim-document pairs
- High-throughput processing (500+ docs/minute)
- Support for batch processing
- Efficient handling of long documents through chunking
- State-of-the-art accuracy on the LLM-AggreFact benchmark
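Batch processing typically takes parallel lists of documents and claims and returns one verdict per pair. The sketch below shows that shape with a stand-in `score_fn`; the real library's API and return values may differ.

```python
from typing import Callable, List, Tuple

def score_pairs(
    docs: List[str],
    claims: List[str],
    score_fn: Callable[[str, str], float],
    threshold: float = 0.5,
) -> Tuple[List[int], List[float]]:
    """Score (document, claim) pairs in a batch and return binary labels
    plus the raw support probabilities, one entry per pair."""
    if len(docs) != len(claims):
        raise ValueError("docs and claims must be the same length")
    probs = [score_fn(d, c) for d, c in zip(docs, claims)]
    labels = [int(p >= threshold) for p in probs]
    return labels, probs
```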
Frequently Asked Questions
Q: What makes this model unique?
The model achieves SOTA performance despite its relatively small size, thanks to high-quality data curation and efficient architecture. It combines speed with accuracy, making it practical for real-world applications.
Q: What are the recommended use cases?
The model is ideal for fact-checking applications, content verification systems, and automated document validation. It's particularly useful when dealing with large-scale document verification needs where both accuracy and speed are crucial.
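A content-verification workflow like the one described above usually splits the text under review into sentences and checks each one against the source. The sketch below assumes a naive period-based sentence split and a stand-in `score_fn`; both are illustrative simplifications.

```python
from typing import Callable, List, Tuple

def verify_summary(
    document: str,
    summary: str,
    score_fn: Callable[[str, str], float],
    threshold: float = 0.5,
) -> List[Tuple[str, bool]]:
    """Check each summary sentence against the source document and
    return (sentence, supported) pairs. The period-based split is a
    naive placeholder for a real sentence tokenizer."""
    sentences = [s.strip() for s in summary.split(".") if s.strip()]
    return [(s, score_fn(document, s) >= threshold) for s in sentences]
```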