Bonito-v1
| Property | Value |
|---|---|
| Base Model | Mistral-7B-v0.1 |
| License | Apache 2.0 |
| Paper | View Paper |
| Training Data | CTGA-v1 Dataset |
What is Bonito-v1?
Bonito-v1 is a model designed to bridge the gap between unannotated text and task-specific training datasets for instruction tuning. Built on the Mistral-7B architecture, it specializes in conditional task generation: converting raw text into structured training data that can be used to adapt large language models to specialized tasks without manual annotation.
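A minimal sketch of that workflow with the Hugging Face transformers library is shown below. The checkpoint id `BatsResearch/bonito-v1`, the `<|tasktype|>`/`<|context|>`/`<|task|>` prompt markers, and the sampling settings are assumptions made for illustration; consult the official repository for the exact input format.

```python
# Sketch: turn a raw passage into a synthetic training example with Bonito-v1.
# Assumptions: the checkpoint id and the prompt markers below are illustrative,
# not a confirmed specification of the model's input format.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BatsResearch/bonito-v1"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

context = "The mitochondrion is the organelle that produces most of the cell's ATP."
task_type = "extractive question answering"  # one of the supported task types

# Assumed conditional-generation prompt: task type + unannotated context.
prompt = f"<|tasktype|>\n{task_type}\n<|context|>\n{context}\n<|task|>\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, top_p=0.95)

# The completion (prompt tokens stripped) holds the generated instruction/response pair.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```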
Implementation Details
The model is fine-tuned with Q-LoRA for 100,000 steps on four GPUs. Key hyperparameters include a Q-LoRA rank of 64, a scaling factor of 4, the Paged AdamW optimizer, and a linear learning-rate scheduler (see the configuration sketch after the list below).
- Maximum input and output length: 2,048 tokens
- Effective batch size: 16
- Maximum learning rate: 1e-04
- Maximum gradient norm: 0.3
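A hedged configuration sketch of this recipe using peft, bitsandbytes, and transformers is given below; it is not the authors' training script. The target modules, the reading of the scaling factor as lora_alpha / r, and the per-device/accumulation split behind the effective batch size of 16 are assumptions.

```python
# Sketch of a Q-LoRA setup mirroring the reported hyperparameters.
# Assumptions: target modules, the lora_alpha interpretation of the
# "scaling factor of 4", and the batch-size split are illustrative.
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # Q-LoRA: 4-bit quantized base weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

lora_config = LoraConfig(
    r=64,                                    # Q-LoRA rank from the model card
    lora_alpha=256,                          # assumes "scaling factor of 4" means lora_alpha / r = 4
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
)

training_args = TrainingArguments(
    output_dir="bonito-qlora",
    max_steps=100_000,                       # 100,000 training steps
    learning_rate=1e-4,                      # maximum learning rate
    lr_scheduler_type="linear",
    optim="paged_adamw_32bit",               # Paged AdamW optimizer
    max_grad_norm=0.3,
    per_device_train_batch_size=1,           # assumed split: 1 x 4 GPUs x 4 accumulation = 16
    gradient_accumulation_steps=4,
    bf16=True,
)

# These configs would typically be passed to AutoModelForCausalLM.from_pretrained(
# ..., quantization_config=bnb_config) and to a Trainer/SFT wrapper with lora_config.
```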
Core Capabilities
- Generates synthetic instruction tuning datasets (see the dataset-assembly sketch after this list)
- Supports 16 different task types, including summarization, sentiment analysis, and natural language inference (NLI)
- Enables zero-shot task adaptation
- Optimized for both pretrained and instruction-tuned models
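As a rough illustration of the first capability, completions from the generation sketch above can be parsed into instruction/response pairs and collected into a fine-tuning dataset. The `<|pipe|>` separator and the placeholder `generated_texts` list are assumptions about the output format, not a confirmed specification.

```python
# Sketch: collect Bonito completions into an instruction-tuning dataset.
# Assumption: each completion separates instruction and response with a
# "<|pipe|>" marker; adjust the split to match the model's actual output.
from datasets import Dataset

generated_texts = [
    "What organelle produces most of the cell's ATP? <|pipe|> The mitochondrion.",
]  # placeholder completions from the generation step above

records = []
for text in generated_texts:
    if "<|pipe|>" not in text:
        continue  # skip malformed generations
    instruction, response = text.split("<|pipe|>", maxsplit=1)
    records.append({"input": instruction.strip(), "output": response.strip()})

synthetic_dataset = Dataset.from_list(records)
print(synthetic_dataset)
```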
Frequently Asked Questions
Q: What makes this model unique?
A: Bonito-v1's ability to generate task-specific training datasets from unannotated text makes it particularly valuable for organizations with specialized or private data that need to adapt language models without manual annotation effort.
Q: What are the recommended use cases?
A: The model is ideal for creating synthetic instruction tuning datasets in scenarios where manual annotation is impractical or costly. It is particularly effective for tasks such as summarization, sentiment analysis, multiple-choice question answering, and natural language inference, among others.