Bonito-v1
| Property | Value |
|---|---|
| Base Model | Mistral-7B-v0.1 |
| License | Apache 2.0 |
| Paper | View Paper |
| Training Data | CTGA-v1 Dataset |
What is Bonito-v1?
Bonito-v1 is a model designed to bridge the gap between unannotated text and task-specific training datasets for instruction tuning. Built on the Mistral-7B architecture, it specializes in conditional task generation: converting raw text into structured training data that can be used to adapt large language models to specialized tasks without manual annotation.
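A minimal sketch of that workflow with the Hugging Face transformers library is shown below. The checkpoint id `BatsResearch/bonito-v1`, the `<|tasktype|>`/`<|context|>`/`<|task|>` prompt markers, and the sampling settings are assumptions made for illustration; consult the official repository for the exact input format.

```python
# Sketch: turn a raw passage into a synthetic training example with Bonito-v1.
# Assumptions: the checkpoint id and the prompt markers below are illustrative,
# not a confirmed specification of the model's input format.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BatsResearch/bonito-v1"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

context = "The mitochondrion is the organelle that produces most of the cell's ATP."
task_type = "extractive question answering"  # one of the supported task types

# Assumed conditional-generation prompt: task type + unannotated context.
prompt = f"<|tasktype|>\n{task_type}\n<|context|>\n{context}\n<|task|>\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, top_p=0.95)

# The completion (prompt tokens stripped) holds the generated instruction/response pair.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```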
Implementation Details
The model is fine-tuned with Q-LoRA for 100,000 steps on four GPUs. Key hyperparameters include a Q-LoRA rank of 64, a scaling factor of 4, the Paged AdamW optimizer, and a linear learning-rate scheduler (see the configuration sketch after the list below).
- Maximum input and output length: 2,048 tokens
- Effective batch size: 16
- Maximum learning rate: 1e-04
- Maximum gradient norm: 0.3
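A hedged configuration sketch of this recipe using peft, bitsandbytes, and transformers is given below; it is not the authors' training script. The target modules, the reading of the scaling factor as lora_alpha / r, and the per-device/accumulation split behind the effective batch size of 16 are assumptions.

```python
# Sketch of a Q-LoRA setup mirroring the reported hyperparameters.
# Assumptions: target modules, the lora_alpha interpretation of the
# "scaling factor of 4", and the batch-size split are illustrative.
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # Q-LoRA: 4-bit quantized base weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

lora_config = LoraConfig(
    r=64,                                    # Q-LoRA rank from the model card
    lora_alpha=256,                          # assumes "scaling factor of 4" means lora_alpha / r = 4
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
)

training_args = TrainingArguments(
    output_dir="bonito-qlora",
    max_steps=100_000,                       # 100,000 training steps
    learning_rate=1e-4,                      # maximum learning rate
    lr_scheduler_type="linear",
    optim="paged_adamw_32bit",               # Paged AdamW optimizer
    max_grad_norm=0.3,
    per_device_train_batch_size=1,           # assumed split: 1 x 4 GPUs x 4 accumulation = 16
    gradient_accumulation_steps=4,
    bf16=True,
)

# These configs would typically be passed to AutoModelForCausalLM.from_pretrained(
# ..., quantization_config=bnb_config) and to a Trainer/SFT wrapper with lora_config.
```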
Core Capabilities
- Generates synthetic instruction tuning datasets (see the dataset-assembly sketch after this list)
- Supports 16 different task types, including summarization, sentiment analysis, and natural language inference (NLI)
- Enables zero-shot task adaptation
- Optimized for both pretrained and instruction-tuned models
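As a rough illustration of the first capability, completions from the generation sketch above can be parsed into instruction/response pairs and collected into a fine-tuning dataset. The `<|pipe|>` separator and the placeholder `generated_texts` list are assumptions about the output format, not a confirmed specification.

```python
# Sketch: collect Bonito completions into an instruction-tuning dataset.
# Assumption: each completion separates instruction and response with a
# "<|pipe|>" marker; adjust the split to match the model's actual output.
from datasets import Dataset

generated_texts = [
    "What organelle produces most of the cell's ATP? <|pipe|> The mitochondrion.",
]  # placeholder completions from the generation step above

records = []
for text in generated_texts:
    if "<|pipe|>" not in text:
        continue  # skip malformed generations
    instruction, response = text.split("<|pipe|>", maxsplit=1)
    records.append({"input": instruction.strip(), "output": response.strip()})

synthetic_dataset = Dataset.from_list(records)
print(synthetic_dataset)
```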
Frequently Asked Questions
Q: What makes this model unique?
A: Bonito-v1's ability to generate task-specific training datasets from unannotated text makes it particularly valuable for organizations with specialized or private data that need to adapt language models without manual annotation effort.
Q: What are the recommended use cases?
A: The model is ideal for creating synthetic instruction tuning datasets in scenarios where manual annotation is impractical or costly. It is particularly effective for tasks such as summarization, sentiment analysis, multiple-choice question answering, and natural language inference, among others.