bagel-34b-v0.2

bagel-34b-v0.2

jondurbin

34B parameter LLM fine-tuned on 29 diverse datasets, featuring multi-format prompting and specialized for creative writing & roleplay. Built on Yi-34b-200k base.

PropertyValue
Parameter Count34.4B
Base ModelYi-34b-200k
LicenseApache 2.0
Tensor TypeBF16

What is bagel-34b-v0.2?

Bagel-34b-v0.2 is an experimental fine-tuned version of the Yi-34b-200k model, specifically designed to excel in creative writing and roleplay applications. This model represents the SFT (Supervised Fine-Tuning) phase before DPO implementation, trained on an extensive collection of 29 diverse datasets ranging from conversation and coding to mathematics and emotional understanding.

Implementation Details

The model employs a unique multi-format prompting system, supporting four different prompt formats: Vicuna, Llama-2, Alpaca, and ChatML. Each instruction is converted into all four formats during training, effectively quadrupling the exposure to each training example. The training process utilizes a conservative approach with a single epoch and low learning rate to prevent overfitting.

  • Supports multiple prompt formats for enhanced flexibility
  • Trained on 29 carefully curated datasets
  • Implements decontamination using approximate nearest neighbor search
  • Optimized for creative and conversational tasks

Core Capabilities

  • Advanced creative writing and roleplay generation
  • Multi-lingual comprehension and response
  • Code generation in multiple programming languages
  • Mathematical problem solving
  • Emotional understanding and context awareness

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its multi-format prompt training approach, combined with an extensive and diverse training dataset collection. Unlike many other models, it maintains creative capabilities while incorporating technical knowledge from various domains.

Q: What are the recommended use cases?

The model excels in creative writing, roleplay scenarios, and general conversation. It's particularly well-suited for applications requiring a balance of creativity and technical knowledge, such as story writing, character interaction, and educational content generation.

Socials
PromptLayer
Company
All services online
Location IconPromptLayer is located in the heart of New York City
PromptLayer © 2026