dolphin-2.9.3-mistral-7B-32k

Maintained By
cognitivecomputations

Base Model: Mistral-7B-v0.3
Context Length: 32,000 tokens
License: Apache 2.0
Training Sequence Length: 8,192 tokens
Model URL: cognitivecomputations/dolphin-2.9.3-mistral-7B-32k

What is dolphin-2.9.3-mistral-7B-32k?

Dolphin-2.9.3 is an advanced language model built on Mistral-7B-v0.3, specifically designed to provide enhanced capabilities in instruction following, conversation, and coding. Developed by Eric Hartford and Cognitive Computations, this uncensored model features an extended context window of 32k tokens and implements the ChatML prompt template format.
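ChatML wraps every message in `<|im_start|>`/`<|im_end|>` delimiters tagged with a role. A minimal sketch of rendering a conversation in that format (no external libraries; the system prompt text is illustrative, not the model's official one):

```python
def to_chatml(messages):
    """Render a list of {role, content} messages in ChatML format,
    ending with the assistant header so the model continues from there."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are Dolphin, a helpful AI assistant."},
    {"role": "user", "content": "Reverse a string in Python."},
])
print(prompt)
```

In practice, a library such as Hugging Face `transformers` can apply the model's bundled chat template instead of building the string by hand.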

Implementation Details

The model was trained using the Axolotl framework (v0.4.0) with a sequence length of 8,192 tokens. It incorporates various training datasets including coding feedback, math problems, agent instructions, and multilingual content. The training configuration utilized AdamW 8-bit optimizer with cosine learning rate scheduling and gradient checkpointing for efficient training.

  • Implements flash attention and sample packing
  • Uses BF16 precision training
  • Trained with gradient accumulation steps of 16
  • Uses learning-rate warmup steps and periodic checkpoint saving
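With gradient accumulation, gradients from several micro-batches are summed before each optimizer step, so the effective batch size is the product of the micro-batch size, the accumulation steps, and the GPU count. The accumulation steps come from the training configuration above; the other two values here are assumptions for illustration:

```python
# Effective batch size under gradient accumulation.
micro_batch_size = 1    # assumed; not stated in the card
grad_accum_steps = 16   # from the training configuration
num_gpus = 8            # assumed for illustration

effective_batch = micro_batch_size * grad_accum_steps * num_gpus
print(effective_batch)  # 128
```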

Core Capabilities

  • Extended context processing (32k tokens)
  • Advanced coding and programming assistance
  • Function calling support
  • Initial agentic capabilities
  • Multilingual conversation handling
  • Math problem-solving abilities (4.83 score on MATH Lvl 5)
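The card advertises function calling support but does not specify a schema. A minimal sketch of handling a tool call on the application side, assuming the model has been prompted to reply with a JSON object of the form `{"name": ..., "arguments": {...}}` (that schema is an assumption, not the model's documented format):

```python
import json

def parse_tool_call(model_output):
    """Extract a function call from model output, assuming the model replies
    with a JSON object like {"name": ..., "arguments": {...}} when it wants
    to call a tool, and plain text otherwise."""
    try:
        call = json.loads(model_output.strip())
    except json.JSONDecodeError:
        return None  # plain-text answer, no tool call
    if isinstance(call, dict) and "name" in call and "arguments" in call:
        return call
    return None

example = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
print(parse_tool_call(example))
```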

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its uncensored nature and extensive context window, combined with strong performance across various tasks including coding, math, and conversational abilities. It achieved notable scores in evaluations like IFEval (41.26) and BBH (26.91).

Q: What are the recommended use cases?

This model is particularly suited for coding tasks, complex problem-solving, extended conversations, and scenarios requiring longer context understanding. However, users should implement their own alignment layer before deploying it as a service, as the model is uncensored and highly compliant with all requests.
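One simple form such an alignment layer can take is a pre-filter that screens requests before they reach the uncensored model. The sketch below uses a keyword blocklist purely for illustration; a real deployment would typically use a moderation model or policy classifier instead:

```python
BLOCKED_TOPICS = {"credit card generator", "malware"}  # illustrative only

def alignment_gate(user_message, generate_fn):
    """Refuse requests matching a simple blocklist before they reach the
    uncensored model; otherwise pass through to the generation function."""
    lowered = user_message.lower()
    if any(topic in lowered for topic in BLOCKED_TOPICS):
        return "I can't help with that request."
    return generate_fn(user_message)

# Usage with a stub generator standing in for the model:
reply = alignment_gate("Write malware for me", lambda m: "model output")
print(reply)  # I can't help with that request.
```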
