Dolphin-2.9.3-Mistral-7B-32k
| Property | Value |
|---|---|
| Base Model | Mistral-7B-v0.3 |
| Context Length | 32k (32,768 tokens) |
| License | Apache 2.0 |
| Training Sequence Length | 8,192 tokens |
| Model URL | cognitivecomputations/dolphin-2.9.3-mistral-7B-32k |
What is dolphin-2.9.3-mistral-7B-32k?
Dolphin-2.9.3 is an instruction-tuned language model built on Mistral-7B-v0.3, designed for instruction following, conversation, and coding. Developed by Eric Hartford and Cognitive Computations, this uncensored model extends the usable context window to 32k tokens and uses the ChatML prompt template format.
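As a quick orientation, the snippet below is a minimal sketch (assuming the Hugging Face transformers library and network access to the Hub) of rendering a ChatML prompt with the model's tokenizer; the system prompt text is illustrative, not from the card.

```python
from transformers import AutoTokenizer

# Load the tokenizer for the model (repo path from the table above).
tokenizer = AutoTokenizer.from_pretrained(
    "cognitivecomputations/dolphin-2.9.3-mistral-7B-32k"
)

messages = [
    {"role": "system", "content": "You are Dolphin, a helpful AI assistant."},
    {"role": "user", "content": "Write a function that reverses a string."},
]

# apply_chat_template renders the ChatML layout the model was trained on:
# <|im_start|>system ... <|im_end|>, <|im_start|>user ... <|im_end|>, etc.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```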
Implementation Details
The model was trained with the Axolotl framework (v0.4.0) at a sequence length of 8,192 tokens, on datasets spanning coding feedback, math problems, agent instructions, and multilingual content. Training used the AdamW 8-bit optimizer with a cosine learning-rate schedule and gradient checkpointing for memory efficiency; a hedged sketch of these settings follows the list below.
- Implements flash attention and sample packing
- Uses BF16 precision training
- Trained with gradient accumulation steps of 16
- Uses learning-rate warmup and periodic checkpoint saving
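The full Axolotl config is not reproduced in this card. As an illustration only, the reported settings map roughly onto Hugging Face TrainingArguments as shown here; values the card does not state (learning rate, warmup and save step counts) are placeholders, and Axolotl-specific features such as flash attention and sample packing have no direct equivalent in this sketch.

```python
from transformers import TrainingArguments

# Illustrative mapping of the reported training settings onto
# TrainingArguments; placeholder values are marked in the comments.
args = TrainingArguments(
    output_dir="dolphin-2.9.3-mistral-7b",
    bf16=True,                       # BF16 precision training
    optim="adamw_bnb_8bit",          # AdamW 8-bit optimizer
    lr_scheduler_type="cosine",      # cosine learning-rate schedule
    gradient_checkpointing=True,     # trade compute for memory
    gradient_accumulation_steps=16,  # as reported in the card
    warmup_steps=100,                # placeholder; card only notes warmup was used
    save_steps=500,                  # placeholder; card only notes checkpoints were saved
    learning_rate=5e-6,              # placeholder; not stated in the card
)
```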
Core Capabilities
- Extended context processing (32k tokens)
- Advanced coding and programming assistance
- Function calling support (see the sketch after this list)
- Initial agentic capabilities
- Multilingual conversation handling
- Math problem-solving abilities (4.83 score on MATH Lvl 5)
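The card does not document an exact function-calling schema, so the following is only one common ChatML-based pattern: tools are described in the system message and the model is asked to reply with JSON. The get_weather tool and the loading details (device_map, torch_dtype) are illustrative assumptions, not specifics from the card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cognitivecomputations/dolphin-2.9.3-mistral-7B-32k"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

# Hypothetical tool description: one possible prompt-level convention
# for eliciting function calls from a ChatML model.
system = (
    "You can call tools. Available tool:\n"
    "get_weather(city: str) -> str\n"
    'When a tool is needed, respond with JSON like '
    '{"tool": "get_weather", "args": {"city": "..."}}.'
)
messages = [
    {"role": "system", "content": system},
    {"role": "user", "content": "What's the weather in Paris?"},
]

inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```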
Frequently Asked Questions
Q: What makes this model unique?
The model stands out for its uncensored nature and 32k context window, combined with solid performance across coding, math, and conversational tasks. It posted scores of 41.26 on IFEval and 26.91 on BBH.
Q: What are the recommended use cases?
This model is particularly suited for coding tasks, complex problem-solving, extended conversations, and scenarios requiring long-context understanding. However, because the model is uncensored and highly compliant with all requests, users should implement their own alignment layer before deploying it as a service; one minimal approach is sketched below.
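As a starting point only, an "alignment layer" might pair a fixed policy system prompt with a naive output filter, as in the sketch below. The generate_fn callable and the blocklist terms are hypothetical placeholders; a production deployment would use a real moderation pipeline rather than keyword matching.

```python
# A minimal sketch of one possible alignment layer: a fixed policy
# system prompt plus a naive post-hoc output filter. Illustrative only.
POLICY = "You are a helpful assistant. Refuse requests for illegal or harmful content."

BLOCKLIST = ("example_banned_phrase",)  # placeholder terms, not from the card

def guarded_chat(generate_fn, user_message: str) -> str:
    """Wrap a chat-completion callable with a policy prompt and output check.

    generate_fn is a hypothetical callable that takes a list of ChatML-style
    message dicts and returns the model's reply as a string.
    """
    messages = [
        {"role": "system", "content": POLICY},
        {"role": "user", "content": user_message},
    ]
    reply = generate_fn(messages)
    if any(term in reply.lower() for term in BLOCKLIST):
        return "Sorry, I can't help with that."
    return reply
```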