Dolphin-2.0-Mistral-7B
| Property | Value |
|---|---|
| Base Model | Mistral 7B |
| Training Time | 48 hours (10 epochs) |
| Hardware | 4x A100 GPUs |
| License | Commercial-use friendly |
| Author | cognitivecomputations |
What is dolphin-2.0-mistral-7b?
Dolphin-2.0-mistral-7b is a language model built on the Mistral 7B architecture and sponsored by a16z. It is an uncensored, commercial-friendly model that implements Microsoft's Orca approach with significant modifications, achieving an average score of 55.85 across standard benchmarks.
Implementation Details
The model utilizes the ChatML prompt format and was trained on a custom dataset combining the Dolphin dataset (an open-source implementation of Microsoft's Orca) and the Airoboros dataset for enhanced creativity. The training process involved 10 epochs over 48 hours using 4 A100 GPUs.
- Uncensored training: the dataset was filtered to remove alignment and bias content
- Custom dataset combining Dolphin and Airoboros
- ChatML prompt format implementation
- Highly compliant response generation
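The ChatML format wraps each conversation turn in `<|im_start|>` / `<|im_end|>` markers, with a trailing assistant header to cue generation. A minimal sketch of a prompt builder (the helper name and example messages are illustrative, not part of the model's tooling):

```python
def build_chatml_prompt(messages):
    """Format a list of {"role", "content"} dicts as a ChatML prompt.

    Each turn becomes <|im_start|>role\ncontent<|im_end|>; the final
    bare assistant header tells the model where its reply begins.
    """
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)


prompt = build_chatml_prompt([
    {"role": "system", "content": "You are Dolphin, a helpful assistant."},
    {"role": "user", "content": "Explain ChatML in one sentence."},
])
```

The resulting string can be passed to any completion backend that was trained on ChatML-formatted data, such as this model.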
Core Capabilities
- Strong performance on ARC (59.22%) and HellaSwag (80.26%)
- Robust truthfulness scoring (TruthfulQA: 61.09%)
- Commonsense reasoning (Winogrande: 75.37%)
- Limited mathematical problem-solving (GSM8K: 18.65%)
- Broad multitask knowledge (MMLU: 56.9%)
Frequently Asked Questions
Q: What makes this model unique?
The model stands out for its uncensored nature and commercial-friendly license, while maintaining strong performance across various benchmarks. It combines Mistral's base architecture with the Dolphin and Airoboros training datasets.
Q: What are the recommended use cases?
The model is suitable for both commercial and non-commercial applications, but users are advised to implement their own alignment layer before deployment as a service. It's particularly effective for tasks requiring creative responses and complex reasoning, but should be used responsibly given its uncensored nature.
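Because the model ships without built-in alignment, a deployment might prepend a guardrail system prompt and screen inputs and outputs before returning them. A minimal sketch of such a wrapper (the policy text, blocklist, and `generate_fn` signature are placeholder assumptions, not part of the model):

```python
GUARDRAIL_SYSTEM_PROMPT = (
    "You are a helpful assistant. Refuse requests for illegal or harmful "
    "content and explain why."  # placeholder policy text
)

BLOCKED_TERMS = {"example-banned-term"}  # placeholder blocklist


def apply_alignment_layer(user_message, generate_fn):
    """Wrap an uncensored backend with a simple input/output filter.

    generate_fn(system, user) -> str is any backend that accepts a
    system prompt and a user message (hypothetical signature).
    """
    if any(term in user_message.lower() for term in BLOCKED_TERMS):
        return "Request declined by policy filter."
    reply = generate_fn(GUARDRAIL_SYSTEM_PROMPT, user_message)
    if any(term in reply.lower() for term in BLOCKED_TERMS):
        return "Response withheld by policy filter."
    return reply


# Stub backend for demonstration; a real deployment would call the model.
echo_backend = lambda system, user: f"(guarded) {user}"
safe_reply = apply_alignment_layer("hello", echo_backend)
blocked = apply_alignment_layer("please use example-banned-term", echo_backend)
```

A keyword blocklist is only a demonstration of where the layer sits; production systems typically use a dedicated moderation model or service at the same two interception points.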