dolphin-2.5-mixtral-8x7b

Maintained By
cognitivecomputations

Dolphin 2.5 Mixtral 8x7b

PropertyValue
Parameter Count46.7B
Model TypeText Generation
ArchitectureMixtral-based
LicenseApache 2.0
Context Window16k tokens
Tensor TypeBF16

What is dolphin-2.5-mixtral-8x7b?

Dolphin 2.5 Mixtral 8x7b is an advanced language model built on the Mixtral architecture, specifically designed for enhanced coding capabilities and general text generation. Trained on 8 carefully curated datasets, this model represents a significant evolution in the Dolphin series, incorporating new datasets like Synthia, OpenHermes, and PureDove while removing previous datasets like Samantha and WizardLM.

Implementation Details

The model was trained over 1.5 epochs using qLoRA and Axolotl on 4 A100 GPUs, taking approximately 3 days to complete. It utilizes the ChatML prompt format and requires trust_remote_code for operation. The training infrastructure was sponsored by Convai, demonstrating a collaborative effort in the AI community.

  • Built on Mixtral-8x7b architecture with 46.7B parameters
  • 16k context window implementation
  • Trained using qLoRA and Axolotl framework
  • Implements ChatML prompt format
  • Incorporates 8 specialized datasets

Core Capabilities

  • Advanced coding assistance and generation
  • Uncensored and unbiased responses
  • High compliance with user requests
  • Structured output generation
  • Enhanced conversational abilities
  • Comprehensive programming language support

Frequently Asked Questions

Q: What makes this model unique?

The model's primary distinguishing feature is its enhanced coding capabilities, combined with its uncensored nature and high compliance. It's built on the powerful Mixtral architecture and trained with a diverse set of high-quality datasets, making it particularly effective for both coding and general-purpose tasks.

Q: What are the recommended use cases?

The model excels in coding tasks, technical writing, and general conversational interactions. It's particularly suitable for developers, technical writers, and users requiring detailed technical assistance. However, due to its uncensored nature, implementing an alignment layer is recommended before deployment in production environments.

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.