dolphin-2.9.4-llama3.1-8b

Maintained By
cognitivecomputations

Dolphin 2.9.4 LLaMA 3.1 8B

PropertyValue
Parameter Count8.03B
Base ModelMeta-LLaMA-3.1-8B
LicenseLLaMA 3.1
Context Length128K
Training Sequence Length8192

What is dolphin-2.9.4-llama3.1-8b?

Dolphin 2.9.4 is a sophisticated fine-tuned version of Meta's LLaMA 3.1 8B model, developed by Eric Hartford and Cognitive Computations. This model represents a significant advancement in instruction-following and conversational AI, trained on a diverse set of 9 carefully curated datasets including specialized mathematical, coding, and agent-based training data.

Implementation Details

The model utilizes the ChatML prompt template format and has been fine-tuned with specific optimizations for instruction following and coding tasks. It features a substantial 128K context window, though the fine-tuning process used an 8192 sequence length. The training implementation included advanced techniques such as gradient checkpointing and flash attention for optimal performance.

  • Trained using Axolotl version 0.4.1 with BF16 precision
  • Implements sophisticated attention mechanisms including flash attention
  • Uses cosine learning rate scheduler with warmup steps
  • Employs gradient accumulation for stable training

Core Capabilities

  • Strong instruction-following abilities in multiple languages
  • Advanced coding and mathematical problem-solving
  • Agentic capabilities and function calling support
  • Uncensored responses with filtered dataset alignment
  • Comprehensive evaluation performance across various benchmarks

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its combination of uncensored capabilities, strong instruction-following abilities, and specialized training across multiple domains. It's particularly notable for its extensive context window and optimized performance on both conversational and technical tasks.

Q: What are the recommended use cases?

The model excels in coding tasks, mathematical problem-solving, instruction following, and general conversational interactions. However, users should implement their own alignment layer before deploying it as a service, particularly due to its uncensored nature.

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.