MN-Slush

Maintained By
crestf411

Property              Value
Base Model            Mistral-Nemo-Instruct-2407
Training Method       Two-stage LoRA with TIES merge
Context Size          16384 tokens
Recommended Settings  Temperature: 1.0, Min-P: 0.1, DRY: 0.8
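
To illustrate what the Temperature and Min-P settings do at sampling time, here is a minimal sketch in plain Python. The function name `min_p_filter` and the example logits are hypothetical, and the order of operations (temperature first, then Min-P) is an assumption, since inference backends differ on this. DRY is not shown.

```python
import math


def min_p_filter(logits, temperature=1.0, min_p=0.1):
    """Return renormalized token probabilities after temperature and Min-P.

    Hypothetical helper for illustration; not taken from any specific backend.
    """
    # Temperature scaling (1.0 leaves the logits unchanged).
    scaled = [x / temperature for x in logits]
    # Softmax, subtracting the maximum for numerical stability.
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Min-P: drop tokens whose probability is below min_p times the top probability.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    kept_total = sum(kept)
    return [p / kept_total for p in kept]


# Made-up logits for a four-token vocabulary.
filtered = min_p_filter([2.0, 1.0, 0.0, -3.0], temperature=1.0, min_p=0.1)
```

With these toy logits the last token falls below one tenth of the top token's probability and is removed, while the other three are kept and renormalized.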

What is MN-Slush?

MN-Slush is a language model tuned for creative writing and roleplaying. It is built on Mistral-Nemo-Instruct-2407 and trained in two stages, using high LoRA dropout to improve generalization and creativity.

Implementation Details

Training runs in two stages. Stage 1 is continued pretraining with high LoRA dropout (0.5), using LoRA+ with a rank of 64 and an alpha of 128. Stage 2 is fine-tuning with smaller adapters (rank 32, alpha 64), aimed at improving roleplaying while keeping the model stable.
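
The stage parameters can be written down as a small config sketch. The class and field names below are hypothetical, not taken from the author's training scripts. Stage 2's dropout is left unset because the text does not state it, and the `scaling` property assumes the conventional LoRA scaling factor of alpha divided by rank.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LoraStage:
    """Hypothetical container for one training stage's LoRA settings."""
    name: str
    rank: int
    alpha: int
    dropout: Optional[float]  # None where the text gives no value

    @property
    def scaling(self) -> float:
        # Conventional LoRA scaling: the adapter update is multiplied by alpha / rank.
        return self.alpha / self.rank


STAGES = [
    LoraStage("stage1_continued_pretraining", rank=64, alpha=128, dropout=0.5),
    LoraStage("stage2_roleplay_finetune", rank=32, alpha=64, dropout=None),
]

for stage in STAGES:
    print(stage.name, "rank", stage.rank, "alpha", stage.alpha, "scaling", stage.scaling)
```

Under that assumption both stages share the same alpha-to-rank ratio, so Stage 2 halves the adapter capacity without changing the scaling factor.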

  • Uses the TIES merge method to combine model weights
  • Uses bfloat16 precision for efficient computation
  • Trained on 6 specialized datasets
  • Supports a 16384-token context window
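
As a rough illustration of the TIES merge method named above, the sketch below applies its three steps (trim, elect sign, disjoint mean) to flat lists of weights. It is a toy version under stated assumptions: the function names, the `density` and `scale` values, and the example weights are all hypothetical, and the text does not say which checkpoints were merged or with what settings.

```python
def trim(task_vector, density):
    """Keep the largest-magnitude fraction `density` of entries; zero the rest."""
    keep = max(1, round(density * len(task_vector)))
    ranked = sorted(range(len(task_vector)),
                    key=lambda i: abs(task_vector[i]), reverse=True)
    kept = set(ranked[:keep])
    return [v if i in kept else 0.0 for i, v in enumerate(task_vector)]


def ties_merge(base, finetuned_models, density=0.5, scale=1.0):
    """Merge fine-tuned weight lists into `base` using the TIES steps."""
    # Step 1: task vectors (fine-tuned minus base), trimmed to the top entries.
    task_vectors = [
        trim([f - b for f, b in zip(model, base)], density)
        for model in finetuned_models
    ]
    merged = []
    for i, b in enumerate(base):
        column = [tv[i] for tv in task_vectors]
        # Step 2: elect a sign per parameter from the summed trimmed values.
        total = sum(column)
        sign = 1.0 if total > 0 else -1.0 if total < 0 else 0.0
        # Step 3: average only the nonzero values that agree with the elected sign.
        agreeing = [v for v in column if v * sign > 0]
        delta = sum(agreeing) / len(agreeing) if agreeing else 0.0
        merged.append(b + scale * delta)
    return merged


# Made-up weights: one base model and two fine-tuned variants.
base = [0.0, 0.0, 0.0, 0.0]
model_a = [1.0, -0.5, 0.1, 0.0]
model_b = [0.8, 0.6, -0.05, 0.3]
merged = ties_merge(base, [model_a, model_b], density=0.5)
```

In this toy case the two models disagree in sign on the second parameter, so only the value matching the elected sign (0.6) contributes, and the small entries removed by trimming leave the last two parameters at their base values.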

Core Capabilities

  • Enhanced creative writing and storytelling
  • Advanced roleplaying interactions
  • Improved text generation consistency
  • Understanding and tracking of long contexts

Frequently Asked Questions

Q: What makes this model unique?

The two-stage training with high LoRA dropout, combined with the TIES merge method, aims to balance creativity against coherence. Training also uses LoRA+, which trains the two adapter matrices at different learning rates, with specific learning rate ratios.

Q: What are the recommended use cases?

MN-Slush is well suited to creative writing, roleplaying scenarios, and interactive storytelling. It is intended to be used with the SillyTavern preset designed for Mistral V2 & V3.
