patricide-12B-Unslop-Mell-v2
| Property | Value |
|---|---|
| Parameter Count | 12B |
| Model Type | Merged Language Model |
| Merge Method | NuSLERP |
| Context Window | 20k tokens (recommended) |
| Chat Template | ChatML |
| Model URL | https://huggingface.co/redrix/patricide-12B-Unslop-Mell-v2 |
What is patricide-12B-Unslop-Mell-v2?
patricide-12B-Unslop-Mell-v2 is a merged language model that combines TheDrummer's UnslopNemo-12B-v4 and inflatebot's MN-12B-Mag-Mell-R1 using the NuSLERP merge method. The merge is designed to balance UnslopNemo's anti-GPT ("unslop") characteristics against Mag-Mell's general intelligence by giving each parent different weights across the layer stack.
Implementation Details
The merge uses per-layer weight gradients: UnslopNemo-12B-v4 is weighted [0.6, 0.5, 0.3, 0.5, 0.6] and MN-12B-Mag-Mell-R1 [0.4, 0.5, 0.7, 0.5, 0.4] across the depth of the network, so Mag-Mell contributes most in the middle layers while UnslopNemo dominates at both ends. The merge is computed in bfloat16 with weight normalization and int8 masking enabled.
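Read as a mergekit configuration, those details map onto something like the sketch below. This is a hedged reconstruction rather than the author's published config: the field names follow mergekit's YAML schema, and the exact layer ranges covered by each element of the five-element weight gradients are an assumption.

```python
# Hypothetical reconstruction of the merge config described above.
# Writes a YAML file for mergekit's CLI: mergekit-yaml patricide_v2.yml ./out
config_yaml = """\
merge_method: nuslerp
models:
  - model: TheDrummer/UnslopNemo-12B-v4
    parameters:
      weight: [0.6, 0.5, 0.3, 0.5, 0.6]  # heavier at both ends of the stack
  - model: inflatebot/MN-12B-Mag-Mell-R1
    parameters:
      weight: [0.4, 0.5, 0.7, 0.5, 0.4]  # heavier in the middle layers
dtype: bfloat16
parameters:
  normalize: true   # normalize merged weights
  int8_mask: true   # use int8 masks during the merge
tokenizer_source: union  # combined vocabulary from both parents
"""

with open("patricide_v2.yml", "w", encoding="utf-8") as f:
    f.write(config_yaml)
```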
- Optimized for ChatML template format
- Context window recommendation of 20k tokens
- Supports Temperature-Last (1.0) and Min-P (0.1) sampling
- Available in static GGUF quantization formats (see the loading sketch after this list)
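The settings above can be exercised end to end with llama-cpp-python and one of the static GGUF quants. This is a minimal sketch, not canonical usage: the quant filename is hypothetical, and the high-level API does not expose sampler ordering, although with temperature fixed at 1.0 the "temperature last" ordering makes no practical difference.

```python
# Minimal llama-cpp-python sketch using the recommended settings.
from llama_cpp import Llama

llm = Llama(
    model_path="patricide-12B-Unslop-Mell-v2.Q4_K_M.gguf",  # hypothetical quant filename
    n_ctx=20480,           # ~20k-token recommended context window
    chat_format="chatml",  # recommended chat template
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a creative writing partner."},
        {"role": "user", "content": "Introduce yourself in one sentence."},
    ],
    temperature=1.0,  # identity at 1.0, so sampler order is moot
    min_p=0.1,
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```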
Core Capabilities
- Enhanced anti-GPT characteristics while maintaining model intelligence
- Stable performance in early testing scenarios
- Flexible sampling parameter support
- Union tokenizer (vocabulary combined from both parent models)
- Compatible with multiple chat templates, with ChatML recommended (see the sketch below)
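Both tokenizer-related points can be checked directly with transformers. A small sketch, assuming the repository ships its ChatML chat template in the tokenizer config:

```python
# Inspect the union tokenizer and its ChatML template.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("redrix/patricide-12B-Unslop-Mell-v2")
print(len(tok))  # union tokenizer: vocabulary drawn from both parent models

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # ChatML markers such as <|im_start|> / <|im_end|> should appear
```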
Frequently Asked Questions
Q: What makes this model unique?
Its distinguishing feature is the layer-wise weight schedule of the NuSLERP merge, which trades UnslopNemo's anti-GPT characteristics against Mag-Mell's intelligence on a per-layer basis rather than with a single global ratio.
Q: What are the recommended use cases?
The model is best suited to applications using ChatML-formatted interaction with context lengths up to about 20k tokens. It is reported to work well with temperature 1.0 applied last and Min-P 0.1; a transformers sketch using these settings follows.
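As an illustration of those recommendations, a hedged transformers sketch is below. It assumes a recent transformers release (for min_p support) and hardware that can hold a 12B model in bfloat16.

```python
# Full-precision generation with the recommended samplers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "redrix/patricide-12B-Unslop-Mell-v2"
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Write a short scene set in a rainy city."}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(
    inputs,
    do_sample=True,
    temperature=1.0,   # identity at 1.0, matching the temperature-last advice
    min_p=0.1,         # min-p sampling (recent transformers releases)
    max_new_tokens=300,
)
print(tok.decode(out[0, inputs.shape[-1]:], skip_special_tokens=True))
```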