# Control-Nanuq-8B
| Property | Value |
|---|---|
| Base Model | LLaMA 3.1 8B Supernova |
| Training Infrastructure | 4x RTX 3090s, T4, H100 |
| Model Format | GGUF, EXL2 |
| Author | Delta-Vector |
## What is Control-Nanuq-8B?
Control-Nanuq-8B is a specialized language model fine-tuned from LLaMA 3.1 8B Supernova and designed to deliver concise, creative responses. Named after Nanuqsaurus, an Arctic-dwelling tyrannosaurid, the model incorporates DPO (Direct Preference Optimization) and KTO (Kahneman-Tversky Optimization) to enhance its coherence and creative capabilities.
## Implementation Details
The model underwent four epochs of fine-tuning on OpenCAI and roleplay (RP) logs, followed by DPO enhancement and KTO reinforcement learning. Training used several GPU configurations: 4x RTX 3090s for the initial fine-tune, a T4 for DPO, and an H100 for KTO.
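DPO optimizes the policy directly on preference pairs, with no separate reward model. A minimal sketch of the DPO loss on a single chosen/rejected pair follows; the log-probabilities, `beta`, and the function name are illustrative assumptions, not values or code from this model's training run:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Each argument is the summed log-probability of a full response
    under either the trained policy or the frozen reference model.
    """
    # Log-ratios of policy vs. reference for each response
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    # Loss = -log(sigmoid(beta * (chosen_ratio - rejected_ratio)))
    margin = beta * (chosen_ratio - rejected_ratio)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A pair the policy already ranks correctly (chosen gained probability
# relative to the reference, rejected lost it) gives a loss below
# log(2), the value at zero margin.
loss = dpo_loss(-10.0, -14.0, -11.0, -13.0)
```

Minimizing this pushes the policy to assign relatively more probability to preferred responses than the reference model does, which is how DPO sharpens coherence without RLHF-style reward modeling.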
- Supports LLaMA-Instruct and ChatML formatting
- Implements advanced system prompting options (Euryale and EVA)
- Available in both GGUF and EXL2 formats
- Optimized for narrative and roleplay scenarios
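The ChatML layout mentioned above can be built with plain string formatting. A minimal sketch, in which the system prompt text is a placeholder rather than the Euryale or EVA prompt:

```python
def format_chatml(system, turns):
    """Render a conversation in ChatML: each message is wrapped in
    <|im_start|>role ... <|im_end|> markers, and the prompt ends with
    an opened assistant turn for the model to complete."""
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    for role, text in turns:
        parts.append(f"<|im_start|>{role}\n{text}<|im_end|>")
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = format_chatml(
    "You are a concise, creative narrator.",  # placeholder system prompt
    [("user", "Describe a polar dawn in two sentences.")],
)
```

Most inference frontends (and chat templates shipped with the model files) produce this layout for you; the sketch only shows what the tokens look like on the wire.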
## Core Capabilities
- Concise and focused response generation
- Enhanced creative writing and storytelling
- Flexible prompt formatting support
- Improved coherence through DPO optimization
- Enhanced prose quality via KTO learning
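The KTO stage referenced above trains on unpaired examples labeled only as desirable or undesirable. A minimal sketch of the per-example loss, following the general form of the KTO objective; the hyperparameters, the KL reference point, and the function name are illustrative assumptions, not this model's exact recipe:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def kto_loss(policy_logp, ref_logp, desirable, kl_ref,
             beta=0.1, lambda_d=1.0, lambda_u=1.0):
    """Kahneman-Tversky Optimization loss for one unpaired example.

    Unlike DPO, KTO needs only a binary desirable/undesirable label
    per example, not a preference pair. `kl_ref` is an estimate of
    the policy-reference KL divergence used as the reference point.
    """
    reward = beta * (policy_logp - ref_logp)  # implied reward
    if desirable:
        # Loss shrinks as the reward rises above the reference point
        return lambda_d * (1.0 - sigmoid(reward - kl_ref))
    # Undesirable examples: loss shrinks as the reward falls below it
    return lambda_u * (1.0 - sigmoid(kl_ref - reward))
```

Because the labels are unpaired, KTO can reuse ordinary "good prose" / "bad prose" examples directly, which is why it is a natural final pass for prose quality.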
## Frequently Asked Questions
Q: What makes this model unique?
Control-Nanuq-8B stands out for its "short and sweet" approach to responses, combining multiple optimization techniques (DPO and KTO) to deliver high-quality, concise outputs while maintaining creative capability.
Q: What are the recommended use cases?
The model is particularly well-suited for roleplay scenarios, creative writing, and applications requiring concise yet engaging responses. It excels in narrative-driven interactions while keeping outputs short.