TQ2.5-14B-Sugarquill-v1
| Property | Value |
|---|---|
| Parameter Count | 14.8B |
| Model Type | Text Generation |
| License | Apache-2.0 |
| Base Model | arcee-ai/SuperNova-Medius |
| Training Data | Erebus-87k, r_shortstories_24k |
What is TQ2.5-14B-Sugarquill-v1?
TQ2.5-14B-Sugarquill-v1 is a language model designed for creative writing and storytelling. Built on the SuperNova-Medius base model, it has been fine-tuned on a curated dataset of short stories, making it particularly adept at generating engaging prose over extended contexts.
Implementation Details
The model was trained for 2 epochs on approximately 18.7M tokens using rsLoRA on a workstation with five 3090 Ti GPUs. Training ran in BF16 precision with the paged_ademamix_8bit optimizer, and punctuation and whitespace in the data were normalized. The model uses the ChatML instruction format for interaction. Key configuration details are listed below, followed by a minimal inference sketch:
- Sampling configuration: Temperature 0.8, Min-P 0.05, and Top-A 0.3
- 8192 token context window
- Implements flash attention and gradient checkpointing
- Uses Axolotl training framework with Liger optimization kernels
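
The sketch below shows one way to load the model and prompt it in chat mode with Hugging Face transformers. It is a minimal example under a few assumptions: the repository path is a placeholder for wherever the model is actually hosted, the ChatML chat template is assumed to be bundled with the tokenizer, Min-P sampling requires a recent transformers release, and Top-A is not part of the stock transformers samplers, so it is only noted in a comment.

```python
# Minimal inference sketch with Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TQ2.5-14B-Sugarquill-v1"  # placeholder -- substitute the actual repo path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card specifies BF16 precision
    device_map="auto",
)

# ChatML-style conversation; assuming the tokenizer ships a ChatML chat template,
# apply_chat_template wraps the messages in <|im_start|>/<|im_end|> markers.
messages = [
    {"role": "system", "content": "You are a skilled creative writer."},
    {"role": "user", "content": "Write the opening paragraph of a quiet seaside mystery."},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.8,  # sampling settings from the card
    min_p=0.05,       # requires a recent transformers release
    # Top-A 0.3 is not available in stock transformers; use a backend that
    # supports it (or omit it) when running through this API.
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```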
Core Capabilities
- High-quality prose generation and storytelling
- Strong instruction following abilities
- Support for both roleplay and story writing
- Extended context handling for longer narratives
- Flexible deployment through chat mode or raw completion (see the raw-completion sketch after this list)
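
To illustrate the raw-completion path, the sketch below continues from the loading example in Implementation Details (it reuses the `model` and `tokenizer` objects and the same sampling settings). Here the story text itself is the prompt, with no ChatML wrapper, and the model simply continues it.

```python
# Raw-completion sketch: prompt with bare prose instead of the ChatML template.
story_so_far = (
    "The lighthouse keeper had not spoken to anyone in eleven days, "
    "and the sea had begun to answer for them."
)

inputs = tokenizer(story_so_far, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=400,
    do_sample=True,
    temperature=0.8,
    min_p=0.05,
)
# The decoded output includes the original prompt followed by the continuation.
print(tokenizer.decode(output[0], skip_special_tokens=True))
```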
Frequently Asked Questions
Q: What makes this model unique?
A: The model combines the prose capabilities of SuperNova-Medius with enhanced story-writing abilities, offering a larger context window and improved punctuation handling. It's specifically optimized for creative writing while maintaining strong instruction-following capabilities.
Q: What are the recommended use cases?
A: The model excels in creative writing scenarios, including story generation, roleplay interactions, and collaborative writing. It can be used in chat mode for interactive writing or in raw completion mode for direct story generation.