Rogue-Rose-103b-v0.2-AWQ
| Property | Value |
|---|---|
| Parameter Count | 103B |
| Context Length | 4096 tokens |
| Quantization | 4-bit AWQ |
| License | Llama 2 |
| Model Size | 54.40 GB |
What is Rogue-Rose-103b-v0.2-AWQ?
Rogue-Rose-103b-v0.2-AWQ is a quantized version of Sophosympatheia's frankenmerge, which stacks layers from two custom 70B models into a single 120-layer, 103B-parameter model. This AWQ-quantized release preserves the original model's capabilities while substantially reducing its disk and memory footprint, and it is optimized specifically for roleplay and storytelling applications.
Implementation Details
The model uses AWQ (Activation-aware Weight Quantization) at 4-bit precision with a group size of 128, calibrated on the VMware Open Instruct dataset, and supports a 4096-token context window. It is compatible with common inference frameworks, including text-generation-webui, vLLM, and Hugging Face Transformers; a loading sketch follows the feature list below.
- 4-bit AWQ quantization for efficient memory usage
- Optimized for a 4096-token context length
- Compatible with major inference frameworks
- 54.40 GB model size
- Supports Min-P sampling
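
As a minimal sketch, here is one way to serve the quant with vLLM. The repo ID `TheBloke/Rogue-Rose-103b-v0.2-AWQ` and the two-GPU tensor-parallel setting are assumptions to adjust for your setup, and the `min_p` parameter requires a reasonably recent vLLM release.

```python
from vllm import LLM, SamplingParams

# Load the AWQ checkpoint; the repo ID below is an assumption --
# point it at wherever the weights actually live.
llm = LLM(
    model="TheBloke/Rogue-Rose-103b-v0.2-AWQ",
    quantization="awq",
    max_model_len=4096,      # matches the model's 4096-token context window
    tensor_parallel_size=2,  # assumption: a ~54 GB model usually needs multiple GPUs
)

params = SamplingParams(temperature=1.0, min_p=0.05, max_tokens=256)
result = llm.generate(["USER: Tell me a short story.\nASSISTANT:"], params)
print(result[0].outputs[0].text)
```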
Core Capabilities
- Advanced roleplay and storytelling performance
- High-quality narrative generation
- Effective response to detailed prompting
- Strong context maintenance
- Uncensored output generation
- Performs well at high temperature settings when paired with Min-P sampling (see the sketch below)
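
To illustrate the high-temperature/Min-P pairing, here is a hedged sketch using Hugging Face Transformers, which exposes a `min_p` generation parameter in recent releases and can load AWQ checkpoints when the autoawq package is installed. The repo ID and the sampling values are illustrative assumptions, not tuned recommendations.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Rogue-Rose-103b-v0.2-AWQ"  # assumed repo ID
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "USER: Continue the duel scene on the castle ramparts.\nASSISTANT:"
inputs = tok(prompt, return_tensors="pt").to(model.device)

out = model.generate(
    **inputs,
    do_sample=True,
    temperature=1.5,   # higher than usual; Min-P keeps sampling coherent
    min_p=0.05,        # drop tokens below 5% of the top token's probability
    max_new_tokens=300,
)
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```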
Frequently Asked Questions
Q: What makes this model unique?
The model combines a 103B-parameter architecture with AWQ quantization, making it more accessible to run while preserving strong performance in roleplay and storytelling tasks. Its 120-layer architecture and specialized training make it particularly effective at sustaining narrative consistency and character personalities.
Q: What are the recommended use cases?
The model excels in creative writing, roleplay scenarios, and storytelling applications, and is particularly well suited to use cases that require detailed character interaction and consistent narrative development. It performs best with the Vicuna instruction format (sketched below) and responds well to detailed prompting.
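
For reference, a minimal sketch of a Vicuna-style prompt: the system preamble is the stock Vicuna one, and the user turn is placeholder text for illustration.

```python
# Vicuna-format prompt template; generation should continue after "ASSISTANT:".
prompt = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the "
    "user's questions.\n"
    "USER: Describe the tavern where the two rivals first meet.\n"
    "ASSISTANT:"
)
```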