# L3-8B-Stheno-v3.2-GGUF-IQ-Imatrix
| Property | Value |
|---|---|
| Parameter Count | 8.03B |
| License | CC-BY-NC-4.0 |
| Architecture | LLaMA3-based |
| Quantization | GGUF-IQ-Imatrix |
## What is L3-8B-Stheno-v3.2-GGUF-IQ-Imatrix?
L3-8B-Stheno-v3.2-GGUF-IQ-Imatrix is a quantized release of the Stheno-v3.2 model, built on the LLaMA 3 architecture. Version 3.2 evolves from v3.1 by incorporating both SFW and NSFW storywriting data while maintaining strong performance in roleplay and conversational tasks.
## Implementation Details
The model is distributed as GGUF-IQ-Imatrix quantizations; the Q4_K_M-imat variant, at 4.89 bits per weight (BPW), is sized to fit 8 GB VRAM GPUs. Training used refined hyperparameters and carefully curated data from multiple sources:
- Integrated mixture of SFW/NSFW storywriting data from Gryphe's Opus-WritingPrompts
- Enhanced instruction/assistant-style data integration
- Improved roleplaying sample quality through manual filtering
- Optimized hyperparameters for reduced loss levels
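As a rough sanity check on the 8 GB VRAM claim, the quantized weight size can be estimated from the parameter count and BPW figure above (a back-of-envelope sketch only; an actual GGUF file also carries metadata, and the runtime needs additional memory for the KV cache and activations):

```python
# Back-of-envelope size estimate for the Q4_K_M-imat quant.
params = 8.03e9  # parameter count (8.03B)
bpw = 4.89       # bits per weight for Q4_K_M-imat

size_bytes = params * bpw / 8        # bits -> bytes
size_gib = size_bytes / (1024 ** 3)  # bytes -> GiB

# Roughly 4.6 GiB of weights, leaving headroom for context on an 8 GB card.
print(f"~{size_gib:.2f} GiB of weights")
```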
## Core Capabilities
- Balanced handling of SFW/NSFW content
- Enhanced storywriting and narration abilities
- Improved multi-turn coherency
- Better prompt and instruction adherence
- Support for context sizes up to 12,288 tokens
## Frequently Asked Questions
**Q: What makes this model unique?**

A: This model stands out for its balanced approach to content generation, improved multi-turn coherency, and optimized performance on 8 GB VRAM GPUs. The GGUF-IQ-Imatrix quantization makes it particularly efficient in resource-constrained environments.
**Q: What are the recommended use cases?**

A: The model excels at roleplay, creative writing, and conversational tasks. It is particularly well suited to SillyTavern setups and general narrative generation, with a recommended temperature between 1.12 and 1.22 and Min-P of 0.075.
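The Min-P setting can be read as a probability floor relative to the most likely token: candidates whose probability falls below `min_p` times the top probability are discarded before sampling. A minimal sketch of that filtering step (illustrative only; backends such as llama.cpp and SillyTavern apply this internally, and the example distribution is made up):

```python
def min_p_filter(probs, min_p=0.075):
    """Keep tokens whose probability is at least min_p times the top probability."""
    threshold = min_p * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= threshold}
    # Renormalize the surviving probabilities so they sum to 1.
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}

# Hypothetical distribution over four candidate tokens.
probs = {"the": 0.60, "a": 0.25, "an": 0.10, "zzz": 0.04}
filtered = min_p_filter(probs)
# "zzz" is dropped: 0.04 < 0.075 * 0.60 = 0.045; the rest are renormalized.
```

A lower Min-P keeps more low-probability tokens (more variety); a higher value trims the tail more aggressively, which pairs well with the relatively high temperatures recommended above.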