L3-8B-Stheno-v3.1-GGUF-IQ-Imatrix

Lewdiculous

8B parameter LLaMA-3-based roleplay model, optimized for 1-on-1 interactions. Features GGUF quantization and IMatrix compression. Strong personality handling and NSFW capable.

Property	Value
Parameter Count	8.03B
Base Model	LLaMA-3
License	CC-BY-NC-4.0
Language	English
Quantization	GGUF with IMatrix Compression

What is L3-8B-Stheno-v3.1-GGUF-IQ-Imatrix?

L3-8B-Stheno-v3.1 is a specialized roleplay AI model built on the LLaMA-3 architecture, featuring GGUF quantization and IMatrix compression for optimal performance. Developed by Sao10K and quantized by Lewdiculous, this model excels in one-on-one roleplay scenarios while maintaining capabilities for scenario management and story writing.

Implementation Details

The model implements a sophisticated quantization process using GGUF-IQ-Imatrix, optimized after the fixes from llama.cpp #6920. It supports context sizes up to 12288 tokens and is specifically designed for 8GB VRAM GPUs using the Q4_K_M-imat quantization at 4.89 BPW.

Built using Claude-3-Opus generated outputs and human-curated data
Implements Llama-3-Instruct prompting template
Features advanced character personality handling
Optimized for KoboldCpp deployment

Core Capabilities

Specialized in 1-on-1 roleplay interactions
Strong character personality maintenance
Scenario and RPG management support
Unique response generation with high variance
Context-aware storytelling
NSFW capability with appropriate prompting

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its specialized roleplay capabilities, achieving over 1200 Elo on Chaiverse. It provides consistent character personalities and unique responses when regenerating answers, making it ideal for immersive roleplay experiences.

Q: What are the recommended use cases?

The model excels in one-on-one roleplay scenarios, character-driven interactions, and storytelling. It performs best with some token context in character cards and can handle both casual conversations and complex narrative scenarios. For optimal results, use with recommended samplers (Temperature 1.12-1.32, Min-P 0.075, Top-K 40).