L3-8B-Stheno-v3.1-GGUF-IQ-Imatrix

Lewdiculous

8B-parameter LLaMA-3-based roleplay model, optimized for 1-on-1 interactions. Distributed as GGUF quantizations produced with an importance matrix (imatrix). Strong personality handling and NSFW capable.

Property          Value
Parameter Count   8.03B
Base Model        LLaMA-3
License           CC-BY-NC-4.0
Language          English
Quantization      GGUF with IMatrix Compression

What is L3-8B-Stheno-v3.1-GGUF-IQ-Imatrix?

L3-8B-Stheno-v3.1 is a roleplay-focused model built on the LLaMA-3 architecture. This release packages it as GGUF quantizations produced with an importance matrix (imatrix), which reduces the memory needed to run the model locally. The original model was developed by Sao10K and the quantizations were made by Lewdiculous. It is tuned for one-on-one roleplay and can also handle scenario management and story writing.

Implementation Details

The quantizations use the GGUF-IQ-Imatrix process and were produced after the fixes from llama.cpp #6920. For GPUs with 8GB of VRAM, the Q4_K_M-imat quantization (4.89 BPW, bits per weight) is the intended choice, with context sizes of up to 12288 tokens.
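To see why a 4.89 BPW quantization suits an 8GB GPU, the weight size can be estimated from the parameter count in the property table. The following is a rough back-of-the-envelope sketch in Python, not a measurement of the actual file: it assumes every parameter is stored at the quoted average bit width and ignores file metadata. The function name `estimate_weight_bytes` is made up for this illustration.

```python
# Rough size estimate for the quantized weights.
# Assumption: all parameters are stored at the quoted average bits per weight.
PARAM_COUNT = 8.03e9     # parameter count from the property table
BITS_PER_WEIGHT = 4.89   # BPW quoted for Q4_K_M-imat


def estimate_weight_bytes(param_count: float, bits_per_weight: float) -> float:
    """Return the approximate weight size in bytes (8 bits per byte)."""
    return param_count * bits_per_weight / 8


weight_bytes = estimate_weight_bytes(PARAM_COUNT, BITS_PER_WEIGHT)
weight_gb = weight_bytes / 1e9       # decimal gigabytes
weight_gib = weight_bytes / 2**30    # binary gibibytes

print(f"{weight_gb:.2f} GB ({weight_gib:.2f} GiB)")  # prints "4.91 GB (4.57 GiB)"
```

Under these assumptions the weights take roughly 4.9 GB, which leaves the rest of an 8GB card for the context cache and runtime buffers. That remaining memory, which grows with context length, is what bounds the usable context size.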

  • Built using Claude-3-Opus generated outputs and human-curated data
  • Implements Llama-3-Instruct prompting template
  • Features advanced character personality handling
  • Optimized for KoboldCpp deployment

Core Capabilities

  • Specialized in 1-on-1 roleplay interactions
  • Strong character personality maintenance
  • Scenario and RPG management support
  • Unique response generation with high variance
  • Context-aware storytelling
  • NSFW capability with appropriate prompting

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its roleplay focus, achieving over 1200 Elo on Chaiverse. It keeps character personalities consistent and produces varied responses when an answer is regenerated, which suits immersive roleplay.

Q: What are the recommended use cases?

The model excels in one-on-one roleplay scenarios, character-driven interactions, and storytelling. It performs best when the character card provides some context to work from, and it can handle both casual conversations and complex narrative scenarios. For best results, use the recommended samplers: Temperature 1.12-1.32, Min-P 0.075, Top-K 40.
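The recommended sampler values can be collected into a request body for a local backend. The Python sketch below only builds and serializes the settings and does not contact any server. The field names (`temperature`, `min_p`, `top_k`, `max_context_length`, `max_length`) are assumed to match KoboldCpp's text generation endpoint and should be checked against the API documentation of the version in use. The temperature default is simply the midpoint of the recommended range, and `max_length` is an arbitrary illustrative value.

```python
import json

# Recommended sampler values from the model card.
TEMPERATURE_RANGE = (1.12, 1.32)
MIN_P = 0.075
TOP_K = 40
MAX_CONTEXT = 12288  # largest context size mentioned for 8GB VRAM GPUs


def build_generate_payload(prompt: str, temperature: float = 1.22,
                           max_length: int = 250) -> dict:
    """Return a settings dict; field names are assumptions (see lead-in)."""
    low, high = TEMPERATURE_RANGE
    if not low <= temperature <= high:
        raise ValueError(f"temperature {temperature} is outside {low}-{high}")
    return {
        "prompt": prompt,
        "temperature": temperature,
        "min_p": MIN_P,
        "top_k": TOP_K,
        "max_context_length": MAX_CONTEXT,
        "max_length": max_length,  # illustrative reply length in tokens
    }


payload = build_generate_payload("Hello there.")
body = json.dumps(payload, indent=2)
print(body)
```

Rejecting temperatures outside the recommended range is a choice made for this sketch, not a requirement of the model. Values outside the range are valid but fall outside what the card recommends.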
