L3-8B-Stheno-v3.1-GGUF-IQ-Imatrix

Maintained By
Lewdiculous

L3-8B-Stheno-v3.1-GGUF-IQ-Imatrix

PropertyValue
Parameter Count8.03B
Base ModelLLaMA-3
LicenseCC-BY-NC-4.0
LanguageEnglish
QuantizationGGUF with IMatrix Compression

What is L3-8B-Stheno-v3.1-GGUF-IQ-Imatrix?

L3-8B-Stheno-v3.1 is a specialized roleplay AI model built on the LLaMA-3 architecture, featuring GGUF quantization and IMatrix compression for optimal performance. Developed by Sao10K and quantized by Lewdiculous, this model excels in one-on-one roleplay scenarios while maintaining capabilities for scenario management and story writing.

Implementation Details

The model implements a sophisticated quantization process using GGUF-IQ-Imatrix, optimized after the fixes from llama.cpp #6920. It supports context sizes up to 12288 tokens and is specifically designed for 8GB VRAM GPUs using the Q4_K_M-imat quantization at 4.89 BPW.

  • Built using Claude-3-Opus generated outputs and human-curated data
  • Implements Llama-3-Instruct prompting template
  • Features advanced character personality handling
  • Optimized for KoboldCpp deployment

Core Capabilities

  • Specialized in 1-on-1 roleplay interactions
  • Strong character personality maintenance
  • Scenario and RPG management support
  • Unique response generation with high variance
  • Context-aware storytelling
  • NSFW capability with appropriate prompting

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its specialized roleplay capabilities, achieving over 1200 Elo on Chaiverse. It provides consistent character personalities and unique responses when regenerating answers, making it ideal for immersive roleplay experiences.

Q: What are the recommended use cases?

The model excels in one-on-one roleplay scenarios, character-driven interactions, and storytelling. It performs best with some token context in character cards and can handle both casual conversations and complex narrative scenarios. For optimal results, use with recommended samplers (Temperature 1.12-1.32, Min-P 0.075, Top-K 40).

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.