Llama-3-Lumimaid-8B-v0.1-OAS-GGUF-IQ-Imatrix

Property	Value
Parameter Count	8.03B
License	CC-BY-NC-4.0
Format	GGUF with IQ-Imatrix Quantization
Authors	Undi, IkariDev

What is Llama-3-Lumimaid-8B-v0.1-OAS-GGUF-IQ-Imatrix?

This is a specialized variant of the Llama-3 architecture, optimized for roleplay applications with Orthogonal Activation Steering (OAS) treatment. The model features a carefully balanced training composition of 40% non-roleplay and 60% roleplay/ERP data, making it versatile while maintaining specific roleplay capabilities.

Implementation Details

The model implements GGUF quantization with IQ-Imatrix optimization, specifically designed for efficient deployment on systems with 8GB VRAM. It utilizes the standard Llama-3 prompting format and incorporates multiple high-quality training datasets including Aesir, NoRobots, LimaRP, and specialized Luminae datasets.

Updated V2 implementation with imatrix data from FP16 and BF16 conversions
Requires KoboldCpp version 1.64 or higher
Supports context sizes up to 12288 with Q4_K_M-imat quantization

Core Capabilities

Enhanced response compliance through OAS treatment
Balanced roleplay and general conversation abilities
Efficient memory usage with optimized quantization
Comprehensive training on diverse datasets

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its Orthogonal Activation Steering treatment, which significantly reduces request refusal while maintaining coherent responses. Additionally, its balanced training data composition ensures versatility in both roleplay and general interactions.

Q: What are the recommended use cases?

This model is primarily designed for roleplay applications in platforms like SillyTavern, with specific optimization for 8GB VRAM systems. It's particularly suitable for users seeking a balance between roleplay capabilities and general conversation ability.