Pygmalion-13B-4bit-128g

Maintained by: notstoic

Property        Value
Base Model      Pygmalion-13B
Quantization    4-bit GPTQ
Group Size      128
Format          SafeTensors
Source          Hugging Face

What is pygmalion-13b-4bit-128g?

Pygmalion-13B-4bit-128g is a quantized version of the original Pygmalion-13B model, optimized for efficient deployment while preserving performance. It uses 4-bit precision with a quantization group size of 128, produced with the GPTQ CUDA implementation, which substantially reduces the model's memory footprint while retaining its core capabilities. Note that the model is known for adult (NSFW) content generation and ships with explicit content warnings and age restrictions.
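
As a rough, back-of-the-envelope illustration of that footprint reduction (my figures, not from the model card): the weights alone drop from about 26 GB at 16-bit precision to roughly 6.5 GB at 4-bit, before the small overhead of per-group scales.

```python
# Back-of-the-envelope weight-storage estimate (illustrative only;
# excludes activations, the KV cache, and the small per-group
# scale/zero-point overhead introduced by the 128 group size).
params = 13e9  # 13 billion parameters

fp16_gb = params * 16 / 8 / 1e9  # 16 bits per weight -> ~26 GB
int4_gb = params * 4 / 8 / 1e9   # 4 bits per weight  -> ~6.5 GB

print(f"fp16 weights : ~{fp16_gb:.1f} GB")
print(f"4-bit weights: ~{int4_gb:.1f} GB")
```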

Implementation Details

The model was quantized with the GPTQ-for-LLaMa framework using true-sequential processing and a group size of 128. The weights are stored in the safetensors format, which offers security and loading-efficiency benefits over pickle-based checkpoints.

  • GPTQ CUDA quantization implementation
  • True-sequential processing enabled
  • Group size of 128
  • SafeTensors weight storage (see the loading sketch below)
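
For reference, a minimal loading sketch follows. The choice of the AutoGPTQ library and the exact repository id are my assumptions (the card itself only names GPTQ-for-LLaMa); any GPTQ-compatible loader with CUDA kernels should work similarly.

```python
# Minimal loading sketch, assuming the AutoGPTQ library and the repo id below.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_id = "notstoic/pygmalion-13b-4bit-128g"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoGPTQForCausalLM.from_quantized(
    model_id,
    use_safetensors=True,  # weights are shipped as safetensors
    device="cuda:0",       # GPTQ CUDA kernels require a GPU
)
```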

Core Capabilities

  • Efficient memory usage through 4-bit quantization
  • Adult (NSFW) content generation capabilities
  • Retains the base model's generation quality at a much smaller size
  • Compatible with standard LLaMA-based inference frameworks (see the generation sketch below)
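
Continuing the loading sketch above, text generation goes through the standard transformers interface. The prompt below is purely illustrative; Pygmalion models typically expect a persona/dialogue-style prompt, and the exact template is not specified here.

```python
# Generation sketch (continues from the loading example; the prompt
# format is illustrative, not an official Pygmalion template).
prompt = "Character's Persona: A witty tavern keeper.\nYou: Hello there!\nCharacter:"

inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output_ids = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```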

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its efficient 4-bit quantization, which retains the capabilities of the full Pygmalion-13B model. The group size of 128 strikes a balance between compression and accuracy: each group of 128 weights shares its own quantization scale, so smaller groups track local weight ranges more closely at the cost of storing more scales.
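
To make that trade-off concrete, here is a simplified sketch of group-wise 4-bit quantization (my illustration; real GPTQ additionally uses calibration data and error-compensating weight updates rather than plain rounding):

```python
import torch

def quantize_groupwise(w: torch.Tensor, group_size: int = 128, bits: int = 4):
    """Symmetric round-to-nearest quantization with one scale per group.

    Simplified for illustration; GPTQ proper also applies calibration-
    driven, error-compensating updates to the remaining weights.
    """
    qmax = 2 ** bits - 1                                # 15 for 4-bit
    groups = w.reshape(-1, group_size)                  # one row per group
    scale = groups.abs().amax(dim=1, keepdim=True) / (qmax / 2)
    q = torch.clamp(torch.round(groups / scale) + qmax // 2, 0, qmax)
    w_hat = (q - qmax // 2) * scale                     # dequantized weights
    return q.to(torch.uint8), scale, w_hat.reshape(w.shape)

w = torch.randn(4096, 4096)
_, _, w_hat = quantize_groupwise(w, group_size=128)
print(f"mean |error|: {(w - w_hat).abs().mean():.4f}")  # small but nonzero
```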

Q: What are the recommended use cases?

The model is designed for adult content generation and should only be used by adults in appropriate contexts. It is particularly well suited to deployments where memory is constrained but generation quality must be maintained, such as running a 13B model on a single consumer GPU.
