Pygmalion-13B-SuperHOT-8K-GPTQ

Maintained By
TheBloke

Property            Value
Base Model Size     13B parameters
Quantization        4-bit GPTQ
Context Length      8192 tokens
Model Hub           Hugging Face
Author              TheBloke

What is Pygmalion-13B-SuperHOT-8K-GPTQ?

This model combines PygmalionAI's dialogue-focused model with SuperHOT's extended-context technique. It is a 4-bit GPTQ quantization, which lowers memory requirements compared with the full-precision weights, and it supports a context window of up to 8K (8192) tokens.

Implementation Details

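The model uses GPTQ quantization with a group size of 128, a setting that spends a small amount of extra storage on per-group parameters in exchange for better accuracy. It is intended for the ExLlama and AutoGPTQ backends. The context length is selected through the compress_pos_emb setting: a sequence length of 4096 tokens pairs with compress_pos_emb 2, and 8192 tokens pairs with compress_pos_emb 4.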

  • 4-bit quantization with 128 group size
  • Supports up to 8K context length
  • Optimized for dialogue generation
  • Compatible with text-generation-webui
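
The link between the target context length and compress_pos_emb can be sketched as a simple ratio. This is a minimal illustration, assuming the un-extended base model has a 2048-token context (an assumption not stated on this page); the helper name compress_pos_emb_for is hypothetical and is not part of any backend's API.

```python
# Assumption: the base model, before SuperHOT extension, uses a 2048-token context.
BASE_CONTEXT = 2048


def compress_pos_emb_for(max_seq_len: int, base_context: int = BASE_CONTEXT) -> int:
    """Return the positional-embedding compression factor for a target context length.

    Hypothetical helper for illustration only.
    """
    if max_seq_len <= 0 or max_seq_len % base_context != 0:
        raise ValueError("max_seq_len must be a positive multiple of the base context")
    return max_seq_len // base_context


if __name__ == "__main__":
    for length in (4096, 8192):
        print(f"max_seq_len={length} -> compress_pos_emb={compress_pos_emb_for(length)}")
```

In a front end such as text-generation-webui, the two values would be entered together as the sequence length and compression settings of the loader.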

Core Capabilities

  • Enhanced conversational abilities from Pygmalion base model
  • Extended context handling up to 8K tokens
  • Efficient memory usage through 4-bit quantization
  • Persona-based dialogue generation
  • GPU inference through the ExLlama and AutoGPTQ backends (GPTQ files primarily target GPUs)
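
To give a rough sense of the memory saving from 4-bit quantization, the following sketch estimates the size of the quantized weights alone. It assumes a nominal 13 billion parameters and, for each group of 128 weights, one 16-bit scale and one 4-bit zero point. That overhead layout is an assumption for illustration, and the function name is hypothetical. The estimate excludes any unquantized layers, the KV cache, and activations, so real memory use will be higher.

```python
def estimate_weight_gib(
    n_params: float,
    bits: int = 4,
    group_size: int = 128,
    scale_bits: int = 16,  # assumed: one fp16 scale per group
    zero_bits: int = 4,    # assumed: one 4-bit zero point per group
) -> float:
    """Approximate size of the quantized weights in GiB (weights only)."""
    overhead_bits_per_weight = (scale_bits + zero_bits) / group_size
    total_bits = n_params * (bits + overhead_bits_per_weight)
    return total_bits / 8 / 2**30


def fp16_weight_gib(n_params: float) -> float:
    """Size of the same weights stored as 16-bit floats, for comparison."""
    return n_params * 16 / 8 / 2**30


if __name__ == "__main__":
    n = 13e9  # nominal parameter count
    print(f"4-bit GPTQ (g128): ~{estimate_weight_gib(n):.1f} GiB")
    print(f"fp16 baseline:     ~{fp16_weight_gib(n):.1f} GiB")
```

Under these assumptions the quantized weights come to roughly a quarter of the fp16 size, which is the main reason a 13B model of this kind can fit on a single consumer GPU.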

Frequently Asked Questions

Q: What makes this model unique?

This model combines Pygmalion's conversational tuning with SuperHOT's extended context handling, and 4-bit quantization keeps its memory footprint small. The 8K context window lets more of the conversation history stay in the prompt, which helps longer conversations remain coherent.

Q: What are the recommended use cases?

The model is designed for fictional conversation and entertainment. It is suited to character-based dialogue generation and can keep track of longer conversations thanks to its extended context window. It is not fine-tuned for factual accuracy or for safety-critical applications.
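
As an illustration of character-based dialogue, the sketch below assembles a persona prompt. The layout (a persona line, a <START> marker, the dialogue history, then the character's name as the cue for the next reply) follows the format commonly documented for Pygmalion models. It is an assumption here, since this page does not specify a prompt format, and the function name and example character are hypothetical.

```python
from typing import List


def build_persona_prompt(
    character: str,
    persona: str,
    history: List[str],
    user_message: str,
) -> str:
    """Assemble a persona-style prompt (assumed Pygmalion-style layout)."""
    lines = [f"{character}'s Persona: {persona}", "<START>"]
    lines.extend(history)                 # earlier turns, oldest first
    lines.append(f"You: {user_message}")  # the new user turn
    lines.append(f"{character}:")         # cue the model to answer in character
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_persona_prompt(
        character="Aria",
        persona="A cheerful ship engineer who loves puzzles.",
        history=["You: Hello!", "Aria: Hi there, ready to fix some engines?"],
        user_message="What are you working on today?",
    )
    print(prompt)
```

With the 8K context window, the history list can hold many more turns before older ones need to be dropped.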
