llama2_70b_chat_uncensored-GPTQ

Maintained by: TheBloke

Llama2 70B Chat Uncensored GPTQ

Property          Value
Parameter Count   70B
License           LLaMA 2
Paper             QLoRA (arXiv:2305.14314)
Base Model        LLaMA 2 70B

What is llama2_70b_chat_uncensored-GPTQ?

This is a GPTQ-quantized version of the Llama2 70B Chat Uncensored model, intended to cut memory requirements at inference time while keeping output quality close to that of the unquantized model. The underlying model was fine-tuned on the uncensored Wizard-Vicuna conversation dataset so that it gives direct, unfiltered responses, with the aim of remaining factually accurate.

Implementation Details

The model is offered in several quantization variants, including 3-bit and 4-bit versions with various group sizes, so users can trade VRAM usage against model accuracy. Different branches target specific deployment scenarios, from minimal VRAM requirements to maximum inference quality.

  • Multiple GPTQ parameter permutations available (3-bit and 4-bit)
  • Group size options ranging from None (no grouping) to 128g
  • Compatible with AutoGPTQ and Transformers; ExLlama is supported for the 4-bit versions
  • Customizable inference parameters for temperature and sampling
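As a rough illustration of the points above, the sketch below selects a branch and passes sampling parameters when loading through Transformers. Several things here are assumptions rather than facts from this page: the repository id is inferred from the model name and maintainer, the default branch name is a placeholder, the temperature, top_p and max_new_tokens values are arbitrary, and loading GPTQ weights this way is assumed to require the optimum and auto-gptq packages alongside transformers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadConfig:
    # Assumption: repository id inferred from the model name and maintainer.
    repo_id: str = "TheBloke/llama2_70b_chat_uncensored-GPTQ"
    # Placeholder branch; check the repository for the real branch names.
    revision: str = "main"


def from_pretrained_kwargs(cfg: LoadConfig) -> dict:
    """Keyword arguments for AutoModelForCausalLM.from_pretrained."""
    # revision selects the branch; device_map="auto" spreads layers over GPUs.
    return {"revision": cfg.revision, "device_map": "auto"}


def sampling_kwargs(temperature: float = 0.7, top_p: float = 0.95,
                    max_new_tokens: int = 512) -> dict:
    """Keyword arguments for model.generate; the defaults are illustrative."""
    if temperature <= 0:
        raise ValueError("temperature must be positive when sampling")
    return {
        "do_sample": True,
        "temperature": temperature,
        "top_p": top_p,
        "max_new_tokens": max_new_tokens,
    }


def generate(prompt: str, cfg: LoadConfig = LoadConfig()) -> str:
    """Download the chosen branch and sample one completion for the prompt."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(cfg.repo_id, revision=cfg.revision)
    model = AutoModelForCausalLM.from_pretrained(
        cfg.repo_id, **from_pretrained_kwargs(cfg)
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, **sampling_kwargs())
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("...")` downloads the full set of quantized weights, so it is only practical on a machine with enough GPU memory. The two helper functions can be inspected on their own to see which arguments would be passed.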

Core Capabilities

  • Straightforward, unfiltered responses to queries
  • Efficient memory usage through quantization
  • Support for a context window of 4096 tokens
  • Flexible deployment options for different hardware configurations
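Two of these points lend themselves to back-of-the-envelope arithmetic. The sketch below estimates the memory taken by the quantized weights and the prompt budget left inside the 4096-token window. The storage model is an assumption made for illustration: each group of weights is taken to carry one 16-bit scale and one zero point of the same width as the weights, and unquantized layers, activations and the KV cache are ignored, so real VRAM usage will be higher.

```python
from typing import Optional


def effective_bits_per_weight(bits: int, group_size: Optional[int]) -> float:
    """Approximate storage cost per weight, including per-group metadata.

    Assumption: one 16-bit scale plus one `bits`-wide zero point per group.
    With no grouping the metadata is shared so widely that it is ignored.
    """
    if group_size is None:
        return float(bits)
    return bits + (16 + bits) / group_size


def weight_memory_gib(n_params: float, bits: int,
                      group_size: Optional[int]) -> float:
    """Rough size of the quantized weights in GiB (weights only)."""
    total_bits = n_params * effective_bits_per_weight(bits, group_size)
    return total_bits / 8 / 2**30


def max_prompt_tokens(max_new_tokens: int, context_window: int = 4096) -> int:
    """Tokens left for the prompt once room for the reply is reserved."""
    if not 0 < max_new_tokens < context_window:
        raise ValueError("max_new_tokens must fit inside the context window")
    return context_window - max_new_tokens


if __name__ == "__main__":
    n_params = 70e9  # parameter count stated for this model
    for bits, group in [(3, None), (3, 128), (4, None), (4, 128)]:
        size = weight_memory_gib(n_params, bits, group)
        print(f"{bits}-bit, group size {group}: about {size:.1f} GiB")
    print("prompt budget:", max_prompt_tokens(512), "tokens")
```

Under these assumptions a smaller group size costs slightly more memory per weight, which is the trade-off the group size options expose.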

Frequently Asked Questions

Q: What makes this model unique?

This model combines a large parameter count (70B) with uncensored fine-tuning, and GPTQ quantization makes it practical to deploy. It gives direct, unfiltered responses without the heavy safety constraints of the standard LLaMA 2 chat models.

Q: What are the recommended use cases?

The model is suitable for applications requiring direct and unfiltered language model responses, while still maintaining factual accuracy. It's particularly useful in scenarios where standard language models might be overly cautious or patronizing in their responses.
