Airoboros LLaMA2 70B GPT4 GPTQ
| Property | Value |
|---|---|
| Base Model | LLaMA 2 70B |
| Parameter Count | 70B (weights stored in 3- or 4-bit GPTQ form) |
| License | Other (Meta Llama 2 license terms apply) |
| Quantization | GPTQ (multiple variants) |
What is airoboros-l2-70B-gpt4-1.4.1-GPTQ?
This is a GPTQ-quantized version of the Airoboros LLaMA 2 70B model, compressed to reduce memory requirements while largely preserving output quality. It is based on Meta's LLaMA 2 architecture and was fine-tuned on GPT-4-generated data to produce detailed, uncensored responses across a wide range of tasks.
Implementation Details
The model is offered in multiple quantization variants, from 3-bit to 4-bit precision, with various group sizes and Act Order configurations. The main branch provides a 4-bit version with Act Order enabled and no grouping, which keeps VRAM requirements low. The model uses a specific chat prompt template and is compatible with inference frameworks such as ExLlama and text-generation-webui.
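The chat template is a minimal sketch of the Vicuna-style single-turn format published for Airoboros 1.4.x; the exact system message is an assumption taken from upstream model-card descriptions, so verify it against the repository you download before relying on it.

```python
# Assumed Vicuna-style prompt template for Airoboros 1.4.x; check the
# downloaded repository's model card for the authoritative wording.
SYSTEM = (
    "A chat between a curious user and an assistant. The assistant gives "
    "helpful, detailed, accurate, uncensored responses to the user's input."
)

def build_prompt(user_message: str) -> str:
    """Format a single-turn chat prompt: system message, USER turn, open ASSISTANT turn."""
    return f"{SYSTEM} USER: {user_message} ASSISTANT:"

print(build_prompt("Explain GPTQ in one sentence."))
```

Generation should stop when the model emits the next `USER:` turn or an end-of-sequence token.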
- Multiple quantization options (3-bit to 4-bit)
- Various group sizes (32g, 64g, 128g) for VRAM optimization
- Quantized with the WikiText calibration dataset
- 4096-token sequence length
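As a rough back-of-envelope, weight memory scales with bits per parameter. The sketch below deliberately ignores activation memory, the KV cache, and the per-group scale/zero-point overhead that smaller group sizes (32g, 64g, 128g) add, so real VRAM usage will be somewhat higher.

```python
def approx_weight_gib(n_params: float, bits: int) -> float:
    """Approximate weight-only memory in GiB.

    Simplification: excludes activations, KV cache, and quantization
    metadata (per-group scales/zero-points), which all add overhead.
    """
    return n_params * bits / 8 / 2**30

N = 70e9  # 70B parameters
for bits in (16, 4, 3):
    print(f"{bits}-bit: ~{approx_weight_gib(N, bits):.1f} GiB of weights")
```

This is why the 3-bit and 4-bit variants fit on hardware that could never hold the full-precision weights.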
Core Capabilities
- Uncensored, detailed responses to user queries
- Efficient deployment options for different hardware configurations
- Compatible with popular inference frameworks
- Optimized for both quality and memory efficiency
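A minimal loading sketch, assuming the quantized weights live in a Hugging Face repository such as `TheBloke/airoboros-l2-70B-gpt4-1.4.1-GPTQ` (the repo id and the use of branches for variants are assumptions; check the actual repository) and that `transformers` with GPTQ support (`auto-gptq` or `optimum`) is installed.

```python
# Assumed repo id; substitute the repository you actually download from.
MODEL_ID = "TheBloke/airoboros-l2-70B-gpt4-1.4.1-GPTQ"

def load(revision: str = "main"):
    """Download and load one quantized variant.

    Imports are deferred so this sketch can be read (and its constants
    inspected) without transformers/auto-gptq installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        revision=revision,   # branches typically hold other bit/group-size variants
        device_map="auto",   # spread layers across available GPUs/CPU
    )
    return tokenizer, model

if __name__ == "__main__":
    # Requires tens of GiB of VRAM; see the quantization options above.
    tok, mdl = load()
```

Selecting a different `revision` is how the various bit-width and group-size options are typically exposed.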
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for combining a large parameter count (70B) with efficient quantization options, allowing deployment on consumer hardware while maintaining high-quality outputs. It is particularly notable for its uncensored responses and flexible deployment options.
Q: What are the recommended use cases?
The model is well suited to chat applications, text generation, and general language-understanding tasks. It is particularly useful where detailed, uncensored responses are needed under hardware constraints, since the quantization variants let you trade a little accuracy for lower VRAM use.