GPT4-x-Vicuna-13b-4bit

Maintained By
NousResearch


Property                Value
License                 GPL
Base Model              Vicuna-13b-1.1
Quantization            4-bit GPTQ (Groupsize 128)
Training Data Size      ~180k instructions
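As a rough back-of-envelope estimate (illustrative arithmetic, not a measured figure), 4-bit quantization shrinks the 13B parameters from roughly 26 GB in fp16 to about 6.5 GB of weight storage, plus a small amount of per-group metadata from groupsize-128 GPTQ:

```python
# Back-of-envelope weight-storage estimate for a 13B-parameter model.
# These are illustrative numbers, not measured VRAM usage.
params = 13e9

fp16_gb = params * 2 / 1e9    # 2 bytes per weight in fp16
int4_gb = params * 0.5 / 1e9  # 4 bits = 0.5 bytes per weight

# GPTQ with groupsize 128 also stores quantization metadata (scale and
# zero-point) per group of 128 weights; assuming ~4 bytes per group:
overhead_gb = (params / 128) * 4 / 1e9

print(f"fp16 weights:  {fp16_gb:.1f} GB")
print(f"4-bit weights: {int4_gb:.1f} GB (+ ~{overhead_gb:.2f} GB GPTQ metadata)")
```

Actual VRAM requirements are higher once activations and the KV cache are included, but the order of magnitude explains why the quantized model fits on consumer GPUs.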

What is GPT4-x-Vicuna-13b-4bit?

GPT4-x-Vicuna-13b-4bit is a quantized language model developed by NousResearch, built on the Vicuna-13b-1.1 architecture. By fine-tuning on GPT-4-generated instructions and compressing the weights with 4-bit quantization, it aims to deliver strong instruction following in a footprint small enough for consumer hardware. The model was fine-tuned on a diverse set of high-quality instruction datasets, including GPTeacher, Roleplay v2, WizardLM Uncensored, and the Nous Research Instruct Dataset.

Implementation Details

The model uses GPTQ 4-bit quantization with a groupsize of 128, significantly reducing the memory footprint while preserving most of the base model's performance. Training was conducted on 8 A100-80GB GPUs for 5 epochs using the Alpaca DeepSpeed training code. The model supports two prompt formats following the Alpaca structure: a basic instruction-response template and an instruction-input-response template for requests that include additional context.

  • Trained on approximately 180,000 GPT-4 generated instructions
  • Cleaned dataset removing OpenAI censorship patterns
  • Optimized for reduced memory usage through 4-bit quantization
  • Built on the robust Vicuna-13b-1.1 architecture

Core Capabilities

  • High-quality instruction following
  • Reduced censorship compared to base models
  • Efficient memory usage through quantization
  • Support for multiple prompt formats
  • Enhanced performance through specialized training datasets

Frequently Asked Questions

Q: What makes this model unique?

This model stands out due to its combination of GPT-4 quality instruction following, reduced censorship, and efficient 4-bit quantization, making it more accessible for deployment on consumer hardware while maintaining high performance.

Q: What are the recommended use cases?

The model is well-suited for instruction-following tasks, conversational AI applications, and scenarios requiring reduced censorship. It's particularly valuable for users seeking a balance between performance and resource efficiency.
