# Wizard-Vicuna-7B-Uncensored-GGML
| Property | Value |
|---|---|
| Base Model Size | 7B parameters |
| License | Other |
| Author | TheBloke |
| Quantization Options | 2-bit to 8-bit |
## What is Wizard-Vicuna-7B-Uncensored-GGML?
Wizard-Vicuna-7B-Uncensored-GGML is a conversion of Eric Hartford's Wizard-Vicuna-7B-Uncensored model into GGML format for efficient CPU and GPU inference. The model is distinctive in that alignment and moralizing responses were filtered out of its training data, leaving users free to apply their own alignment approach on top.
## Implementation Details
The model is available in multiple quantization levels from 2-bit to 8-bit, each offering a different tradeoff between file size, RAM usage, and inference quality. It can be run with llama.cpp and llama.cpp-compatible frontends, including text-generation-webui and KoboldCpp.
- Multiple quantization options (q2_K through q8_0)
- RAM requirements ranging from 5.30 GB to 9.66 GB depending on quantization level
- Optimized for both CPU and GPU inference
- Compatible with major inference frameworks
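The size/quality tradeoff above follows directly from bits per weight. As a back-of-the-envelope sketch (the effective bit widths below are approximations, since GGML quantization formats also store per-block scale data, and real file sizes include metadata):

```python
# Rough file-size estimate: parameters * effective bits-per-weight / 8 bytes.
# The bit widths per format are approximate assumptions for illustration;
# actual GGML files are somewhat larger due to block scales and metadata.

PARAMS = 7_000_000_000  # 7B parameters

def approx_size_gb(bits_per_weight: float) -> float:
    """Approximate model file size in GB for a given quantization width."""
    return PARAMS * bits_per_weight / 8 / 1e9

for name, bits in [("q2_K", 2.6), ("q4_0", 4.5), ("q5_1", 6.0), ("q8_0", 8.5)]:
    print(f"{name}: ~{approx_size_gb(bits):.1f} GB on disk")
```

Add roughly 1-2 GB of overhead for the KV cache and runtime buffers and the estimates land near the quoted 5.30-9.66 GB RAM range.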
## Core Capabilities
- Efficient text generation with controllable parameters
- Flexible deployment options across different hardware configurations
- 2048-token context window
- Custom prompt template support
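As a minimal sketch of prompt-template support, the helper below formats a single-turn prompt in the USER/ASSISTANT style commonly used with Vicuna-family models. The exact template and the crude token estimate are assumptions; check the template against the model card for the file you download, and use a real tokenizer (e.g. llama.cpp's) for accurate context-window checks:

```python
# Vicuna-style single-turn prompt formatting (assumed template).
CONTEXT_WINDOW = 2048  # tokens supported by the model

def build_prompt(user_message: str, system: str = "") -> str:
    """Format a single-turn prompt in the USER/ASSISTANT style."""
    prefix = f"{system}\n\n" if system else ""
    return f"{prefix}USER: {user_message}\nASSISTANT:"

def rough_token_count(text: str) -> int:
    """Crude estimate (~4 characters per token); for illustration only."""
    return max(1, len(text) // 4)

prompt = build_prompt("Explain GGML quantization in one sentence.")
assert rough_token_count(prompt) < CONTEXT_WINDOW
print(prompt)
```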
## Frequently Asked Questions
Q: What makes this model unique?
A: This model stands out for its uncensored nature and variety of quantization options, allowing users to choose the optimal balance between model size and performance for their specific use case.
Q: What are the recommended use cases?
A: The model is ideal for applications requiring unrestricted text generation with custom alignment approaches, particularly in resource-constrained environments where efficient CPU/GPU inference is necessary.