WizardLM-7B-uncensored-GGML
| Property | Value |
|---|---|
| Author | TheBloke (GGML conversion) / Eric Hartford (original) |
| Model Size | 7B parameters |
| License | Other |
| Format | GGML (various quantizations) |
What is WizardLM-7B-uncensored-GGML?
WizardLM-7B-uncensored-GGML is a collection of quantized versions of Eric Hartford's WizardLM 7B Uncensored model, optimized for CPU and GPU inference with llama.cpp. What sets the model apart is that it was trained without alignment constraints, so users can layer their own alignment strategies on top separately.
Implementation Details
This repository provides multiple quantized versions ranging from 2-bit to 8-bit precision, offering different trade-offs between model size, inference speed, and accuracy. The quantization methods include both the traditional llama.cpp formats (q4_0, q4_1, q5_0, q5_1, q8_0) and the newer k-quant formats (q2_K through q6_K).
- File sizes range from 2.80GB (q2_K) to 7.16GB (q8_0)
- RAM requirements range from 5.30GB to 9.66GB
- Compatible with various frameworks including text-generation-webui, KoboldCpp, and llama-cpp-python
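The RAM figures above track the file sizes closely: each quant appears to need roughly its file size plus a fixed ~2.50GB of runtime overhead. A minimal sketch of that estimate (the overhead constant is inferred from the listed figures, not an official number):

```python
# Rough RAM estimate for running a quantized GGML file with llama.cpp.
# OVERHEAD_GB is an assumption inferred from the table above
# (e.g. q2_K: 2.80GB file -> 5.30GB RAM), not a documented constant.
OVERHEAD_GB = 2.50

def estimated_ram_gb(file_size_gb: float) -> float:
    """Estimated RAM use: model file size plus fixed runtime overhead."""
    return round(file_size_gb + OVERHEAD_GB, 2)

print(estimated_ram_gb(2.80))  # q2_K -> 5.3
print(estimated_ram_gb(7.16))  # q8_0 -> 9.66
```

Actual usage varies with context length and the framework used, so treat this as a lower bound when sizing hardware.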
Core Capabilities
- Efficient inference on both CPU and GPU
- Multiple quantization options for different hardware constraints
- Uncensored responses without built-in alignment
- Support for context window of 2048 tokens
- Compatible with major GGML-based frameworks
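Given the size/RAM trade-off above, a practical rule is to pick the highest-precision quant whose estimated footprint still fits in available memory. A hypothetical helper sketching that choice (only the q2_K and q8_0 sizes are quoted in this card; the other quants' sizes would come from the repository's file table):

```python
from typing import Optional

# File sizes (GB) quoted above; extend with the remaining quants
# from the repository's file listing.
FILE_SIZES_GB = {"q2_K": 2.80, "q8_0": 7.16}
OVERHEAD_GB = 2.50  # assumed fixed overhead, inferred from the listed RAM figures

def pick_quant(available_ram_gb: float) -> Optional[str]:
    """Return the largest quant whose estimated RAM use fits, or None."""
    fitting = [(size, name) for name, size in FILE_SIZES_GB.items()
               if size + OVERHEAD_GB <= available_ram_gb]
    return max(fitting)[1] if fitting else None

print(pick_quant(8.0))   # q2_K (q8_0 would need ~9.66GB)
print(pick_quant(16.0))  # q8_0
```

Larger quants generally preserve more accuracy, so selecting the biggest one that fits is usually the right default.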
Frequently Asked Questions
Q: What makes this model unique?
This model provides uncensored outputs without built-in alignment, allowing users to implement their own ethical guidelines. It's available in multiple quantization formats optimized for different hardware configurations and use cases.
Q: What are the recommended use cases?
The model is suitable for research and development purposes where custom alignment strategies are needed. Users should note that they are responsible for implementing appropriate safeguards and monitoring the model's outputs.