WizardLM-7B-uncensored-GGML
| Property | Value |
|---|---|
| Author | TheBloke (GGML conversion) / Eric Hartford (original) |
| Model Size | 7B parameters |
| License | Other |
| Format | GGML (various quantizations) |
What is WizardLM-7B-uncensored-GGML?
WizardLM-7B-uncensored-GGML is a collection of quantized versions of Eric Hartford's WizardLM 7B Uncensored model, optimized for CPU and GPU inference with llama.cpp. What sets the model apart is that it was trained without alignment constraints, so users can layer their own alignment strategies on top separately.
Implementation Details
This repository provides multiple quantized versions ranging from 2-bit to 8-bit precision, offering different trade-offs between model size, inference speed, and accuracy. The quantization methods include both the traditional llama.cpp formats (q4_0, q4_1, q5_0, q5_1, q8_0) and the newer k-quant formats (q2_K through q6_K).
- File sizes range from 2.80GB (q2_K) to 7.16GB (q8_0)
- RAM requirements range from 5.30GB to 9.66GB
- Compatible with various frameworks including text-generation-webui, KoboldCpp, and llama-cpp-python
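The RAM figures above track the file sizes closely: each quant appears to need roughly its file size plus a fixed ~2.50GB of runtime overhead. A minimal sketch of that estimate (the overhead constant is inferred from the listed figures, not an official number):

```python
# Rough RAM estimate for running a quantized GGML file with llama.cpp.
# OVERHEAD_GB is an assumption inferred from the table above
# (e.g. q2_K: 2.80GB file -> 5.30GB RAM), not a documented constant.
OVERHEAD_GB = 2.50

def estimated_ram_gb(file_size_gb: float) -> float:
    """Estimated RAM use: model file size plus fixed runtime overhead."""
    return round(file_size_gb + OVERHEAD_GB, 2)

print(estimated_ram_gb(2.80))  # q2_K -> 5.3
print(estimated_ram_gb(7.16))  # q8_0 -> 9.66
```

Actual usage varies with context length and the framework used, so treat this as a lower bound when sizing hardware.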
Core Capabilities
- Efficient inference on both CPU and GPU
- Multiple quantization options for different hardware constraints
- Uncensored responses without built-in alignment
- Support for context window of 2048 tokens
- Compatible with major GGML-based frameworks
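Given the size/RAM trade-off above, a practical rule is to pick the highest-precision quant whose estimated footprint still fits in available memory. A hypothetical helper sketching that choice (only the q2_K and q8_0 sizes are quoted in this card; the other quants' sizes would come from the repository's file table):

```python
from typing import Optional

# File sizes (GB) quoted above; extend with the remaining quants
# from the repository's file listing.
FILE_SIZES_GB = {"q2_K": 2.80, "q8_0": 7.16}
OVERHEAD_GB = 2.50  # assumed fixed overhead, inferred from the listed RAM figures

def pick_quant(available_ram_gb: float) -> Optional[str]:
    """Return the largest quant whose estimated RAM use fits, or None."""
    fitting = [(size, name) for name, size in FILE_SIZES_GB.items()
               if size + OVERHEAD_GB <= available_ram_gb]
    return max(fitting)[1] if fitting else None

print(pick_quant(8.0))   # q2_K (q8_0 would need ~9.66GB)
print(pick_quant(16.0))  # q8_0
```

Larger quants generally preserve more accuracy, so selecting the biggest one that fits is usually the right default.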
Frequently Asked Questions
Q: What makes this model unique?
This model provides uncensored outputs without built-in alignment, allowing users to implement their own ethical guidelines. It's available in multiple quantization formats optimized for different hardware configurations and use cases.
Q: What are the recommended use cases?
The model is suitable for research and development purposes where custom alignment strategies are needed. Users should note that they are responsible for implementing appropriate safeguards and monitoring the model's outputs.