WizardLM-13B-Uncensored-GGML
| Property | Value |
|---|---|
| Author | TheBloke |
| Base Model | WizardLM 13B Uncensored |
| Format | GGML |
| License | Other |
What is WizardLM-13B-Uncensored-GGML?
WizardLM-13B-Uncensored-GGML is a conversion of Eric Hartford's WizardLM 13B Uncensored model into the GGML format, optimized for CPU inference with optional GPU offloading. Its distinguishing feature is the range of quantization options, from 2-bit to 8-bit, letting users trade off model size, inference speed, and accuracy to suit their hardware.
Implementation Details
The model is available in various quantization formats, including both original llama.cpp methods (q4_0, q4_1, q5_0, q5_1, q8_0) and new k-quant methods (q2_K, q3_K_S, q3_K_M, q3_K_L, q4_K_S, q4_K_M, q5_K_S, q6_K). File sizes range from 5.43GB for the q2_K version to 13.83GB for the q8_0 version.
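As a rough sanity check on these figures, the effective bits stored per weight can be estimated from the file size and the parameter count. The sketch below assumes an approximate count of 13 billion parameters and decimal gigabytes; both are assumptions, not values stated by the release.

```python
# Rough effective bits-per-weight estimate: file_size_bytes * 8 / n_params.
# Assumes ~13e9 parameters and decimal GB, both approximations.

def bits_per_weight(file_size_gb: float, n_params: float = 13e9) -> float:
    """Estimate the average number of bits stored per weight in a quantized file."""
    return file_size_gb * 1e9 * 8 / n_params

q2_bpw = bits_per_weight(5.43)   # q2_K file, 5.43 GB
q8_bpw = bits_per_weight(13.83)  # q8_0 file, 13.83 GB
print(f"q2_K ~ {q2_bpw:.2f} bits/weight, q8_0 ~ {q8_bpw:.2f} bits/weight")
```

Both estimates land above the nominal bit width (about 3.34 for q2_K and 8.51 for q8_0), which is expected: quantized blocks carry scale metadata on top of the packed weights, and the GB figures may be decimal rather than binary.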
- Supports multiple inference frameworks including text-generation-webui, KoboldCpp, and llama-cpp-python
- Requires between 7.93GB and 16.33GB of RAM depending on the quantization level
- Offers newer k-quant methods that improve the quality-to-size trade-off
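The RAM figures track the file sizes closely: both stated endpoints sit exactly 2.5GB above the corresponding file size, suggesting a simple "whole file plus fixed overhead" model. A minimal sketch of that estimate follows; the 2.5GB overhead constant is inferred from the two quoted endpoints, not a documented value.

```python
# Estimated RAM needed to run a GGML file: model size plus a fixed overhead.
# The 2.5 GB overhead is inferred from the quoted figures
# (7.93 - 5.43 and 16.33 - 13.83), not an official constant.
OVERHEAD_GB = 2.5

def ram_required_gb(file_size_gb: float) -> float:
    """Estimate total RAM (GB) needed to load and run a quantized model file."""
    return round(file_size_gb + OVERHEAD_GB, 2)

print(ram_required_gb(5.43))   # q2_K -> 7.93
print(ram_required_gb(13.83))  # q8_0 -> 16.33
```

Actual usage also depends on context length and any layers offloaded to the GPU, so treat this as a lower bound when sizing hardware.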
Core Capabilities
- Uncensored text generation without built-in alignment constraints
- Efficient CPU+GPU inference with various quantization options
- Context window of 2048 tokens
- Compatible with multiple popular inference frameworks
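Because the context window is fixed at 2048 tokens, the prompt and the generated output must share that budget. The helper below is a hypothetical illustration (not part of any framework's API) of clamping a generation request to fit:

```python
CONTEXT_WINDOW = 2048  # token limit stated for this model

def max_new_tokens(prompt_tokens: int, requested: int) -> int:
    """Clamp a generation request so prompt + output fits in the context window."""
    if prompt_tokens >= CONTEXT_WINDOW:
        raise ValueError("prompt already fills the context window")
    return min(requested, CONTEXT_WINDOW - prompt_tokens)

print(max_new_tokens(1500, 1000))  # only 548 tokens of budget remain
```

Frameworks such as llama.cpp expose equivalent limits through their own context-size settings; exceeding them typically truncates or rejects the prompt.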
Frequently Asked Questions
Q: What makes this model unique?
The model offers considerable deployment flexibility through its multiple quantization options while retaining the capabilities of WizardLM 13B Uncensored. It is specifically packaged for efficient inference on consumer hardware.
Q: What are the recommended use cases?
The model suits users who need uncensored text generation with flexible deployment options, particularly those weighing model size, speed, and accuracy against the resource constraints of consumer-grade hardware.