# Wizard Mega 13B GGML
| Property | Value |
|---|---|
| Base Model | LLaMA 13B |
| License | Other |
| Training Datasets | ShareGPT, WizardLM, Wizard-Vicuna |
| Available Formats | 4-bit, 5-bit, 8-bit GGML |
## What is wizard-mega-13B-GGML?
Wizard Mega 13B GGML is a quantized version of the OpenAccess AI Collective's Wizard Mega 13B model, optimized for CPU inference with llama.cpp. Quantization makes a 13B-parameter model practical for local deployment, with several quantization levels available to trade output quality against file size and RAM usage.
## Implementation Details
The model was trained for two epochs on 8x A100 80GB GPUs using the Axolotl framework. It is available in multiple GGML quantization formats, ranging from 4-bit to 8-bit, with file sizes from 8.14GB to 14.6GB. Loading these files requires a llama.cpp build from after May 19th, 2023, since the quantization format changed on that date.
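The bit counts in the format names roughly track the file sizes above. As a sketch (not taken from this model card), the legacy GGML formats pack weights in blocks of 32 with per-block fp16 metadata, so the effective bits per weight are slightly higher than the nominal 4, 5, or 8 bits:

```python
# Rough bits-per-weight for the legacy GGML quantization formats.
# Assumption: the standard legacy block layouts, where each block holds
# 32 weights plus per-block metadata (an fp16 scale, and for the *_1
# variants also an fp16 minimum).
BLOCK = 32  # weights per quantization block

# bytes per block: metadata + packed weight bits
block_bytes = {
    "q4_0": 2 + 16,       # fp16 scale + 32 x 4-bit weights
    "q4_1": 4 + 16,       # fp16 scale + fp16 min + 32 x 4-bit weights
    "q5_0": 2 + 4 + 16,   # fp16 scale + 32 high bits + 32 x 4-bit low nibbles
    "q5_1": 4 + 4 + 16,   # scale + min + 32 high bits + 32 x 4-bit low nibbles
    "q8_0": 2 + 32,       # fp16 scale + 32 x 8-bit weights
}

for name, nbytes in block_bytes.items():
    bpw = nbytes * 8 / BLOCK
    print(f"{name}: {bpw:.1f} bits/weight")
```

This is why a "4-bit" file costs about 4.5 bits per weight on disk: the per-block scales add overhead on top of the packed weights.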
- Multiple quantization options (q4_0, q4_1, q5_0, q5_1, q8_0)
- Optimized for CPU inference with llama.cpp
- RAM requirements ranging from 10.5GB to 17GB
- Compatible with text-generation-webui
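The RAM figures track the file sizes closely, because the whole GGML file is loaded into memory plus a roughly constant overhead for context and scratch buffers. A minimal sketch, with the overhead back-solved from this card's own numbers (10.5GB RAM for the 8.14GB q4_0 file), so it is a ballpark only:

```python
# Rough RAM estimate for CPU inference: model file size plus a roughly
# constant overhead. The overhead is inferred from the figures listed
# above (10.5GB RAM for the 8.14GB file), not a measured value.
OVERHEAD_GB = 10.5 - 8.14  # ~2.36GB

def estimated_ram_gb(file_size_gb: float) -> float:
    """Approximate peak RAM needed to run a GGML file of this size."""
    return file_size_gb + OVERHEAD_GB

# The largest listed file (14.6GB, q8_0) lands near the stated 17GB ceiling.
print(f"~{estimated_ram_gb(14.6):.1f}GB RAM for the 14.6GB file")
```

The consistency check works out: 14.6GB plus the inferred overhead gives roughly the 17GB upper bound stated above.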
## Core Capabilities
- General text generation and conversation
- Code generation and technical writing
- Creative writing and storytelling
- Instruction following with filtered responses
## Frequently Asked Questions
**Q: What makes this model unique?**
This model stands out for its range of quantization options, which make it feasible to run a 13B-parameter model on CPU-only hardware while keeping quality loss modest through block-wise quantization with per-block scaling.
**Q: What are the recommended use cases?**
The model is ideal for users who need to run large language models locally on CPU hardware, with different quantization options allowing for flexibility in balancing performance with resource constraints. It's particularly well-suited for general text generation, coding tasks, and creative writing.