stable-vicuna-13B-GGML

Maintained By
TheBloke

Stable Vicuna 13B GGML

| Property | Value |
|----------|-------|
| Base Model | CarperAI Stable Vicuna 13B |
| Parameter Count | 13 billion |
| License | Other (non-commercial) |
| Paper | Based on LLaMA (arXiv:2302.13971) |
| Quantization Options | 2-bit to 8-bit GGML |

What is stable-vicuna-13B-GGML?

Stable Vicuna 13B GGML is CarperAI's Stable Vicuna model converted to the GGML format for efficient CPU and GPU inference. The conversion is offered at quantization levels from 2-bit to 8-bit, giving flexible trade-offs between model size, inference speed, and accuracy.

Implementation Details

The model comes in multiple quantization variants, each optimized for different use cases. The implementation includes both traditional quantization methods (q4_0, q4_1, q5_0, q5_1, q8_0) and newer k-quant methods (q2_K, q3_K_S, q3_K_M, q3_K_L, q4_K_S, q4_K_M, q5_K_S, q6_K). File sizes range from 5.43GB for the q2_K version to 13.83GB for the q8_0 version.

  • Compatible with llama.cpp and various UI frameworks
  • Supports GPU layer offloading for improved performance
  • Includes new k-quant methods for better compression efficiency
  • Requires 7.93GB to 16.33GB RAM depending on quantization level
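The quoted RAM figures track the file sizes with a constant gap of roughly 2.5 GB, so memory needs for a given quantization can be estimated from its file size, and file size in turn from bits per weight. A minimal sketch; the ~2.5 GB runtime overhead is inferred from the numbers above, not an official figure, and bits-per-weight values (e.g. ~8.5 for q8_0, including per-block scales) are approximate:

```python
def quantized_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate GGML file size in decimal GB.

    bits_per_weight includes per-block scale overhead,
    e.g. roughly 8.5 bits/weight for q8_0.
    """
    return n_params * bits_per_weight / 8 / 1e9

def max_ram_gb(file_size_gb: float, overhead_gb: float = 2.5) -> float:
    """Estimate peak RAM: model file size plus ~2.5 GB runtime overhead."""
    return file_size_gb + overhead_gb

# q8_0 at ~8.5 bits/weight for 13B parameters: about 13.8 GB on disk,
# consistent with the 13.83 GB figure quoted above
print(f"{quantized_size_gb(13e9, 8.5):.1f} GB")

# RAM estimates for the quoted q2_K and q8_0 file sizes
print(f"{max_ram_gb(5.43):.2f} GB", f"{max_ram_gb(13.83):.2f} GB")
```

Offloading layers to the GPU reduces the CPU-side RAM requirement further, since offloaded weights live in VRAM instead.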

Core Capabilities

  • Optimized for conversation and instruction-following tasks
  • Supports both CPU and GPU inference
  • Multiple quantization options for different hardware constraints
  • Integration with popular frameworks like text-generation-webui and KoboldCpp
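With llama.cpp, running one of these files is a single command. A sketch, assuming a GPU-enabled build; the model filename is a placeholder for whichever quantized variant you downloaded, and exact flag names can vary between llama.cpp versions:

```shell
# Illustrative llama.cpp invocation (filename is a placeholder).
# -ngl offloads transformer layers to the GPU; omit for CPU-only builds.
# -n caps the number of generated tokens.
./main \
  -m stable-vicuna-13B.ggmlv3.q4_K_M.bin \
  -n 256 \
  -ngl 32 \
  -p "### Human: Explain quantization briefly.\n### Assistant:"
```

The same files load in text-generation-webui and KoboldCpp without extra conversion steps.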

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its variety of quantization options and optimization for CPU/GPU inference, making it accessible for users with different hardware capabilities while maintaining good performance.

Q: What are the recommended use cases?

The model is ideal for conversational AI applications, text generation, and instruction-following tasks. Users can choose different quantization levels based on their hardware constraints and performance requirements.
