Manticore-13B-GGML

Maintained by TheBloke


Author: TheBloke
License: Other
Base Model: Manticore 13B
Format: GGML (various quantizations)

What is Manticore-13B-GGML?

Manticore-13B-GGML is a set of quantized GGML conversions of the OpenAccess AI Collective's Manticore 13B model, intended for CPU inference (with optional GPU acceleration) using llama.cpp. The files are provided in multiple quantization levels ranging from 2-bit to 8-bit, offering different tradeoffs between file size, inference speed, and accuracy.

Implementation Details

The repository provides both the original llama.cpp quantization methods (q4_0, q4_1, q5_0, q5_1, q8_0) and the newer k-quant methods (q2_K, q3_K_S, q3_K_M, q3_K_L, q4_K_S, q4_K_M, q5_K_S, q6_K). File sizes range from 5.43 GB to 13.83 GB, with corresponding RAM requirements between 7.93 GB and 16.33 GB.
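Those figures imply a rough rule of thumb: RAM use is approximately the file size plus about 2.5 GB of overhead (5.43 GB → 7.93 GB, 13.83 GB → 16.33 GB), with more needed at larger context sizes. The sketch below works that arithmetic through using only the smallest and largest files from the stated range; treat it as an approximation, not a guarantee.

```python
# Rough RAM estimate for a GGML quant: file size plus a fixed overhead.
# The ~2.5 GB overhead is inferred from the card's own figures
# (5.43 -> 7.93 GB, 13.83 -> 16.33 GB); it is an approximation only.
OVERHEAD_GB = 2.5

# Illustrative endpoints only: the smallest (2-bit) and largest (8-bit)
# files implied by the stated range. Check the repo for exact per-file sizes.
file_sizes_gb = {
    "smallest (2-bit, q2_K)": 5.43,
    "largest (8-bit, q8_0)": 13.83,
}

for name, size_gb in file_sizes_gb.items():
    print(f"{name}: file {size_gb:.2f} GB, approx. RAM {size_gb + OVERHEAD_GB:.2f} GB")
```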

  • Multiple quantization options for different use cases
  • Compatible with various UI frameworks including text-generation-webui and KoboldCpp
  • Supports GPU layer offloading for faster inference (see the sketch after this list)
  • Implements new k-quant methods for improved efficiency
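As a concrete sketch (not taken from the original card), the snippet below loads one of the GGML files with the llama-cpp-python bindings and offloads some layers to the GPU. It assumes an older llama-cpp-python release that still reads GGML files (recent versions only accept GGUF); the filename, layer count, and thread count are placeholders to adjust for your download and hardware.

```python
from llama_cpp import Llama

# Load a GGML quant with llama-cpp-python (a GGML-compatible release is
# assumed; newer versions of the library require GGUF files instead).
llm = Llama(
    model_path="./Manticore-13B.ggmlv3.q4_K_M.bin",  # placeholder filename
    n_ctx=2048,        # the base model's context window
    n_gpu_layers=32,   # layers to offload to the GPU; 0 = CPU only
    n_threads=8,       # CPU threads for the non-offloaded layers
)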

Core Capabilities

  • Efficient CPU and GPU inference using llama.cpp
  • Flexible deployment options with various quantization levels
  • Supports the base model's 2048-token context window
  • Inherits the base model's strong instruction-following performance (see the example after this list)
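Continuing the loading sketch above, a hedged usage example runs a simple instruction through the model. The prompt layout here is illustrative only; consult the base Manticore 13B card for the exact template the model was trained on.

```python
# Generate a completion with the model loaded above. The USER/ASSISTANT
# prompt layout is illustrative; check the base model card for its template.
prompt = "USER: Summarize why quantization reduces memory use.\nASSISTANT:"

output = llm(
    prompt,
    max_tokens=256,
    temperature=0.7,
    stop=["USER:"],   # stop before the model starts a new turn
)
print(output["choices"][0]["text"].strip())
```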

Frequently Asked Questions

Q: What makes this model unique?

This model offers a wide range of quantization options, making it highly versatile for different hardware configurations and use cases. It's particularly notable for implementing both traditional and new k-quant methods, offering users great flexibility in balancing size, speed, and quality.

Q: What are the recommended use cases?

The model is ideal for users who need to run large language models on consumer hardware. Different quantization levels suit different needs: lightweight 2-bit versions for resource-constrained environments, and 8-bit versions for maximum accuracy.
