SuperNova-Medius-GGUF

Maintained By
bartowski

SuperNova-Medius-GGUF

PropertyValue
Parameter Count14.8B
LicenseApache 2.0
Authorbartowski
Base Modelarcee-ai/SuperNova-Medius

What is SuperNova-Medius-GGUF?

SuperNova-Medius-GGUF is a comprehensive collection of GGUF quantized versions of the SuperNova-Medius language model, optimized using llama.cpp. It offers various quantization levels to balance performance and resource requirements, ranging from the full F16 weights at 29.55GB to highly compressed versions as small as 4.31GB.

Implementation Details

The model uses an advanced quantization methodology with imatrix calibration, providing multiple variants optimized for different hardware configurations and use cases. The implementation includes special considerations for embed/output weights in certain variants, potentially improving output quality.

  • Multiple quantization options from Q8_0 to IQ2_XXS
  • Specialized ARM-optimized versions (Q4_0_X_X series)
  • Custom calibration dataset for optimal performance
  • Support for various inference endpoints including LM Studio

Core Capabilities

  • Text generation with chat-style formatting
  • Flexible deployment options for different hardware configurations
  • Optimized performance on both CPU and GPU setups
  • Support for conversation-style interactions using specific prompt format

Frequently Asked Questions

Q: What makes this model unique?

The model's unique feature is its extensive range of quantization options, allowing users to choose the perfect balance between model size and performance for their specific hardware constraints. The implementation includes cutting-edge quantization techniques like I-quants and special handling of embedding weights.

Q: What are the recommended use cases?

For most users, the Q4_K_M variant (8.99GB) is recommended as a balanced option. Users with limited RAM should consider the IQ3/IQ2 series, while those prioritizing quality should opt for Q6_K_L or Q5_K_L variants. The model is particularly well-suited for conversational AI applications and general text generation tasks.

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.