Phantasor-137M-GGUF

Maintained by mradermacher

  • Model Size: 137M parameters
  • Author: mradermacher
  • Format: GGUF
  • Source: XeTute/Phantasor-137M

What is Phantasor-137M-GGUF?

Phantasor-137M-GGUF is a collection of quantized builds of the original XeTute/Phantasor-137M model, packaged in the GGUF format for efficient local deployment with llama.cpp-compatible runtimes. Quantization options range from Q2_K to F16, letting users trade file size against output quality and speed.
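
As a quick sketch (not part of the model card itself), the available quant files and their sizes can be enumerated with the huggingface_hub client. The repo id below is assumed from this page's title; verify it against the actual Hugging Face repository.

```python
# Minimal sketch: list the .gguf quant files in the repo and their sizes.
# Assumption: the repo id "mradermacher/Phantasor-137M-GGUF".
from huggingface_hub import HfApi

info = HfApi().model_info("mradermacher/Phantasor-137M-GGUF", files_metadata=True)
for f in info.siblings:
    if f.rfilename.endswith(".gguf"):
        size_gb = (f.size or 0) / 1e9
        print(f"{f.rfilename:45s} {size_gb:.2f} GB")
```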

Implementation Details

The repository provides multiple quantization types, with file sizes ranging from 0.2GB to 0.4GB. The Q4_K_S and Q4_K_M variants are recommended for their balance of speed and quality, while Q8_0 offers the highest quality among the quantized versions (a sketch for fetching one of the recommended files follows the list below).

  • Multiple quantization options (Q2_K through F16)
  • File sizes ranging from 0.2GB to 0.4GB
  • Optimized for different performance requirements
  • Static quantization implementation
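
A minimal sketch for downloading one of the recommended variants is shown below. The exact filename is an assumption based on mradermacher's usual naming convention; confirm it against the repo's file listing.

```python
# Minimal sketch: fetch one recommended quant file to the local HF cache.
# Assumption: the filename "Phantasor-137M.Q4_K_M.gguf" follows the repo's naming.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="mradermacher/Phantasor-137M-GGUF",   # assumed repo id
    filename="Phantasor-137M.Q4_K_M.gguf",        # assumed filename
)
print(model_path)  # local path to pass to a GGUF-capable runtime
```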

Core Capabilities

  • Fast inference with Q4_K variants
  • High-quality output with Q6_K and Q8_0 versions
  • Flexible deployment options for different hardware constraints (see the sketch after this list)
  • Efficient memory usage through various quantization levels
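
As one illustration of constrained-hardware deployment, a downloaded quant can be run CPU-only with llama-cpp-python. This is a sketch with conservative, untuned settings; the model path is a placeholder for the file fetched in the earlier download step.

```python
# Minimal sketch: CPU-only inference with llama-cpp-python
# (pip install llama-cpp-python). Settings are illustrative, not tuned.
from llama_cpp import Llama

model_path = "Phantasor-137M.Q4_K_M.gguf"  # e.g., the file fetched earlier

llm = Llama(
    model_path=model_path,   # path to the local .gguf file
    n_ctx=512,               # small context window to limit memory use
    n_threads=4,             # CPU threads; adjust to the host machine
    n_gpu_layers=0,          # 0 = pure CPU; raise if a GPU build is available
)

out = llm("Once upon a time,", max_tokens=64)
print(out["choices"][0]["text"])
```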

Frequently Asked Questions

Q: What makes this model unique?

The model offers a comprehensive range of quantization options, allowing users to choose between different speed-quality tradeoffs. The Q4_K variants are particularly notable for their balance of performance and quality.

Q: What are the recommended use cases?

For optimal performance, the Q4_K_S and Q4_K_M variants are recommended for general use, while Q8_0 is suggested for applications requiring the highest quality output. The Q6_K version offers very good quality at a smaller size.
