Phantasor-137M-GGUF
| Property | Value |
|---|---|
| Model Size | 137M parameters |
| Author | mradermacher |
| Format | GGUF |
| Source | XeTute/Phantasor-137M |
What is Phantasor-137M-GGUF?
Phantasor-137M-GGUF is a collection of quantized versions of the original XeTute/Phantasor-137M model, packaged in the GGUF format for efficient local deployment. Quantization options range from Q2_K to F16, providing flexibility in the tradeoff between file size and output quality.
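As a minimal sketch, a single quantized file can be fetched from the Hugging Face Hub with the `huggingface_hub` library. The repository id and filename below follow mradermacher's usual naming pattern but are assumptions; check the repository's file list for the exact names.

```python
# Sketch: download one quantized GGUF file from the Hugging Face Hub.
# The repo id and filename are assumed from mradermacher's usual naming
# pattern -- verify them against the repository's actual file list.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="mradermacher/Phantasor-137M-GGUF",  # assumed repo id
    filename="Phantasor-137M.Q4_K_M.gguf",       # assumed quant filename
)
print(model_path)  # local path to the cached GGUF file
```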
Implementation Details
The repository provides multiple static quantizations, with file sizes ranging from 0.2 GB to 0.4 GB. The Q4_K_S and Q4_K_M variants are recommended for their balance of speed and quality, while Q8_0 offers the highest quality among the quantized versions (a loading sketch follows the list below).
- Multiple quantization options (Q2_K through F16)
- File sizes ranging from 0.2 GB to 0.4 GB
- Optimized for different performance requirements
- Static quantization implementation
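To show how a downloaded quant is actually used, here is a hedged sketch with llama-cpp-python, one of several GGUF-compatible runtimes; the prompt and sampling settings are illustrative only.

```python
# Sketch: run a downloaded quant with llama-cpp-python
# (pip install llama-cpp-python). Prompt and sampling settings are
# illustrative only.
from llama_cpp import Llama

model_path = "Phantasor-137M.Q4_K_M.gguf"  # e.g. the path returned by hf_hub_download

llm = Llama(
    model_path=model_path,
    n_ctx=2048,  # context window; tune to your workload and memory budget
)

result = llm("Once upon a time,", max_tokens=64, temperature=0.8)
print(result["choices"][0]["text"])
```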
Core Capabilities
- Fast inference with Q4_K variants
- High-quality output with Q6_K and Q8_0 versions
- Flexible deployment options for different hardware constraints
- Efficient memory usage through various quantization levels
Frequently Asked Questions
Q: What makes this model unique?
The model offers a comprehensive range of quantization options, allowing users to choose between different speed-quality tradeoffs. The Q4_K variants are particularly notable for their balance of performance and quality.
Q: What are the recommended use cases?
For general use, the Q4_K_S and Q4_K_M variants are recommended; Q8_0 is suggested for applications requiring the highest-quality output, and Q6_K offers very good quality at a smaller file size than Q8_0. A small lookup sketch of these recommendations follows.
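As an illustration only, these recommendations can be captured in a lookup table; the filenames below are hypothetical, following the same assumed naming pattern as in the earlier sketches.

```python
# Hypothetical mapping from use case to quant filename, based on the
# recommendations above. Filenames are assumptions -- verify against
# the repository's actual file list before downloading.
QUANT_FOR_USE_CASE = {
    "general":       "Phantasor-137M.Q4_K_M.gguf",  # balanced speed/quality
    "fast":          "Phantasor-137M.Q4_K_S.gguf",  # slightly smaller and faster
    "good_smaller":  "Phantasor-137M.Q6_K.gguf",    # very good quality, smaller than Q8_0
    "best_quant":    "Phantasor-137M.Q8_0.gguf",    # highest quality among the quants
}
```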