SicariusSicariiStuff_X-Ray_Alpha-GGUF

bartowski

A comprehensive collection of GGUF quantizations of the X-Ray_Alpha model, offering compression levels ranging from 1.54GB to 7.77GB with different quality-size tradeoffs.

Property             Value
Original Model       X-Ray_Alpha
Author               bartowski
Quantization Method  llama.cpp imatrix
Size Range           1.54GB - 7.77GB

What is SicariusSicariiStuff_X-Ray_Alpha-GGUF?

This is a comprehensive collection of GGUF quantizations of the X-Ray_Alpha model, created using llama.cpp's imatrix quantization technique. The collection offers various compression levels to accommodate different hardware capabilities and performance requirements, ranging from full BF16 weights (7.77GB) to highly compressed IQ2_M format (1.54GB).
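For a quick sense of the range, the compression ratio between the full BF16 weights and the smallest IQ2_M file follows directly from the sizes quoted above:

```python
# Rough size arithmetic for this collection (file sizes taken from the card).
FULL_BF16_GB = 7.77   # full BF16 weights
SMALLEST_GB = 1.54    # IQ2_M quantization

compression_ratio = FULL_BF16_GB / SMALLEST_GB
print(f"IQ2_M is ~{compression_ratio:.1f}x smaller than the BF16 weights")
```

So the most aggressive quant is roughly five times smaller than the unquantized model, which is what makes it viable on low-RAM hardware at some cost in output quality.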

Implementation Details

The model uses the following prompt format (note that system prompts are not supported):

<bos><start_of_turn>user
{prompt}<end_of_turn>
<start_of_turn>model
<end_of_turn>

The quantizations were created using llama.cpp release b4925.
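A minimal helper that assembles this template might look like the sketch below. The exact newline placement around the turn tokens is an assumption on my part (the card prints the template on one line); `format_prompt` is a hypothetical name, not part of any library:

```python
def format_prompt(user_message: str) -> str:
    """Build a prompt in the turn-token format stated in the card.

    Newline placement is assumed; the model does not support a
    system prompt, so only the user turn is emitted.
    """
    return (
        "<bos><start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = format_prompt("Summarize the GGUF format in one sentence.")
print(prompt)
```

The assembled string would then be passed as the raw prompt to llama.cpp (or a binding such as llama-cpp-python), with generation stopping on `<end_of_turn>`.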

  • Multiple quantization options offering different quality-size tradeoffs
  • Special versions with Q8_0 for embed and output weights
  • Support for online repacking for ARM and AVX CPU inference
  • Optimized versions for different hardware configurations

Core Capabilities

  • High-quality compression with Q6_K_L and Q5_K variants
  • Efficient memory usage with IQ3 and IQ4 variants
  • Automatic weight repacking for ARM and AVX systems
  • Flexible deployment options across different hardware configurations

Frequently Asked Questions

Q: What makes this model unique?

This model collection stands out for its comprehensive range of quantization options, letting users choose the balance between model size and output quality that best fits their hardware constraints. The imatrix quantization and the special Q8_0 handling of embedding/output weights help preserve quality at smaller sizes.

Q: What are the recommended use cases?

For most users, the Q4_K_M variant (2.49GB) is recommended as the default choice, offering good quality and reasonable size. For high-end systems, Q6_K_L (3.35GB) provides near-perfect quality, while users with limited RAM can opt for Q3_K_M (2.10GB) or IQ3_M (1.99GB) variants.
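The selection logic above can be sketched as a small lookup: pick the largest quant whose file fits in a RAM budget, leaving headroom for the KV cache and runtime overhead. The helper name, the 1GB headroom default, and the fit-in-RAM heuristic are illustrative assumptions; only the file sizes come from this card:

```python
# Quant sizes (GB) as listed in this card.
QUANTS = {
    "BF16":   7.77,
    "Q6_K_L": 3.35,
    "Q4_K_M": 2.49,
    "Q3_K_M": 2.10,
    "IQ3_M":  1.99,
    "IQ2_M":  1.54,
}

def pick_quant(ram_gb: float, headroom_gb: float = 1.0):
    """Hypothetical helper: largest quant fitting ram_gb minus headroom.

    Returns the quant name, or None if nothing fits.
    """
    budget = ram_gb - headroom_gb
    fitting = {name: size for name, size in QUANTS.items() if size <= budget}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)

print(pick_quant(4.0))   # a 4GB machine lands on the recommended Q4_K_M
```

Real-world choice also depends on GPU offload and context length, so treat this as a starting point rather than a rule.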
