Thor-v1.2-8b-1024k-i1-GGUF

Maintained By
mradermacher


Parameter Count: 8.03B
Model Type: GGUF Transformer
Context Length: 1024k tokens
Author: mradermacher

What is Thor-v1.2-8b-1024k-i1-GGUF?

Thor-v1.2-8b-1024k-i1-GGUF is a quantized version of the Thor language model, optimized for efficient inference. It is distributed as a set of quantized files ranging from 2.1GB to 6.7GB, so it can be matched to different hardware constraints while preserving as much quality as possible.

Implementation Details

The model utilizes imatrix quantization techniques and offers multiple compression levels, from IQ1_S (2.1GB) to Q6_K (6.7GB). It's specifically designed for deployment using the GGUF format, which enables efficient inference on consumer hardware.

  • Implements weighted/imatrix quantization methodology
  • Offers 23 different quantization variants
  • Supports extended context length of 1024k tokens
  • Features optimized performance on ARM architectures
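The relationship between file size and quantization level can be sanity-checked with simple arithmetic: dividing a variant's file size by the 8.03B parameter count gives an approximate bits-per-weight figure. A minimal sketch, using the sizes quoted on this card (the exact on-disk overhead of GGUF metadata is ignored here):

```python
# Approximate bits per weight for three quant variants listed on the card.
# Assumes file size is dominated by weight storage (GGUF metadata ignored).
PARAMS = 8.03e9  # parameter count from the model card

def bits_per_weight(size_gb: float) -> float:
    """Convert a file size in GB into an approximate bits-per-weight value."""
    return size_gb * 1e9 * 8 / PARAMS

for name, size_gb in [("IQ1_S", 2.1), ("Q4_K_M", 5.0), ("Q6_K", 6.7)]:
    print(f"{name}: ~{bits_per_weight(size_gb):.2f} bits/weight")
```

This shows why IQ1_S lands near the practical lower bound (about 2 bits per weight) while Q6_K sits close to 7, which is the trade-off the variant list spans.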

Core Capabilities

  • Efficient inference with various memory footprint options
  • Optimized performance on different hardware architectures
  • Balance between model size and quality through different quantization levels
  • Support for both standard and ARM-specific optimizations

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its variety of quantization options, particularly the imatrix quantizations, which offer superior quality-to-size ratios compared to traditional quantization methods. It's also notable for maintaining extended context length while being highly compressed.

Q: What are the recommended use cases?

The Q4_K_M variant (5.0GB) is the recommended default, offering the best balance of speed and quality. In resource-constrained environments, the IQ2_M variant (3.0GB) provides a reasonable compromise between size and output quality.
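Variant selection can be automated against a memory budget. The helper below is a hypothetical sketch (the function name and headroom figure are assumptions, not part of this model card); it picks the largest quant whose file size, plus some working headroom for the KV cache and runtime, fits in available RAM:

```python
# Hypothetical helper: choose the largest quant variant that fits a RAM budget.
# Sizes in GB are taken from the model card; the 1.0GB headroom is an assumption.
VARIANTS = {
    "IQ1_S": 2.1,
    "IQ2_M": 3.0,
    "Q4_K_M": 5.0,
    "Q6_K": 6.7,
}

def pick_variant(ram_gb, headroom_gb=1.0):
    """Return the largest variant whose size plus headroom fits in ram_gb, or None."""
    fitting = {name: size for name, size in VARIANTS.items()
               if size + headroom_gb <= ram_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)

print(pick_variant(8.0))   # enough room for the largest listed quant
print(pick_variant(4.0))   # constrained: falls back to a smaller variant
```

Note that long contexts inflate KV-cache memory well beyond the 1.0GB assumed here, so budgets should be adjusted upward when actually using the 1024k context window.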
