Thor-v1.1e-8b-1024k-i1-GGUF

Maintained by: mradermacher


Property          Value
---------------   --------------
Parameter Count   8.03B
Model Type        GGUF Quantized
Context Length    1024k tokens
Language          English

What is Thor-v1.1e-8b-1024k-i1-GGUF?

Thor-v1.1e-8b-1024k-i1-GGUF is a quantized version of the Thor language model, packaged in the GGUF format for efficient local inference. The repository provides quantization variants ranging from 2.1GB to 6.7GB, so users can choose a file that fits their hardware and quality requirements.
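
To fetch one of the quantized files programmatically, a minimal sketch using huggingface_hub is shown below. The repo id is assumed to mirror this page's title, and the filename is assumed to follow mradermacher's usual `<model>.<quant>.gguf` naming pattern — verify both against the repository's actual file listing.

```python
from huggingface_hub import hf_hub_download

# Assumed repo id and filename -- check the repository's file listing.
REPO_ID = "mradermacher/Thor-v1.1e-8b-1024k-i1-GGUF"
FILENAME = "Thor-v1.1e-8b-1024k.i1-Q4_K_M.gguf"  # the ~5.0GB variant recommended below

# Downloads into the local Hugging Face cache and returns the local path.
model_path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME)
print(model_path)
```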

Implementation Details

The repository provides imatrix ("i1") quantizations at multiple compression levels, from IQ1_S through Q6_K. The base model is a Transformers-library checkpoint; the GGUF files are intended for llama.cpp and compatible runtimes, which include optimizations for specific hardware architectures such as ARM processors (including those with SVE support).

  • Multiple quantization options (IQ1_S through Q6_K)
  • Size variants ranging from 2.1GB to 6.7GB
  • Optimized for different hardware architectures
  • 1024k context window support (see the loading sketch below)
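
As a loading sketch, the snippet below uses llama-cpp-python (one common runtime for GGUF files; any llama.cpp-compatible runtime works). `model_path` comes from the download sketch above, and `n_ctx` is kept well below the advertised maximum because KV-cache memory grows linearly with context (see the sizing note under Core Capabilities).

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# model_path points at the downloaded .gguf file (see the download sketch above).
llm = Llama(
    model_path=model_path,
    n_ctx=32_768,     # a practical window; the full 1024k needs far more memory
    n_gpu_layers=-1,  # offload all layers if built with GPU support; 0 for CPU-only
)
```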

Core Capabilities

  • Efficient inference with various quality-size tradeoffs
  • Hardware-specific optimizations for ARM processors
  • Support for extensive context processing (see the sizing sketch below)
  • Balanced performance across different quantization levels
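
The context claim deserves a caveat: the KV cache grows linearly with the number of tokens, so actually filling a 1024k window is far beyond typical desktop memory. A rough sizing sketch, assuming a Llama-3-style 8B layout (32 layers, 8 KV heads, head dim 128, fp16 cache; the base model's actual config may differ):

```python
# Back-of-the-envelope KV-cache sizing. All architecture numbers here are
# assumptions for a typical Llama-3-style 8B model, not confirmed for Thor.
n_layers, n_kv_heads, head_dim, bytes_per_elem = 32, 8, 128, 2

# Keys + values, per token, across all layers.
kv_bytes_per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
print(f"{kv_bytes_per_token // 1024} KiB per token")  # 128 KiB

for n_ctx in (8_192, 131_072, 1_048_576):
    gib = kv_bytes_per_token * n_ctx / 1024**3
    print(f"n_ctx={n_ctx:>9,}: ~{gib:,.0f} GiB of KV cache")
```

Under these assumptions, an 8k window costs about 1 GiB of cache, 128k about 16 GiB, and the full 1024k about 128 GiB, which is why the loading sketch above uses a smaller `n_ctx`.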

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its wide range of imatrix quantization levels, letting users pick the best balance between file size, output quality, and inference speed for their hardware.

Q: What are the recommended use cases?

For general use, the Q4_K_M variant (5.0GB) is recommended as it offers a good balance of speed and quality. For resource-constrained environments, the IQ2 variants provide acceptable quality at smaller sizes.
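
As a usage sketch with the model loaded as above, the snippet below runs a chat-style completion. This assumes the GGUF file embeds a chat template (most do); if not, the plain `llm(prompt)` completion call works instead.

```python
# Chat-style inference; `llm` is the Llama instance from the loading sketch.
result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the GGUF format in two sentences."}],
    max_tokens=128,
)
print(result["choices"][0]["message"]["content"])
```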
