MIstral-QUantized-70b_Miqu-1-70b-iMat.GGUF
| Property | Value |
|---|---|
| Parameter Count | 69B |
| Model Type | GGUF Quantized |
| Context Length | 32k tokens |
| Author | Nexesenex |
What is MIstral-QUantized-70b_Miqu-1-70b-iMat.GGUF?
This is a set of iMatrix-optimized GGUF quantizations of the Mistral Medium alpha model (miqu-1-70b), designed to offer quantization levels for a range of VRAM budgets while preserving as much of the original quality as possible. The model stands out for its exceptional French-language capabilities and balanced performance across multiple tasks.
Implementation Details
The model is published in multiple quantization variants optimized with an importance matrix (iMatrix), with VRAM requirements for full offload ranging from 16GB to 48GB. Key technical features include a RoPE theta of 1,000,000 (as in CodeLlama, rather than the 10,000 used by standard Llama 2 models) and a base context window of 32k tokens. A hedged loading sketch follows the list below.
- Full offload possible on 48GB VRAM: Q4_K_S and Q3_K_L variants
- 36GB VRAM support: Q3_K_M, Q3_K_S, Q3_K_XS, IQ3_XXS SOTA
- 24GB VRAM support: IQ2_XS SOTA
- 16GB VRAM support: IQ1_S variants
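As a rough illustration of picking a variant and offloading it, here is a minimal sketch using llama-cpp-python; the filename is hypothetical, and `n_gpu_layers=-1` assumes the chosen quant fits entirely in VRAM.

```python
# Minimal loading sketch with llama-cpp-python (pip install llama-cpp-python).
# The filename is hypothetical -- substitute whichever quant you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="miqu-1-70b.q4_k_s.gguf",  # hypothetical: a 48GB-class variant
    n_ctx=32768,      # the model's full 32k context window
    n_gpu_layers=-1,  # offload all layers; lower this on 24GB/16GB cards
)
print(llm("GGUF stands for", max_tokens=16)["choices"][0]["text"])
```

On tighter VRAM, the same call works with one of the IQ2/IQ1 variants and a reduced `n_gpu_layers`, trading quality for memory.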
Core Capabilities
- Superior French language understanding and generation
- Low perplexity (below 4 at a 512-token context; see the measurement sketch after this list)
- Strong benchmark results, e.g. 88.3% on HellaSwag for the IQ3_XXS variant
- Efficient VRAM usage with multiple quantization options
- 32k context window support
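To make the perplexity claim concrete, the sketch below approximates perplexity at a 512-token context via OpenAI-style echoed logprobs in llama-cpp-python. The upstream figures were produced with llama.cpp's own perplexity tool, so treat this as an approximation only; the filename and sample text are hypothetical.

```python
# Hedged sketch: approximate perplexity at a 512-token context.
# llama.cpp's perplexity tool is the authoritative measurement; this
# merely mimics it through echoed prompt logprobs.
import math
from llama_cpp import Llama

llm = Llama(
    model_path="miqu-1-70b.iq3_xxs.gguf",  # hypothetical filename
    n_ctx=512,
    logits_all=True,  # required so prompt-token logprobs are available
)

sample = "La Révolution française a profondément transformé la société."
res = llm.create_completion(
    prompt=sample,
    max_tokens=1,  # we only want the prompt itself scored
    echo=True,     # return the prompt tokens (and their logprobs) too
    logprobs=1,
)
lps = [lp for lp in res["choices"][0]["logprobs"]["token_logprobs"] if lp is not None]
print("approx. perplexity:", math.exp(-sum(lps) / len(lps)))
```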
Frequently Asked Questions
Q: What makes this model unique?
This model combines high performance with efficient quantization, offering exceptional French-language capabilities while maintaining reasonable alignment and low censorship. Its RoPE theta of 1,000,000 sets it apart from typical Llama 2 models.
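If you want to verify the theta for a given file, the GGUF metadata records it as `llama.rope.freq_base`. Below is a sketch using the `gguf` package maintained in the llama.cpp repository; the filename is hypothetical and the field-access details may vary across package versions.

```python
# Sketch: read the RoPE frequency base (theta) from GGUF metadata
# with the `gguf` package (pip install gguf).
from gguf import GGUFReader

reader = GGUFReader("miqu-1-70b.q4_k_s.gguf")  # hypothetical filename
field = reader.fields.get("llama.rope.freq_base")
if field is not None:
    # Scalar fields keep their value in the part indexed by data[0];
    # this layout may differ in other gguf-package versions.
    print("rope_freq_base:", float(field.parts[field.data[0]][0]))
```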
Q: What are the recommended use cases?
The model is particularly well-suited for French language applications, general conversation, and tasks requiring extended context understanding. Different quantization variants allow deployment across various hardware configurations, from consumer-grade 16GB GPUs to professional 48GB setups.
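As a closing usage illustration of the French-language strength, here is a minimal chat-style call with llama-cpp-python; the model path is hypothetical, and the chat template is taken from the GGUF's own metadata.

```python
# Minimal French chat sketch; the path is hypothetical and the chat
# format is resolved from the GGUF metadata by llama-cpp-python.
from llama_cpp import Llama

llm = Llama(model_path="miqu-1-70b.q4_k_s.gguf", n_ctx=32768, n_gpu_layers=-1)

res = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "Tu es un assistant francophone concis."},
        {"role": "user", "content": "Explique la quantification des LLM en deux phrases."},
    ],
    max_tokens=128,
)
print(res["choices"][0]["message"]["content"])
```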