MIstral-QUantized-70b_Miqu-1-70b-iMat.GGUF
| Property | Value |
|---|---|
| Parameter Count | 69B |
| Model Type | GGUF Quantized |
| Context Length | 32k tokens |
| Author | Nexesenex |
What is MIstral-QUantized-70b_Miqu-1-70b-iMat.GGUF?
This is a set of iMatrix-optimized GGUF quantizations of the Mistral Medium alpha model (miqu-1-70b), designed to offer quantization levels for a range of VRAM budgets while preserving as much of the original quality as possible. The model stands out for its exceptional French-language capabilities and balanced performance across multiple tasks.
Implementation Details
The model is published in multiple quantization variants optimized with an importance matrix (iMatrix), with VRAM requirements for full offload ranging from 16GB to 48GB. Key technical features include a RoPE theta of 1,000,000 (as in CodeLlama, rather than the 10,000 used by standard Llama 2 models) and a base context window of 32k tokens. A hedged loading sketch follows the list below.
- Full offload possible on 48GB VRAM: Q4_K_S and Q3_K_L variants
- 36GB VRAM support: Q3_K_M, Q3_K_S, Q3_K_XS, IQ3_XXS SOTA
- 24GB VRAM support: IQ2_XS SOTA
- 16GB VRAM support: IQ1_S variants
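As a rough illustration of picking a variant and offloading it, here is a minimal sketch using llama-cpp-python; the filename is hypothetical, and `n_gpu_layers=-1` assumes the chosen quant fits entirely in VRAM.

```python
# Minimal loading sketch with llama-cpp-python (pip install llama-cpp-python).
# The filename is hypothetical -- substitute whichever quant you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="miqu-1-70b.q4_k_s.gguf",  # hypothetical: a 48GB-class variant
    n_ctx=32768,      # the model's full 32k context window
    n_gpu_layers=-1,  # offload all layers; lower this on 24GB/16GB cards
)
print(llm("GGUF stands for", max_tokens=16)["choices"][0]["text"])
```

On tighter VRAM, the same call works with one of the IQ2/IQ1 variants and a reduced `n_gpu_layers`, trading quality for memory.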
Core Capabilities
- Superior French language understanding and generation
- Low perplexity (below 4 at a 512-token context; see the measurement sketch after this list)
- Strong benchmark results, e.g. 88.3% on HellaSwag for the IQ3_XXS variant
- Efficient VRAM usage with multiple quantization options
- 32k context window support
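To make the perplexity claim concrete, the sketch below approximates perplexity at a 512-token context via OpenAI-style echoed logprobs in llama-cpp-python. The upstream figures were produced with llama.cpp's own perplexity tool, so treat this as an approximation only; the filename and sample text are hypothetical.

```python
# Hedged sketch: approximate perplexity at a 512-token context.
# llama.cpp's perplexity tool is the authoritative measurement; this
# merely mimics it through echoed prompt logprobs.
import math
from llama_cpp import Llama

llm = Llama(
    model_path="miqu-1-70b.iq3_xxs.gguf",  # hypothetical filename
    n_ctx=512,
    logits_all=True,  # required so prompt-token logprobs are available
)

sample = "La Révolution française a profondément transformé la société."
res = llm.create_completion(
    prompt=sample,
    max_tokens=1,  # we only want the prompt itself scored
    echo=True,     # return the prompt tokens (and their logprobs) too
    logprobs=1,
)
lps = [lp for lp in res["choices"][0]["logprobs"]["token_logprobs"] if lp is not None]
print("approx. perplexity:", math.exp(-sum(lps) / len(lps)))
```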
Frequently Asked Questions
Q: What makes this model unique?
This model combines high performance with efficient quantization, offering exceptional French-language capabilities while maintaining reasonable alignment and low censorship. Its RoPE theta of 1,000,000 sets it apart from typical Llama 2 models.
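If you want to verify the theta for a given file, the GGUF metadata records it as `llama.rope.freq_base`. Below is a sketch using the `gguf` package maintained in the llama.cpp repository; the filename is hypothetical and the field-access details may vary across package versions.

```python
# Sketch: read the RoPE frequency base (theta) from GGUF metadata
# with the `gguf` package (pip install gguf).
from gguf import GGUFReader

reader = GGUFReader("miqu-1-70b.q4_k_s.gguf")  # hypothetical filename
field = reader.fields.get("llama.rope.freq_base")
if field is not None:
    # Scalar fields keep their value in the part indexed by data[0];
    # this layout may differ in other gguf-package versions.
    print("rope_freq_base:", float(field.parts[field.data[0]][0]))
```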
Q: What are the recommended use cases?
The model is particularly well-suited for French language applications, general conversation, and tasks requiring extended context understanding. Different quantization variants allow deployment across various hardware configurations, from consumer-grade 16GB GPUs to professional 48GB setups.
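As a closing usage illustration of the French-language strength, here is a minimal chat-style call with llama-cpp-python; the model path is hypothetical, and the chat template is taken from the GGUF's own metadata.

```python
# Minimal French chat sketch; the path is hypothetical and the chat
# format is resolved from the GGUF metadata by llama-cpp-python.
from llama_cpp import Llama

llm = Llama(model_path="miqu-1-70b.q4_k_s.gguf", n_ctx=32768, n_gpu_layers=-1)

res = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "Tu es un assistant francophone concis."},
        {"role": "user", "content": "Explique la quantification des LLM en deux phrases."},
    ],
    max_tokens=128,
)
print(res["choices"][0]["message"]["content"])
```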