# Mistral-Large-Instruct-2411-2.75bpw-h6-exl2
| Property | Value |
|---|---|
| Base Model | Mistral-Large-Instruct-2411 |
| Parameters | 123B |
| License | Mistral Research License (MRL) |
| Context Length | 32K tokens |
| Supported Languages | 10+ languages |
## What is Mistral-Large-Instruct-2411-2.75bpw-h6-exl2?
This model is an EXL2-quantized version of Mistral-Large-Instruct-2411, sized to fit on systems with 48GB of VRAM while retaining the capabilities of the original 123B-parameter model. It offers state-of-the-art reasoning, knowledge, and coding performance, along with improved long-context handling and function calling.
## Implementation Details
The model uses 2.75 bits-per-weight (bpw) quantization with a 6-bit output head ("h6"), enabling deployment on systems with limited VRAM while preserving most of the original model's quality. It supports a 32K context window with a Q4 (4-bit) KV cache, making it suitable for extensive document processing and complex reasoning tasks.
- Optimized for vLLM deployment
- Supports multiple deployment configurations
- Enhanced system prompt handling
- Native function calling capabilities
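As a rough sanity check on the 48GB VRAM claim, the weight footprint at 2.75 bpw can be estimated in a few lines of Python. This is a back-of-the-envelope sketch only: real memory usage also includes the KV cache, activations, and framework overhead.

```python
# Rough estimate of the weight footprint of a 123B-parameter model at 2.75 bpw.
params = 123e9          # parameter count of Mistral-Large-Instruct-2411
bits_per_weight = 2.75  # EXL2 quantization level of this build

weight_bytes = params * bits_per_weight / 8   # bits -> bytes
weight_gib = weight_bytes / 2**30             # bytes -> GiB

print(f"Weights: {weight_bytes / 1e9:.1f} GB ({weight_gib:.1f} GiB)")
# → Weights: 42.3 GB (39.4 GiB)
```

At roughly 39 GiB of weights, the model leaves headroom within 48GB VRAM for the Q4 KV cache at 32K context.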
## Core Capabilities
- Multilingual support for 10+ languages, including English, French, German, Spanish, Italian, Chinese, Japanese, Korean, and Portuguese
- Proficient coding abilities across 80+ programming languages
- Advanced mathematical and reasoning capabilities
- Robust context adherence for RAG applications
- Agent-centric design with native function calling
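In practice, native function calling means the model emits a structured tool call that the serving layer parses and dispatches. The sketch below is hypothetical: it assumes a Mistral-style response in which tool calls appear as a JSON list after a `[TOOL_CALLS]` marker, and the tool name and arguments (`get_weather`, `city`) are invented for illustration.

```python
import json

# Hypothetical raw model output in a Mistral-style tool-call format (illustrative only).
raw_output = '[TOOL_CALLS] [{"name": "get_weather", "arguments": {"city": "Paris"}}]'

def parse_tool_calls(text: str, marker: str = "[TOOL_CALLS]"):
    """Extract tool calls from a response that begins with the marker."""
    if not text.startswith(marker):
        return []  # plain-text answer, no tool call to dispatch
    return json.loads(text[len(marker):])

for call in parse_tool_calls(raw_output):
    # Here an agent loop would dispatch to the matching function.
    print(call["name"], call["arguments"])
```

An agent framework would feed the function's return value back to the model as a tool result, completing the call-and-response loop.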
## Frequently Asked Questions

**Q: What makes this model unique?**

This model stands out for its efficient quantization, which enables running a 123B-parameter model on systems with just 48GB of VRAM while maintaining high performance across multiple languages and tasks.

**Q: What are the recommended use cases?**

The model is designed for research purposes under the Mistral Research License. It excels at coding tasks, mathematical reasoning, and multilingual applications, particularly where context length and memory efficiency are crucial.