c4ai-command-r-plus-iMat.GGUF

Maintained By
dranger003

Parameter Count: 104B
License: cc-by-nc-4.0
Context Length: 131,072 tokens
Architecture: 64 layers with importance matrix quantization

What is c4ai-command-r-plus-iMat.GGUF?

c4ai-command-r-plus-iMat.GGUF is a quantized release of the C4AI Command R+ model that uses an importance matrix (iMatrix) to preserve quality during compression. This 104B parameter model is offered at multiple quantization levels, from IQ1_S to Q8_0, enabling deployment across a range of hardware configurations while maintaining competitive perplexity.

Implementation Details

The model uses the GGUF format with an importance matrix computed from roughly 100K tokens of wiki.train.raw. It offers multiple quantization options, with perplexity ranging from 4.3845 (FP16) to 8.2530 (IQ1_S), allowing users to trade model size against output quality.

  • Supports context length of 131,072 tokens
  • Implements BPE pre-tokenization
  • Provides quantization levels ranging in size from 21.59 GiB to 193.38 GiB
  • Utilizes importance matrix for enhanced compression efficiency
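
As a practical illustration, the following is a minimal sketch of loading one of the quantized files with llama-cpp-python. The filename, context size, and GPU offload setting are assumptions to adapt to whichever quantization you download and the memory you have available.

```python
# Minimal sketch: load a quantized build of c4ai-command-r-plus-iMat.GGUF
# with llama-cpp-python. The filename below is hypothetical -- point
# model_path at the file (or first shard) you actually downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="c4ai-command-r-plus-iq4_xs-00001-of-00002.gguf",  # hypothetical shard name
    n_ctx=16384,      # the model supports up to 131,072 tokens; smaller windows use less memory
    n_gpu_layers=-1,  # offload all 64 layers to the GPU when it fits; reduce on smaller cards
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the trade-offs of importance matrix quantization."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```

For quantizations distributed as multiple shards, recent llama.cpp builds can load the model from the first shard alone, which is why only one path appears above.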

Core Capabilities

  • Multilingual support for 10 languages including English, French, Spanish, and others
  • Advanced Retrieval Augmented Generation (RAG)
  • Multi-step tool use capabilities
  • Specialized in reasoning, summarization, and question answering
  • Supports split model loading for efficient resource management

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its use of importance matrix quantization, which enables highly efficient compression while preserving output quality. It offers quantization options for a range of hardware configurations, with IQ4_XS, IQ3_M/IQ3_S, and IQ2_M serving as sweet spots that balance size against performance.
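
One way to apply this guidance is to compare each file's on-disk size against your combined RAM/VRAM budget, since the weights' footprint in memory is close to their size on disk. The helper below is a sketch under that assumption; the quant names and the two sizes in the example call are placeholders taken from the size range quoted earlier, not the repository's actual per-file figures.

```python
# Sketch: pick the largest quantization that fits a memory budget.
# Replace the sizes in the example call with the real per-file figures from
# the repository listing; the two values used here are just the extremes
# quoted above (21.59 GiB and 193.38 GiB) under hypothetical names.
def pick_quant(sizes_gib: dict[str, float], budget_gib: float, overhead_gib: float = 4.0) -> str | None:
    """Return the largest quant whose weights plus a rough KV-cache/runtime
    allowance fit within the budget, or None if nothing fits."""
    fitting = {name: size for name, size in sizes_gib.items() if size + overhead_gib <= budget_gib}
    return max(fitting, key=fitting.get) if fitting else None

print(pick_quant({"smallest_quant": 21.59, "largest_quant": 193.38}, budget_gib=64.0))
# -> "smallest_quant" on a 64 GiB budget; a larger budget unlocks larger quants
```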

Q: What are the recommended use cases?

The model excels at sophisticated tasks requiring multi-step reasoning, retrieval-augmented generation, and tool use across multiple languages. It is particularly well suited to applications that need strong reasoning, summarization, and question answering while operating under a range of hardware constraints.
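
For instance, a lightweight retrieval-augmented flow can be sketched by pasting retrieved passages into the user turn and letting the chat template embedded in the GGUF handle formatting. Everything below (the passages, the question, and the llm object from the earlier loading sketch) is illustrative, not part of the model card.

```python
# Sketch of a retrieval-augmented call, reusing the `llm` object from the
# loading example above. The passages and question are placeholders for
# whatever your retriever returns.
retrieved_passages = [
    "Passage 1: ...",  # replace with retrieved text
    "Passage 2: ...",
]
context = "\n\n".join(retrieved_passages)
question = "What does the source material say about this topic?"

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "Answer using only the provided passages and say which passage supports each claim."},
        {"role": "user", "content": f"{context}\n\nQuestion: {question}"},
    ],
    max_tokens=400,
    temperature=0.3,
)
print(response["choices"][0]["message"]["content"])
```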
