calme-3.2-instruct-78b

Property	Value
Parameter Count	78B
Base Model	Qwen2.5-72B
Model Type	Instruction-tuned LLM
Hugging Face	MaziyarPanahi/calme-3.2-instruct-78b

What is calme-3.2-instruct-78b?

calme-3.2-instruct-78b is an advanced language model that builds upon the Qwen2.5-72B architecture through innovative model merging and custom fine-tuning. This experimental model demonstrates impressive capabilities across various benchmarks, achieving an average score of 52.02 on the OpenLLM leaderboard, with particularly strong performance in IFEval (80.63) and MMLU-PRO (70.03).

Implementation Details

The model utilizes the ChatML prompt template and offers multiple deployment options, including quantized versions in GGUF and EXL2 formats. It's implemented using the Hugging Face Transformers library and can be easily integrated using either pipeline or direct model loading approaches.

Custom fine-tuning on specialized datasets
Model merging technique for enhanced capabilities
Multiple quantization options (GGUF and EXL2 4.5 bpw)
ChatML prompt format support

Core Capabilities

Strong performance in zero-shot inference (IFEval: 80.63)
Robust reasoning abilities (BBH 3-shot: 62.61)
Advanced mathematical processing (MATH Lvl 5: 39.95)
Professional knowledge assessment (MMLU-PRO: 70.03)

Frequently Asked Questions

Q: What makes this model unique?

The model's uniqueness lies in its innovative approach of merging Qwen2.5-72B with itself and subsequent fine-tuning on custom datasets, resulting in enhanced performance across various tasks while maintaining flexibility through multiple quantization options.

Q: What are the recommended use cases?

Given its strong performance across multiple benchmarks, the model is well-suited for generic domain applications, particularly those requiring zero-shot inference, professional knowledge assessment, and complex reasoning tasks. However, as an experimental model, it's recommended to implement appropriate safeguards and human oversight in production environments.