calme-2.2-qwen2-7b

MaziyarPanahi

Fine-tuned version of Qwen2-7B with improved benchmark performance, featuring 7B parameters and ChatML prompt format. Achieves 23.23% average score on key benchmarks.

Property	Value
Base Model	Qwen2-7B
Parameter Count	7 Billion
Author	MaziyarPanahi
Format Support	GGUF Quantized Available

What is calme-2.2-qwen2-7b?

calme-2.2-qwen2-7b is a sophisticated fine-tuned version of the Qwen2-7B language model, designed to enhance performance across multiple benchmarks. This model implements the ChatML prompt template for structured interactions and offers both standard and quantized GGUF versions for flexible deployment options.

Implementation Details

The model utilizes a specific ChatML prompt format for interaction, structured with system, user, and assistant messages. It can be implemented using either the high-level pipeline API or direct model loading through Hugging Face's transformers library. The model architecture maintains the base Qwen2-7B structure while introducing optimizations for improved performance.

Supports both standard and GGUF quantized versions
Implements ChatML prompt template for structured interactions
Easy integration with transformers pipeline
Comprehensive benchmark evaluation results

Core Capabilities

IFEval (0-Shot): 35.97% accuracy
BBH (3-Shot): 33.11% performance
MATH Level 5 (4-Shot): 19.34% accuracy
MMLLU-PRO (5-shot): 32.21% performance
Average benchmark score: 23.23%

Frequently Asked Questions

Q: What makes this model unique?

This model stands out through its comprehensive optimization of the Qwen2-7B base model, achieving significant improvements across various benchmarks. It offers both standard and quantized versions, making it versatile for different deployment scenarios.

Q: What are the recommended use cases?

Given its benchmark performance, the model is well-suited for tasks requiring zero-shot to few-shot learning, particularly in areas like inference, mathematical reasoning, and professional knowledge testing. The availability of GGUF quantized versions makes it appropriate for resource-constrained environments.