HomerCreativeAnvita-Mix-Qw7B

Maintained By
suayptalha


  • Parameter Count: 7.62B
  • Model Type: Text Generation
  • Architecture: Qwen2-based Merged Model
  • Tensor Type: BF16

What is HomerCreativeAnvita-Mix-Qw7B?

HomerCreativeAnvita-Mix-Qw7B is a sophisticated merged language model created using the mergekit framework, combining two powerful Qwen2.5-7B variants. Currently ranked #1 on the Open LLM Leaderboard among models up to 13B parameters, it demonstrates exceptional performance across various tasks, particularly in instruction following.

Implementation Details

The model employs the SLERP merge method to combine ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix and ZeroXClem/Qwen2.5-7B-HomerCreative-Mix, using a carefully calibrated configuration with varying attention and MLP layer weights.
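A SLERP merge of this kind is typically driven by a mergekit YAML file. The fragment below is an illustrative sketch only: the model names match the merge described above, but the layer ranges and the per-filter `t` interpolation weights are placeholders, not the actual calibrated values used for this model.

```yaml
# Hypothetical mergekit SLERP configuration (illustrative weights, not the real recipe)
slices:
  - sources:
      - model: ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix
        layer_range: [0, 28]
      - model: ZeroXClem/Qwen2.5-7B-HomerCreative-Mix
        layer_range: [0, 28]
merge_method: slerp
base_model: ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix
parameters:
  t:
    - filter: self_attn        # attention layers get their own interpolation curve
      value: [0.0, 0.5, 0.3, 0.7, 1.0]
    - filter: mlp              # MLP layers use a different curve
      value: [1.0, 0.5, 0.7, 0.3, 0.0]
    - value: 0.5               # default for all remaining tensors
dtype: bfloat16
```

The per-filter `value` lists let the merge lean toward one parent model in some layers and the other parent in others, which is what "varying attention and MLP layer weights" refers to.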

  • Custom attention weight distribution across layers
  • Optimized MLP layer merging strategy
  • BFloat16 precision for efficient inference
  • 28-layer architecture with sophisticated merging patterns
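To make the merge method concrete, here is a minimal sketch of spherical linear interpolation (SLERP) on flattened weight tensors, the operation mergekit applies per-tensor. This is a simplified illustration using NumPy, not mergekit's actual implementation:

```python
import numpy as np

def slerp(t, a, b, eps=1e-8):
    """Spherical linear interpolation between two flattened weight tensors.

    t=0 returns a, t=1 returns b; intermediate t values move along the
    great-circle arc between the two (normalized) directions rather than
    the straight chord used by plain linear interpolation.
    """
    a_n = a / (np.linalg.norm(a) + eps)
    b_n = b / (np.linalg.norm(b) + eps)
    dot = np.clip(np.dot(a_n, b_n), -1.0, 1.0)
    theta = np.arccos(dot)
    if theta < eps:
        # Nearly parallel tensors: fall back to linear interpolation
        return (1 - t) * a + t * b
    sin_theta = np.sin(theta)
    return (np.sin((1 - t) * theta) / sin_theta) * a \
         + (np.sin(t * theta) / sin_theta) * b
```

Unlike naive averaging, SLERP preserves the geometric relationship between the two parents' weights, which is why it is a popular choice for merging closely related fine-tunes of the same base model.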

Core Capabilities

  • 78.08% accuracy on IFEval (0-shot)
  • 36.98% normalized accuracy on BBH (3-shot)
  • 31.04% exact match on MATH Level 5 (4-shot)
  • 38.28% accuracy on MMLU-PRO (5-shot)

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive SLERP merge configuration and balanced performance across diverse tasks, particularly its top ranking for sub-13B models, set it apart in the language model landscape.

Q: What are the recommended use cases?

Given its strong performance on instruction following and various academic tasks, it's well-suited for general text generation, educational applications, and complex reasoning tasks.
