# HomerCreativeAnvita-Mix-Qw7B
| Property | Value |
|---|---|
| Parameter Count | 7.62B |
| Model Type | Text Generation |
| Architecture | Qwen2-based Merged Model |
| Tensor Type | BF16 |
## What is HomerCreativeAnvita-Mix-Qw7B?
HomerCreativeAnvita-Mix-Qw7B is a merged language model created with the mergekit framework, combining two Qwen2.5-7B variants. At the time of writing, it ranked #1 on the Open LLM Leaderboard among models up to 13B parameters, with strong results across a range of benchmarks and a particular edge in instruction following.
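A minimal quick-start sketch using 🤗 Transformers. Note that the repository id below is an assumption inferred from the model name; verify it against the actual model page on the Hub.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# NOTE: repo id is assumed from the model name; check the Hub page.
model_id = "ZeroXClem/Qwen2.5-7B-HomerCreativeAnvita-Mix"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the model ships in BF16
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Explain SLERP model merging in two sentences."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```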
## Implementation Details
The model employs the SLERP (spherical linear interpolation) merge method to combine ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix and ZeroXClem/Qwen2.5-7B-HomerCreative-Mix, using a calibrated configuration that varies the interpolation weights across attention and MLP layers (a simplified sketch of the operation follows the list below).
- Custom attention weight distribution across layers
- Optimized MLP layer merging strategy
- BFloat16 precision for efficient inference
- 28-layer architecture with sophisticated merging patterns
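To make the merge method concrete, here is a minimal sketch of the SLERP operation on a pair of weight tensors. This is an illustration of the general technique, not mergekit's actual implementation, and the per-layer interpolation factors shown are hypothetical; the model's real configuration values are not reproduced here.

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors.

    The angle between the tensors is computed on normalized copies,
    then the original tensors are combined along the great arc.
    """
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    a_norm = a_flat / (a_flat.norm() + eps)
    b_norm = b_flat / (b_flat.norm() + eps)
    dot = torch.clamp(torch.dot(a_norm, b_norm), -1.0, 1.0)
    omega = torch.acos(dot)
    if omega.abs() < eps:
        # Nearly colinear tensors: plain linear interpolation is safer.
        return ((1 - t) * a_flat + t * b_flat).reshape(a.shape).to(a.dtype)
    so = torch.sin(omega)
    out = (torch.sin((1 - t) * omega) / so) * a_flat + (torch.sin(t * omega) / so) * b_flat
    return out.reshape(a.shape).to(a.dtype)

# Hypothetical example of varying t across the 28 layers, e.g. a
# different schedule for attention weights than for MLP weights.
num_layers = 28
attn_t = [0.3 + 0.4 * i / (num_layers - 1) for i in range(num_layers)]
mlp_t = [0.7 - 0.4 * i / (num_layers - 1) for i in range(num_layers)]
```

Varying `t` per layer is what lets a merge blend, say, one parent's attention behavior in early layers with the other parent's in later layers, rather than applying a single global mixing ratio.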
## Core Capabilities
- 78.08% accuracy on IFEval (0-Shot)
- 36.98% normalized accuracy on BBH (3-Shot)
- 31.04% exact match on MATH Level 5 (4-Shot)
- 38.28% accuracy on MMLU-PRO (5-Shot)
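To spot-check scores like these locally, one option is the lm-evaluation-harness Python API, sketched below. The repo id is an assumption as above, and the `leaderboard_*` task names are the Open LLM Leaderboard groups shipped in recent harness versions; they may differ in your installation.

```python
import lm_eval

# Assumed repo id and leaderboard task names; adjust to your setup.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=ZeroXClem/Qwen2.5-7B-HomerCreativeAnvita-Mix,dtype=bfloat16",
    tasks=["leaderboard_ifeval", "leaderboard_bbh"],
    batch_size=8,
)
print(results["results"])
```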
## Frequently Asked Questions
Q: What makes this model unique?
A: Its distinctive SLERP merge configuration and its balanced performance across diverse tasks, particularly its top ranking among sub-13B models, set it apart in the language model landscape.
Q: What are the recommended use cases?
A: Given its strong performance on instruction following and academic benchmarks, it is well-suited for general text generation, educational applications, and complex reasoning tasks.