# miscii-14b-0218
| Property | Value |
|---|---|
| Model Size | 14B parameters |
| Base Model | tempesthenno-ppo-enchanted |
| Hugging Face | Link |
| Architecture | Merged LLM (Model Stock method) |
## What is miscii-14b-0218?
miscii-14b-0218 is a 14B-parameter language model created by merging multiple fine-tuned checkpoints with the Model Stock merge method. It builds on the tempesthenno-ppo-enchanted base model and folds in several SFT checkpoints, aiming for a more robust and capable model than any single checkpoint.
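Since a merged model loads like any other causal LM, a minimal inference sketch with Hugging Face transformers is shown below. The repo path is a placeholder (the card does not give the full Hugging Face path), and the chat-template usage assumes the model ships one; treat this as a sketch, not an official usage snippet.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "miscii-14b-0218"  # placeholder; substitute the full Hugging Face repo path

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # the merge outputs bfloat16 weights
    device_map="auto",
)

# Assumes the repo includes a chat template; otherwise pass plain text to the tokenizer.
messages = [{"role": "user", "content": "Summarize the Model Stock merge method."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```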
## Implementation Details
The model was built with mergekit, with int8_mask and normalization enabled in the merge configuration. It uses bfloat16 as the output dtype and combines five checkpoint versions of the tempesthenno-sft-0218 model series; a configuration sketch follows the list below.
- Uses the Model Stock merge method
- Enables int8 masking and weight normalization in the merge configuration
- Combines five SFT checkpoint iterations from the tempesthenno-sft-0218 series
- Outputs merged weights in bfloat16 precision
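For reference, here is a minimal sketch of what such a Model Stock merge might look like when driven from Python. The checkpoint revision names are hypothetical; only the tempesthenno-sft-0218 series, the tempesthenno-ppo-enchanted base, the int8_mask/normalize settings, and the bfloat16 output dtype come from the card.

```python
import subprocess
import yaml

# Hedged sketch of a mergekit Model Stock configuration. The "-ckptN"
# revision suffixes are assumptions; the card only says five checkpoints
# of the tempesthenno-sft-0218 series were combined.
config = {
    "merge_method": "model_stock",
    "base_model": "tempesthenno-ppo-enchanted",
    "models": [{"model": f"tempesthenno-sft-0218-ckpt{i}"} for i in range(1, 6)],
    "dtype": "bfloat16",
    "parameters": {"int8_mask": True, "normalize": True},
}

with open("merge_config.yml", "w") as f:
    yaml.safe_dump(config, f)

# mergekit-yaml is mergekit's CLI entry point: config in, merged model out.
subprocess.run(["mergekit-yaml", "merge_config.yml", "./miscii-14b-0218"], check=True)
```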
## Core Capabilities
- Strong performance on IFEval (0-Shot): 76.56%
- Solid BBH (3-Shot) performance: 50.64%
- MATH Level 5 (4-Shot): 51.44%
- Overall average benchmark score: 42.90%
## Frequently Asked Questions
**Q: What makes this model unique?**
The model's uniqueness stems from merging multiple training checkpoints with the Model Stock method, combining the strengths of different training stages into a more robust model. It particularly excels at zero-shot instruction following, as demonstrated by its IFEval score.
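For intuition, the sketch below illustrates the per-layer interpolation described in the Model Stock paper (Jang et al., 2024): average the fine-tuned weights, then pull the result toward the base model by a ratio derived from the angle between the fine-tuned deltas. This is an illustration of the idea under the paper's standard formula, not mergekit's actual implementation.

```python
import numpy as np

def model_stock_layer(w_base: np.ndarray, w_finetuned: list) -> np.ndarray:
    """Conceptual Model Stock merge for one layer's weights."""
    k = len(w_finetuned)
    deltas = [w - w_base for w in w_finetuned]

    # Average pairwise cosine similarity between fine-tuned deltas.
    cos_vals = []
    for i in range(k):
        for j in range(i + 1, k):
            a, b = deltas[i].ravel(), deltas[j].ravel()
            cos_vals.append(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    cos_theta = float(np.mean(cos_vals))

    # Interpolation ratio from the Model Stock paper (assumed formula):
    # t = k*cos(theta) / (1 + (k - 1)*cos(theta)).
    t = k * cos_theta / (1 + (k - 1) * cos_theta)

    # Blend the checkpoint average toward the base model.
    w_avg = np.mean(w_finetuned, axis=0)
    return t * w_avg + (1 - t) * w_base
```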
**Q: What are the recommended use cases?**
Based on its benchmark results, the model is well suited to tasks requiring zero-shot instruction following and mathematical reasoning. Its BBH and MATH Level 5 scores point to capable multi-step problem solving, making it a reasonable fit for educational and analytical applications.