li-14b-v0.4-slerp0.1
| Property | Value |
|---|---|
| Base Model | li-14b-v0.4 |
| Merge Method | SLERP |
| Model Size | 14B parameters |
| HuggingFace URL | wanlige/li-14b-v0.4-slerp0.1 |
What is li-14b-v0.4-slerp0.1?
li-14b-v0.4-slerp0.1 is a merged language model created with the SLERP (Spherical Linear Interpolation) method. It combines two models, wanlige/li-14b-v0.4 and sthenno-com/miscii-14b-0218, and performs competitively across a range of benchmarks.
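To try the model, the standard Hugging Face transformers loading pattern should work. The snippet below is a minimal sketch, not taken from the model card itself: it assumes the repo id from the table above, the `transformers` and `torch` packages, and enough GPU memory for a 14B-parameter model in bfloat16.

```python
# Minimal sketch: load the merged model from the Hugging Face Hub.
# Assumes the repo id from the table above and sufficient GPU memory
# for a 14B-parameter model in bfloat16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "wanlige/li-14b-v0.4-slerp0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the merge outputs bfloat16 weights
    device_map="auto",
)

prompt = "Explain spherical linear interpolation in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```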
Implementation Details
The model uses a layer-wise merging strategy with varying interpolation weights across layers. The self-attention and MLP components use distinct mixing ratios, defined by a 25-point interpolation curve that gives fine-grained control over the merge. Merging is performed in float32, and the merged weights are written out in bfloat16 for efficiency; a sketch of the underlying interpolation follows the list below.
- Custom layer-wise interpolation with 48-layer architecture
- Specialized attention and MLP mixing ratios
- Advanced SLERP merging technique with precise weight control
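For intuition, SLERP interpolates two weight tensors along the arc of a hypersphere rather than along a straight line, which tends to preserve the magnitude and geometry of the weights better than plain averaging. The sketch below is an illustrative implementation of that formula, not the actual mergekit code used to produce this model; the function name, the flattening of tensors, and the fallback to linear interpolation for near-parallel vectors are all assumptions made for the example.

```python
# Illustrative SLERP between two weight tensors, merged in float32
# as described above. A sketch of the technique, not the actual
# merge implementation behind this model.
import torch

def slerp(w0: torch.Tensor, w1: torch.Tensor, t: float,
          eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation: t=0 returns w0, t=1 returns w1."""
    a = w0.float().flatten()
    b = w1.float().flatten()
    # Angle between the two flattened weight vectors.
    cos_omega = torch.dot(a, b) / (a.norm() * b.norm() + eps)
    omega = torch.acos(cos_omega.clamp(-1.0, 1.0))
    if omega.abs() < eps:
        # Nearly parallel vectors: fall back to linear interpolation.
        merged = (1.0 - t) * a + t * b
    else:
        sin_omega = torch.sin(omega)
        merged = (
            torch.sin((1.0 - t) * omega) / sin_omega * a
            + torch.sin(t * omega) / sin_omega * b
        )
    # Output in bfloat16, matching the merge configuration above.
    return merged.reshape(w0.shape).to(torch.bfloat16)
```

In a merge like this one, a separate `t` value per layer and per component (self-attention vs. MLP) would presumably be drawn from the 25-point interpolation curve mentioned above.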
Core Capabilities
- Outstanding performance in IFEval (0-Shot): 79.23%
- Strong mathematical reasoning abilities (MATH Lvl 5): 53.32%
- Solid performance in BBH (3-Shot): 50.88%
- Overall average score of 42.91 across benchmarks
Frequently Asked Questions
Q: What makes this model unique?
The model's uniqueness lies in its sophisticated SLERP merging strategy with carefully calibrated interpolation weights across different model components, resulting in balanced performance across various tasks.
Q: What are the recommended use cases?
Based on its benchmark performance, the model is strongest at instruction following (IFEval) and mathematical reasoning (MATH Lvl 5), making it well suited to applications that require solid analytical capabilities and zero-shot instruction following.
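As a concrete zero-shot example, a chat-style prompt can be run through the tokenizer's chat template. This sketch assumes the tokenizer ships a chat template (not confirmed by the card) and reuses the `model` and `tokenizer` loaded in the earlier snippet.

```python
# Sketch: zero-shot instruction following via the chat template.
# Assumes the tokenizer provides a chat template (an assumption) and
# that `model` and `tokenizer` are loaded as shown above.
messages = [{"role": "user", "content": "Solve: if 3x + 5 = 20, what is x?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```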