li-14b-v0.4-slerp0.1
| Property | Value |
|---|---|
| Base Model | li-14b-v0.4 |
| Merge Method | SLERP |
| Model Size | 14B parameters |
| HuggingFace URL | wanlige/li-14b-v0.4-slerp0.1 |
What is li-14b-v0.4-slerp0.1?
li-14b-v0.4-slerp0.1 is a merged language model created with the SLERP (Spherical Linear Interpolation) method. It combines two models, wanlige/li-14b-v0.4 and sthenno-com/miscii-14b-0218, and performs competitively across a range of benchmarks.
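To try the model, the standard Hugging Face transformers loading pattern should work. The snippet below is a minimal sketch, not taken from the model card itself: it assumes the repo id from the table above, the `transformers` and `torch` packages, and enough GPU memory for a 14B-parameter model in bfloat16.

```python
# Minimal sketch: load the merged model from the Hugging Face Hub.
# Assumes the repo id from the table above and sufficient GPU memory
# for a 14B-parameter model in bfloat16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "wanlige/li-14b-v0.4-slerp0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the merge outputs bfloat16 weights
    device_map="auto",
)

prompt = "Explain spherical linear interpolation in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```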
Implementation Details
The model uses a layer-wise merging strategy with varying interpolation weights across layers. The self-attention and MLP components use distinct mixing ratios, defined by a 25-point interpolation curve that gives fine-grained control over the merge. Merging is performed in float32, and the merged weights are written out in bfloat16 for efficiency; a sketch of the underlying interpolation follows the list below.
- Custom layer-wise interpolation with 48-layer architecture
- Specialized attention and MLP mixing ratios
- Advanced SLERP merging technique with precise weight control
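For intuition, SLERP interpolates two weight tensors along the arc of a hypersphere rather than along a straight line, which tends to preserve the magnitude and geometry of the weights better than plain averaging. The sketch below is an illustrative implementation of that formula, not the actual mergekit code used to produce this model; the function name, the flattening of tensors, and the fallback to linear interpolation for near-parallel vectors are all assumptions made for the example.

```python
# Illustrative SLERP between two weight tensors, merged in float32
# as described above. A sketch of the technique, not the actual
# merge implementation behind this model.
import torch

def slerp(w0: torch.Tensor, w1: torch.Tensor, t: float,
          eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation: t=0 returns w0, t=1 returns w1."""
    a = w0.float().flatten()
    b = w1.float().flatten()
    # Angle between the two flattened weight vectors.
    cos_omega = torch.dot(a, b) / (a.norm() * b.norm() + eps)
    omega = torch.acos(cos_omega.clamp(-1.0, 1.0))
    if omega.abs() < eps:
        # Nearly parallel vectors: fall back to linear interpolation.
        merged = (1.0 - t) * a + t * b
    else:
        sin_omega = torch.sin(omega)
        merged = (
            torch.sin((1.0 - t) * omega) / sin_omega * a
            + torch.sin(t * omega) / sin_omega * b
        )
    # Output in bfloat16, matching the merge configuration above.
    return merged.reshape(w0.shape).to(torch.bfloat16)
```

In a merge like this one, a separate `t` value per layer and per component (self-attention vs. MLP) would presumably be drawn from the 25-point interpolation curve mentioned above.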
Core Capabilities
- Outstanding performance in IFEval (0-Shot): 79.23%
- Strong mathematical reasoning abilities (MATH Lvl 5): 53.32%
- Solid performance in BBH (3-Shot): 50.88%
- Overall average score of 42.91 across benchmarks
Frequently Asked Questions
Q: What makes this model unique?
The model's uniqueness lies in its sophisticated SLERP merging strategy with carefully calibrated interpolation weights across different model components, resulting in balanced performance across various tasks.
Q: What are the recommended use cases?
Based on its benchmark performance, the model is strongest at instruction following (IFEval) and mathematical reasoning (MATH Lvl 5), making it well suited to applications that require solid analytical capabilities and zero-shot instruction following.
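As a concrete zero-shot example, a chat-style prompt can be run through the tokenizer's chat template. This sketch assumes the tokenizer ships a chat template (not confirmed by the card) and reuses the `model` and `tokenizer` loaded in the earlier snippet.

```python
# Sketch: zero-shot instruction following via the chat template.
# Assumes the tokenizer provides a chat template (an assumption) and
# that `model` and `tokenizer` are loaded as shown above.
messages = [{"role": "user", "content": "Solve: if 3x + 5 = 20, what is x?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```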