li-14b-v0.4-slerp0.1

Maintained by: wanlige

Base Model: li-14b-v0.4
Merge Method: SLERP
Model Size: 14B parameters
HuggingFace URL: wanlige/li-14b-v0.4-slerp0.1

What is li-14b-v0.4-slerp0.1?

li-14b-v0.4-slerp0.1 is a merged language model created with the SLERP (Spherical Linear Interpolation) method. It combines two models, wanlige/li-14b-v0.4 and sthenno-com/miscii-14b-0218, and the result performs well across a range of benchmarks (see Core Capabilities below).
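
For reference, below is a minimal sketch of loading the model with the Hugging Face transformers library. The prompt, generation settings, and the assumption that the tokenizer ships a chat template are illustrative, not taken from the model card.

```python
# Minimal sketch: load the merged model with Hugging Face transformers.
# Assumes a GPU with enough memory for a 14B model in bfloat16 and that
# the tokenizer provides a chat template (an assumption, not confirmed
# by the model card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "wanlige/li-14b-v0.4-slerp0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the merge is exported in bfloat16
    device_map="auto",
)

messages = [{"role": "user", "content": "Solve 12 * 17 step by step."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```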

Implementation Details

The model uses a layer-wise merging strategy with varying interpolation weights across layers. The self-attention and MLP components use distinct mixing ratios, with a 25-point interpolation curve providing fine-grained control over the merge. Merging is performed in float32 and the result is exported in bfloat16 for efficiency; a sketch of the underlying SLERP operation follows the list below.

  • Layer-wise interpolation across a 48-layer architecture
  • Separate mixing ratios for the attention and MLP components
  • SLERP merging with precise, per-layer weight control
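
For intuition, here is a minimal sketch of SLERP applied to a pair of weight tensors under the float32-merge / bfloat16-export convention described above. It is a generic illustration, not the exact merge implementation, and the example tensors are hypothetical.

```python
# Minimal sketch of SLERP between two weight tensors; not the exact
# merge implementation. Tensors are treated as flat vectors.
import torch

def slerp(t: float, w0: torch.Tensor, w1: torch.Tensor,
          eps: float = 1e-8) -> torch.Tensor:
    """Spherically interpolate between tensors w0 and w1 at fraction t."""
    # Merge in float32 for numerical stability, as the card describes.
    v0, v1 = w0.float(), w1.float()
    # Angle between the two weight vectors, from their normalized dot product.
    v0n = v0 / (v0.norm() + eps)
    v1n = v1 / (v1.norm() + eps)
    dot = torch.clamp((v0n * v1n).sum(), -1.0, 1.0)
    theta = torch.arccos(dot)
    if theta.abs() < eps:
        # Nearly parallel weights: fall back to plain linear interpolation.
        merged = (1.0 - t) * v0 + t * v1
    else:
        s = torch.sin(theta)
        merged = (torch.sin((1.0 - t) * theta) / s) * v0 \
               + (torch.sin(t * theta) / s) * v1
    # Export in bfloat16, matching the model's output format.
    return merged.to(torch.bfloat16)

# Hypothetical usage: blend one layer's weights at t = 0.1.
a = torch.randn(64, 64)
b = torch.randn(64, 64)
print(slerp(0.1, a, b).dtype)  # torch.bfloat16
```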

Core Capabilities

  • Strong instruction following: 79.23% on IFEval (0-shot)
  • Solid mathematical reasoning: 53.32% on MATH Lvl 5
  • Competitive general reasoning: 50.88% on BBH (3-shot)
  • Overall average of 42.91 across reported benchmarks (a sketch for re-running one of them follows below)
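
Benchmarks of this kind are commonly run with EleutherAI's lm-evaluation-harness. The sketch below is an assumption about a reasonable setup; the task name and settings may not match the exact configuration behind the reported scores.

```python
# Sketch: re-running the 0-shot IFEval benchmark with lm-evaluation-harness
# (pip install lm-eval). Task name and settings are assumptions and may not
# match the exact configuration behind the scores above.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=wanlige/li-14b-v0.4-slerp0.1,dtype=bfloat16",
    tasks=["ifeval"],
    num_fewshot=0,
)
print(results["results"]["ifeval"])
```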

Frequently Asked Questions

Q: What makes this model unique?

Its distinguishing feature is the SLERP merging strategy, with carefully calibrated interpolation weights across the model's components, which yields balanced performance across a variety of tasks.

Q: What are the recommended use cases?

Based on its benchmark results, the model is strongest at zero-shot instruction following and mathematical reasoning, making it well suited to applications that require solid analytical capabilities without task-specific examples.
