# LamarckInfusion-14B-v2
| Property | Value |
|---|---|
| Model Size | 14B parameters |
| Base Model | Lamarck-14B-v0.7-Fusion |
| Merge Method | SLERP with dual-slice architecture |
| Model URL | Hugging Face Repository |
## What is LamarckInfusion-14B-v2?
LamarckInfusion-14B-v2 is a merged language model that combines three parents: Lamarck-14B-v0.7-Fusion, Chocolatine-2-14B-Instruct-v2.0.3, and Qwenvergence-14B-v12-Prose-DS. It uses a SLERP (Spherical Linear Interpolation) merge with a dual-slice architecture, aiming for improved stability and prose quality while preserving the core capabilities of its parent models.
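For reference, SLERP interpolates along the arc between two weight vectors rather than along the straight line between them, which tends to preserve weight norms better than plain averaging. Below is a minimal numpy sketch of the formula as it might be applied to a pair of flattened weight tensors; it is illustrative only, not the merge tooling actually used for this model.

```python
import numpy as np

def slerp(t, p, q, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    t=0 returns p, t=1 returns q; intermediate values follow the
    great-circle arc between the two (flattened) tensors.
    """
    p_flat, q_flat = p.ravel(), q.ravel()
    p_unit = p_flat / (np.linalg.norm(p_flat) + eps)
    q_unit = q_flat / (np.linalg.norm(q_flat) + eps)
    cos_theta = np.clip(np.dot(p_unit, q_unit), -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < eps:
        # Nearly parallel tensors: fall back to linear interpolation.
        return (1.0 - t) * p + t * q
    sin_theta = np.sin(theta)
    return (np.sin((1.0 - t) * theta) / sin_theta) * p \
         + (np.sin(t * theta) / sin_theta) * q
```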
## Implementation Details
The model uses a two-slice SLERP merge with separately calibrated attention and MLP weightings. The first slice (layers 0-24) combines Lamarck with Chocolatine, while the second slice (layers 24-48) combines Lamarck with Qwenvergence; a configuration sketch follows the list below.
- Dual-slice architecture with separate attention and MLP weightings
- Merge arithmetic performed in float32, with weights written out in bfloat16
- Distinct self-attention and MLP interpolation weightings across model depth
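This card does not publish the exact interpolation weightings, so the following is only a sketch: a mergekit-style dual-slice SLERP layout expressed as a Python dict, with placeholder `t` values. The model names and layer ranges come from this card; the specific values and field layout are assumptions.

```python
# Illustrative dual-slice SLERP configuration (mergekit-style).
# Layer ranges and model names are from this card; the t values
# below are placeholders, NOT the published weightings.
merge_config = {
    "merge_method": "slerp",
    "dtype": "float32",       # merge arithmetic in float32 ...
    "out_dtype": "bfloat16",  # ... final weights written in bfloat16
    "slices": [
        {   # Slice 1 (layers 0-24): Lamarck + Chocolatine
            "sources": [
                {"model": "Lamarck-14B-v0.7-Fusion", "layer_range": [0, 24]},
                {"model": "Chocolatine-2-14B-Instruct-v2.0.3", "layer_range": [0, 24]},
            ],
            "parameters": {"t": [
                {"filter": "self_attn", "value": 0.5},  # placeholder
                {"filter": "mlp", "value": 0.5},        # placeholder
            ]},
        },
        {   # Slice 2 (layers 24-48): Lamarck + Qwenvergence
            "sources": [
                {"model": "Lamarck-14B-v0.7-Fusion", "layer_range": [24, 48]},
                {"model": "Qwenvergence-14B-v12-Prose-DS", "layer_range": [24, 48]},
            ],
            "parameters": {"t": [
                {"filter": "self_attn", "value": 0.5},  # placeholder
                {"filter": "mlp", "value": 0.5},        # placeholder
            ]},
        },
    ],
}
```

The `filter` entries mirror the card's claim that self-attention and MLP layers receive separate weightings within each slice.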
## Core Capabilities
- Stability and instruction following contributed by Chocolatine-2-14B-Instruct-v2.0.3
- Prose generation quality contributed by Qwenvergence-14B-v12-Prose-DS
- Eloquence carried over from the Lamarck base
- Balanced performance across tasks resulting from the two-slice fusion
## Frequently Asked Questions
**Q: What makes this model unique?**
Its dual-slice architecture and SLERP merge configuration set it apart, allowing it to combine the strengths of three different models while maintaining stability and coherence.
**Q: What are the recommended use cases?**
Given its balanced capabilities, the model is well-suited for general-purpose applications, particularly those requiring both instruction-following and high-quality prose generation.
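A hedged loading sketch using Hugging Face transformers is shown below. The repository id is a placeholder (see the Model URL above for the actual repository), and the chat-template call assumes the model inherits a Qwen-style chat template from its parents.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-org/LamarckInfusion-14B-v2"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # weights are stored in bfloat16
    device_map="auto",
)

messages = [{"role": "user", "content": "Write a short paragraph about tide pools."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=200)
# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```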