Violet_Twilight-v0.2
Property | Value |
---|---|
Author | Epiculous |
Model Type | SLERP Merged Language Model |
Base Models | Azure_Dusk-v0.2, Crimson_Dawn-v0.2 |
Training Format | ChatML |
Model URL | Hugging Face |
What is Violet_Twilight-v0.2?
Violet_Twilight-v0.2 is an innovative language model created through a sophisticated SLERP (Spherical Linear Interpolation) merge of two base models: Azure_Dusk-v0.2 and Crimson_Dawn-v0.2. The model demonstrates balanced performance across various evaluation metrics, with particularly strong results in IFEval testing.
Implementation Details
The model utilizes a carefully crafted merging configuration with specialized attention to layer distribution. The merge employs varying interpolation weights across different model components, with self-attention layers using values [0, 0.5, 0.3, 0.7, 1] and MLP layers using [1, 0.5, 0.7, 0.3, 0]. The model implementation uses bfloat16 dtype for efficient computation.
- Trained on ChatML format for consistent dialogue structure
- Implements specialized sampling settings including Smooth Creativity and Variant Chimera
- Layer-specific merge configuration across 40 layers
Core Capabilities
- Strong performance in IFEval (0-Shot): 45.32
- Balanced BBH (3-Shot) performance: 23.94
- Competitive MMLU-PRO (5-shot) results: 23.45
- Overall average performance: 18.53 across evaluation metrics
Frequently Asked Questions
Q: What makes this model unique?
The model's distinctive feature is its specialized SLERP merge configuration that combines the strengths of Azure_Dusk and Crimson_Dawn models using carefully calibrated interpolation weights for different layer types.
Q: What are the recommended use cases?
The model is particularly well-suited for dialogue applications using ChatML format, with strong performance in zero-shot and few-shot scenarios. It offers various sampling settings for different creativity levels and use cases.