cursa-o1-7b-v1.1
| Property | Value |
|---|---|
| Author | marcuscedricridia |
| Model Type | Merged Language Model |
| Base Architecture | 7B Parameters |
| Hugging Face URL | Link |
What is cursa-o1-7b-v1.1?
cursa-o1-7b-v1.1 is a language model created by merging two foundation models with the SLERP (Spherical Linear Interpolation) technique. The merge combines marcuscedricridia/pre-cursa-o1-v1.2 and marcuscedricridia/post-cursa-o1 into a single 7B-parameter model.
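The merged model can presumably be loaded like any other causal language model from the Hugging Face Hub. The snippet below is a minimal sketch using the transformers library; the repository id marcuscedricridia/cursa-o1-7b-v1.1 and the prompt are assumptions, not details taken from the model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id; confirm the exact path on the Hugging Face Hub.
repo_id = "marcuscedricridia/cursa-o1-7b-v1.1"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)

prompt = "Explain spherical linear interpolation in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```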
Implementation Details
The model is merged in bfloat16 precision using a layer-wise combination scheme. The merge configuration applies different interpolation weights to different components, with separate schedules for the self-attention and MLP layers (a sketch of the interpolation follows the list below).
- SLERP merge method across a 28-layer architecture
- Self-attention interpolation weights ranging from 0.0 to 1.0
- MLP interpolation weights running inversely, from 1.0 to 0.0
- Normalization layers balanced at a 0.5 weighting
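As an illustration of the technique, the sketch below implements SLERP between two parameter tensors and a layer-wise schedule matching the description above: self-attention weights ramping from 0.0 to 1.0, MLP weights ramping inversely from 1.0 to 0.0, and normalization layers fixed at 0.5. The linear ramp across the 28 layers is an assumption made for illustration; the actual merge configuration may use a different gradient.

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two flattened weight tensors."""
    v0_u = v0 / (np.linalg.norm(v0) + eps)
    v1_u = v1 / (np.linalg.norm(v1) + eps)
    dot = float(np.clip(np.dot(v0_u, v1_u), -1.0, 1.0))
    if abs(dot) > 0.9995:            # nearly colinear: fall back to plain LERP
        return (1.0 - t) * v0 + t * v1
    theta = np.arccos(dot)
    sin_theta = np.sin(theta)
    return (np.sin((1.0 - t) * theta) / sin_theta) * v0 + \
           (np.sin(t * theta) / sin_theta) * v1

num_layers = 28
for layer in range(num_layers):
    t_attn = layer / (num_layers - 1)   # self-attention: 0.0 -> 1.0 (assumed linear ramp)
    t_mlp = 1.0 - t_attn                # MLP: inverse schedule, 1.0 -> 0.0
    t_norm = 0.5                        # normalization layers: balanced at 0.5
    # In a real merge each t would be applied to the corresponding tensors of
    # pre-cursa-o1-v1.2 (v0) and post-cursa-o1 (v1), e.g.:
    # merged_attn = slerp(t_attn, attn_weights_v0, attn_weights_v1)
```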
Core Capabilities
- Language understanding drawing on the characteristics of both parent models
- Custom layer-wise weightings for attention and MLP components
- Balanced attention and processing mechanisms
- Union tokenizer configuration covering both parents' vocabularies (see the sketch after this list)
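A "union" tokenizer configuration typically means the merged model's vocabulary covers the tokens of both parent tokenizers. The snippet below is a purely conceptual sketch of that idea using toy vocabularies; the actual merge tooling handles token alignment and embedding resizing.

```python
# Toy illustration of a vocabulary union (hypothetical data, not the real tokenizers).
vocab_pre = {"<s>": 0, "hello": 1, "world": 2}
vocab_post = {"<s>": 0, "hello": 1, "merge": 2}

union_vocab = {tok: idx for idx, tok in enumerate(sorted(set(vocab_pre) | set(vocab_post)))}
print(union_vocab)  # {'<s>': 0, 'hello': 1, 'merge': 2, 'world': 3}
```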
Frequently Asked Questions
Q: What makes this model unique?
A: The model's uniqueness lies in its carefully crafted merge configuration, utilizing variable weights across different neural network components and implementing the SLERP method for optimal model combination.
Q: What are the recommended use cases?
A: While specific use cases aren't detailed in the model card, the architecture suggests it's suitable for general language tasks requiring balanced attention and processing capabilities.