Mistral-Small-Sisyphus-24b-2503
| Property | Value |
|---|---|
| Parameter Count | 24 billion |
| Base Model | Mistral |
| Training Template | v7-Tekken |
| Model URL | HuggingFace |
What is Mistral-Small-Sisyphus-24b-2503?
Mistral-Small-Sisyphus-24b-2503 is a fine-tuned version of the 24-billion-parameter Mistral base model, developed by allura-org. Originally trained for multi-turn instruction following, the model unexpectedly became oriented towards roleplay due to a configuration quirk during training.
Implementation Details
The model follows a Claude-style prompting convention with support for explicit reasoning blocks: when prompted to do so, it emits its intermediate reasoning inside `<think>` tags before the visible reply. It performs well across a range of temperature settings when paired with min-p or top-p sampling.
- Supports both reasoning and non-reasoning system prompts
- Uses the v7-Tekken instruct template (see the sketch after this list)
- Follows Claude-style interaction patterns
- Emits explicit reasoning inside `<think>` tags when prompted
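As a concrete sketch, the snippet below builds a prompt through the tokenizer's bundled chat template instead of writing the v7-Tekken special tokens by hand. The HuggingFace repo id and the system prompt wording are illustrative assumptions, not values taken from this card:

```python
from transformers import AutoTokenizer

# Hypothetical repo id; substitute the model's actual HuggingFace path.
MODEL_ID = "allura-org/Mistral-Small-Sisyphus-24b-2503"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

messages = [
    # Assumed wording for a "reasoning" system prompt; the card does not quote one.
    {"role": "system", "content": "Reason step by step inside <think> tags before answering."},
    {"role": "user", "content": "What is the capital of Australia?"},
]

# The tokenizer ships the model's chat template, so apply_chat_template
# inserts the correct v7-Tekken special tokens for us.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```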
Core Capabilities
- Multi-turn conversation handling
- Structured reasoning via `<think>` thought blocks (a parsing sketch follows this list)
- Coherent generation across a range of temperature settings when paired with min-p or top-p sampling
- Roleplay (an unintended but notable capability)
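Downstream code usually wants the final answer separated from the reasoning. The helper below is a minimal sketch that assumes the model emits at most one `<think>...</think>` block before its visible reply; the function name and sample string are illustrative:

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split a completion into (reasoning, answer) around one <think> block."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        # No thought block: the whole completion is the answer.
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()

reasoning, answer = split_reasoning(
    "<think>Canberra, not Sydney, is the capital.</think>The capital of Australia is Canberra."
)
print(reasoning)  # Canberra, not Sydney, is the capital.
print(answer)     # The capital of Australia is Canberra.
```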
Frequently Asked Questions
Q: What makes this model unique?
The model combines standard instruction following with explicit reasoning through `<think>` blocks, and it stays coherent across a range of temperature settings. Its unintended roleplay orientation adds a further dimension to its functionality.
Q: What are the recommended use cases?
The model is well-suited for applications requiring structured reasoning, multi-turn conversations, and potentially roleplay scenarios. It works best when temperature is paired with a truncation sampler such as min-p or top-p.
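As an end-to-end illustration, the sketch below wires such settings into a transformers generation call. The repo id and the specific temperature and min-p values are placeholder assumptions, since the card publishes no official numbers:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "allura-org/Mistral-Small-Sisyphus-24b-2503"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Summarize the plot of Hamlet in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# min-p keeps only tokens whose probability is at least min_p times that of
# the top token, which tolerates higher temperatures than plain sampling.
output = model.generate(
    inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=1.0,  # placeholder value
    min_p=0.05,       # placeholder value
)
print(tokenizer.decode(output[0, inputs.shape[-1]:], skip_special_tokens=True))
```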