Llama-3.3-70B-Vulpecula-r1

Sao10K

A 70B parameter LLaMA-based model optimized for creative writing and reasoning, featuring enhanced thinking capabilities and improved steerability

Property	Value
Base Model	Meta's LLaMA 3.3
Parameters	70B
License	Llama 3.3 Community License Agreement
Training Data	~270M Tokens
Authors	Sao10K & GradientPutri

What is Llama-3.3-70B-Vulpecula-r1?

Llama-3.3-70B-Vulpecula-r1 is an advanced language model built on Meta's LLaMA 3.3 architecture, specifically engineered for enhanced creative writing and reasoning capabilities. The model incorporates both Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) techniques, with inspiration drawn from Deepseek-R1's thinking-based approach.

Implementation Details

The model was trained over 2 epochs using approximately 270M tokens, with 210M being trainable. It employs the paged_ademamix_8bit optimizer with a cosine learning rate scheduler, utilizing a learning rate of 0.00002 and weight decay of 0.01. The training configuration includes a global batch size of 32, achieved through 4 GPUs with batch size 2 and gradient accumulation steps of 4.

Specialized thinking mode activated via prefix
Enhanced steerability and instruct-roleplay capabilities
Implements Llama-3-Instruct formatting
Optimized sampling parameters with temperature 0.75 and min_p 0.1

Core Capabilities

Advanced creative writing and storytelling
Improved reasoning and thinking processes
Natural chat and roleplaying interactions
Enhanced instruction following
Clean, high-quality outputs without toxic content

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its integration of thinking-based approaches inspired by Deepseek-R1, combined with carefully curated training data focused on creative writing and natural interactions. The optional thinking mode and improved steerability set it apart from standard language models.

Q: What are the recommended use cases?

This model excels in creative writing tasks, natural conversations, roleplaying scenarios, and situations requiring detailed reasoning. It's particularly well-suited for applications needing both creative and analytical capabilities while maintaining high-quality, clean outputs.