Llama-3.3-70B-Vulpecula-r1

Maintained By
Sao10K

Llama-3.3-70B-Vulpecula-r1

PropertyValue
Base ModelMeta's LLaMA 3.3
Parameters70B
LicenseLlama 3.3 Community License Agreement
Training Data~270M Tokens
AuthorsSao10K & GradientPutri

What is Llama-3.3-70B-Vulpecula-r1?

Llama-3.3-70B-Vulpecula-r1 is an advanced language model built on Meta's LLaMA 3.3 architecture, specifically engineered for enhanced creative writing and reasoning capabilities. The model incorporates both Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) techniques, with inspiration drawn from Deepseek-R1's thinking-based approach.

Implementation Details

The model was trained over 2 epochs using approximately 270M tokens, with 210M being trainable. It employs the paged_ademamix_8bit optimizer with a cosine learning rate scheduler, utilizing a learning rate of 0.00002 and weight decay of 0.01. The training configuration includes a global batch size of 32, achieved through 4 GPUs with batch size 2 and gradient accumulation steps of 4.

  • Specialized thinking mode activated via prefix
  • Enhanced steerability and instruct-roleplay capabilities
  • Implements Llama-3-Instruct formatting
  • Optimized sampling parameters with temperature 0.75 and min_p 0.1

Core Capabilities

  • Advanced creative writing and storytelling
  • Improved reasoning and thinking processes
  • Natural chat and roleplaying interactions
  • Enhanced instruction following
  • Clean, high-quality outputs without toxic content

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its integration of thinking-based approaches inspired by Deepseek-R1, combined with carefully curated training data focused on creative writing and natural interactions. The optional thinking mode and improved steerability set it apart from standard language models.

Q: What are the recommended use cases?

This model excels in creative writing tasks, natural conversations, roleplaying scenarios, and situations requiring detailed reasoning. It's particularly well-suited for applications needing both creative and analytical capabilities while maintaining high-quality, clean outputs.

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.