Sky-T1-32B-Preview
| Property | Value |
| --- | --- |
| Parameter Count | 32 billion |
| Base Model | Qwen2.5-32B-Instruct |
| Training Data | 17K verified responses |
| Model URL | HuggingFace |
| Developer | NovaSky Team, UC Berkeley |
What is Sky-T1-32B-Preview?
Sky-T1-32B-Preview is a reasoning model developed by the NovaSky Team at UC Berkeley's Sky Computing Lab. Built on Qwen2.5-32B-Instruct, it was fine-tuned on 17,000 verified correct responses, with a particular focus on coding and mathematical reasoning. The model performs comparably to o1-preview on a range of reasoning benchmarks.
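Since the base is Qwen2.5-32B-Instruct, the checkpoint loads like any other Qwen2.5-based model through Hugging Face transformers. Below is a minimal inference sketch; the repo ID is an assumption inferred from the developer name, so verify it against the model's Hugging Face page.

```python
# Minimal inference sketch with Hugging Face transformers.
# The repo ID is an assumption; verify it on the model's Hugging Face page.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NovaSky-AI/Sky-T1-32B-Preview"  # assumed repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~64 GB of weights at 32B parameters
    device_map="auto",           # shard across available GPUs
)

messages = [{"role": "user", "content": "How many positive divisors does 360 have?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# A generous max_new_tokens leaves room for step-by-step reasoning traces.
output = model.generate(inputs, max_new_tokens=2048, do_sample=False)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```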
Implementation Details
The model was trained with Llama-Factory using DeepSpeed ZeRO-3 Offload on 8 H100 GPUs, completing in 19 hours. Training used supervised fine-tuning (SFT) with a global batch size of 96.
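The report does not say how that global batch of 96 splits across the 8 GPUs; the sketch below shows one plausible decomposition, with the per-device batch size and gradient-accumulation steps purely assumed.

```python
# Hypothetical breakdown of the reported global batch size of 96 on 8 GPUs.
# per_device_batch and grad_accum_steps are assumptions, not reported values.
num_gpus = 8
per_device_batch = 1    # long reasoning traces leave little activation memory
grad_accum_steps = 12   # chosen so that 8 * 1 * 12 = 96

global_batch = num_gpus * per_device_batch * grad_accum_steps
assert global_batch == 96

# With 17K training examples, one epoch is roughly 177 optimizer steps.
steps_per_epoch = 17_000 // global_batch
print(f"global batch: {global_batch}, steps/epoch: {steps_per_epoch}")
```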
Reported benchmark results:
- Math500: 82.4%
- AIME2024: 43.3%
- LiveCodeBench: 86.3% Easy, 56.8% Medium, 17.9% Hard
- GPQA-Diamond (scientific reasoning): 56.8%
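Scores like these are usually computed by exact-matching a model's final answer against a reference. Below is a minimal sketch of such a check, assuming answers arrive wrapped in LaTeX `\boxed{...}` as is common on math benchmarks.

```python
import re

def extract_boxed(text: str) -> str | None:
    """Return the contents of the last \\boxed{...} in a completion.

    Simple sketch: does not handle nested braces.
    """
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1].strip() if matches else None

def score(completions: list[str], references: list[str]) -> float:
    """Fraction of completions whose boxed answer matches the reference."""
    correct = sum(
        extract_boxed(c) == r.strip() for c, r in zip(completions, references)
    )
    return correct / len(references)

# Toy usage: one right, one wrong -> 0.5
print(score([r"... so the answer is \boxed{24}.", r"\boxed{7}"], ["24", "8"]))
```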
Core Capabilities
- Advanced mathematical problem-solving
- Strong coding abilities across different difficulty levels
- Scientific reasoning and analysis
- Competitive performance with leading models in the field
Frequently Asked Questions
Q: What makes this model unique?
Sky-T1-32B-Preview stands out for its training on verified correct responses, particularly in mathematics and coding. It performs comparably to o1-preview while being fully open source, with a reported training cost of roughly $450.
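Training only on verified correct responses is, in effect, rejection sampling: sample candidate solutions, keep those whose final answer checks out, and fine-tune on the survivors. The sketch below shows that loop in the abstract; `generate_candidates` and `extract_answer` are hypothetical stand-ins, not the NovaSky pipeline.

```python
from typing import Callable

def curate_sft_data(
    problems: list[dict],  # each: {"question": ..., "answer": ...}
    generate_candidates: Callable[[str, int], list[str]],  # hypothetical sampler
    extract_answer: Callable[[str], str | None],
    samples_per_problem: int = 8,
) -> list[dict]:
    """Keep only generations whose final answer matches the reference."""
    dataset = []
    for p in problems:
        for completion in generate_candidates(p["question"], samples_per_problem):
            if extract_answer(completion) == p["answer"]:
                dataset.append({"prompt": p["question"], "response": completion})
                break  # one verified trace per problem is enough for SFT
    return dataset
```

Sampling several candidates per problem raises the odds that at least one trace verifies; that over-generation is where most of the curation compute goes.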
Q: What are the recommended use cases?
The model is best suited to mathematical problem-solving, coding, and scientific reasoning, performing well on challenging math problems and on coding tasks across a range of difficulty levels.