Sky-T1-32B-Preview

Property	Value
Parameter Count	32 Billion
Base Model	Qwen2.5-32B-Instruct
Training Data	17K verified responses
Model URL	HuggingFace
Developer	NovaSky Team, UC Berkeley

What is Sky-T1-32B-Preview?

Sky-T1-32B-Preview is a sophisticated reasoning model developed by the NovaSky Team at UC Berkeley's Sky Computing Lab. Built upon Qwen2.5-32B-Instruct, this model has been fine-tuned with 17,000 verified correct responses, focusing particularly on coding and mathematical reasoning capabilities. The model demonstrates performance comparable to o1-preview across various benchmarks.

Implementation Details

The model was trained using Llama-Factory with DeepSpeed Zero-3 Offload on 8 H100 GPUs, completing training in 19 hours. The training process utilized a batch size of 96 and incorporated supervised fine-tuning techniques.

Achieves 82.4% on Math500 benchmark
Scores 43.3% on AIME2024
Demonstrates strong performance in LiveCodeBench with 86.3% on Easy, 56.8% on Medium, and 17.9% on Hard tasks
Shows robust scientific reasoning with 56.8% accuracy on GPQA-Diamond

Core Capabilities

Advanced mathematical problem-solving
Strong coding abilities across different difficulty levels
Scientific reasoning and analysis
Competitive performance with leading models in the field

Frequently Asked Questions

Q: What makes this model unique?

Sky-T1-32B-Preview stands out for its focused training on verified correct responses, particularly in mathematics and coding. It achieves performance comparable to o1-preview while being fully open-source and trained with a relatively modest budget of $450.

Q: What are the recommended use cases?

The model is particularly well-suited for mathematical problem-solving, coding tasks, and scientific reasoning applications. It performs especially well on complex mathematical challenges and coding problems across various difficulty levels.