MetaStone-L1-7B

Property	Value
Parameter Count	7 Billion
Base Model	DeepSeek-R1-Distill-Qwen-7B
Training Method	GRPO
Paper	Graph-Based Synthetic Data Pipeline
Model Link	Hugging Face

What is MetaStone-L1-7B?

MetaStone-L1-7B is a specialized lite reasoning model designed to excel in complex downstream tasks, particularly in mathematics and coding. Built on DeepSeek-R1-Distill-Qwen-7B architecture, it achieves state-of-the-art results among parallel-level models and demonstrates performance comparable to prominent API models like Claude-3.5-Sonnet-1022 and GPT4o-0513.

Implementation Details

The model implementation requires the latest version of transformers (4.48.3) and follows specific optimization guidelines for maximum performance. It utilizes a unique approach with think tags and standardized output formats for different task types.

Enhanced thoughtful output using think tags
Standardized input format with user and assistant markers
Optimized temperature (0.6) and top sampling (0.95)
Maximum generation length of 32k tokens

Core Capabilities

Advanced mathematical reasoning with step-by-step solutions
High-performance code generation and problem-solving
Structured output formatting for math and coding tasks
Large context window handling

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its specialized focus on reasoning tasks, particularly in mathematics and coding, achieving SOTA results through its graph-based synthetic data pipeline and optimization techniques.

Q: What are the recommended use cases?

The model excels in mathematical problem-solving, coding tasks, and situations requiring structured reasoning. It's particularly effective when used with standardized prompts for math problems (using \boxed{} for answers) and code generation (using specific formatting guidelines).

MetaStone-L1-7B

MetaStone-L1-7B

What is MetaStone-L1-7B?

Implementation Details

Core Capabilities

Frequently Asked Questions

Q: What makes this model unique?

Q: What are the recommended use cases?

Related Models