MetaStone-L1-7B

Maintained By
MetaStoneTec

MetaStone-L1-7B

PropertyValue
Parameter Count7 Billion
Base ModelDeepSeek-R1-Distill-Qwen-7B
Training MethodGRPO
PaperGraph-Based Synthetic Data Pipeline
Model LinkHugging Face

What is MetaStone-L1-7B?

MetaStone-L1-7B is a specialized lite reasoning model designed to excel in complex downstream tasks, particularly in mathematics and coding. Built on DeepSeek-R1-Distill-Qwen-7B architecture, it achieves state-of-the-art results among parallel-level models and demonstrates performance comparable to prominent API models like Claude-3.5-Sonnet-1022 and GPT4o-0513.

Implementation Details

The model implementation requires the latest version of transformers (4.48.3) and follows specific optimization guidelines for maximum performance. It utilizes a unique approach with think tags and standardized output formats for different task types.

  • Enhanced thoughtful output using think tags
  • Standardized input format with user and assistant markers
  • Optimized temperature (0.6) and top sampling (0.95)
  • Maximum generation length of 32k tokens

Core Capabilities

  • Advanced mathematical reasoning with step-by-step solutions
  • High-performance code generation and problem-solving
  • Structured output formatting for math and coding tasks
  • Large context window handling

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its specialized focus on reasoning tasks, particularly in mathematics and coding, achieving SOTA results through its graph-based synthetic data pipeline and optimization techniques.

Q: What are the recommended use cases?

The model excels in mathematical problem-solving, coding tasks, and situations requiring structured reasoning. It's particularly effective when used with standardized prompts for math problems (using \boxed{} for answers) and code generation (using specific formatting guidelines).

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.