Skywork-Reward-Gemma-2-27B-v0.2

Maintained By
Skywork

Skywork-Reward-Gemma-2-27B-v0.2

PropertyValue
Parameter Count27.2B
Model TypeText Classification
ArchitectureGemma-2 Base
PaperSkywork-Reward: Bag of Tricks for Reward Modeling in LLMs
LicenseSkywork Community License

What is Skywork-Reward-Gemma-2-27B-v0.2?

Skywork-Reward-Gemma-2-27B-v0.2 is a state-of-the-art reward model built on Google's Gemma-2-27b-it architecture. It's designed to evaluate and score text responses, trained on a carefully curated dataset of 80K high-quality preference pairs. The model currently ranks first on the RewardBench leaderboard with a remarkable score of 94.3.

Implementation Details

The model utilizes BF16 precision and requires either flash_attention_2 or eager implementation for optimal performance. It's trained on the Skywork Reward Data Collection, which includes data from multiple high-quality sources like HelpSteer2, OffsetBias, WildGuard, and the Magpie DPO series.

  • Specialized scoring mechanism for preference evaluation
  • Optimized for complex scenarios including mathematics, coding, and safety
  • Implements advanced data curation techniques for balanced domain coverage

Core Capabilities

  • Superior performance in chat evaluation (96.1% accuracy)
  • Excellent reasoning capabilities (98.1% accuracy)
  • Strong safety evaluation metrics (93.0% accuracy)
  • Robust handling of challenging conversational scenarios (89.9% accuracy in Chat Hard)

Frequently Asked Questions

Q: What makes this model unique?

The model achieves state-of-the-art performance using only 80K carefully curated training pairs, demonstrating that high-quality data curation can outperform larger but less refined datasets. It's particularly notable for its balanced performance across different evaluation domains.

Q: What are the recommended use cases?

The model is ideal for evaluating AI-generated responses, particularly in scenarios requiring complex reasoning, safety assessment, and quality judgment of conversational outputs. It's especially useful for researchers and developers working on AI alignment and quality assessment.

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.