Selene-1-Mini-Llama-3.1-8B

Maintained By
AtlaAI

Selene-1-Mini-Llama-3.1-8B

PropertyValue
DeveloperAtlaAI
Base ModelLlama-3.1-8B
Context Length128K tokens
LanguagesEnglish, German, French, Italian, Portuguese, Hindi, Spanish, Thai
PaperarXiv:2501.17195

What is Selene-1-Mini-Llama-3.1-8B?

Selene-1-Mini-Llama-3.1-8B is a state-of-the-art small language model-as-a-judge (SLMJ) that has been specifically designed for evaluation tasks. Despite its relatively compact size, it achieves performance comparable to models 10 times larger, including outperforming GPT-4 on specialized benchmarks like RewardBench, EvalBiasBench, and AutoJ.

Implementation Details

The model is post-trained from Llama-3.1-8B and has been optimized across various evaluation tasks and scoring criteria. It implements the Llama 3 conversation template and requires proper template application for optimal performance. The model can be easily deployed using Hugging Face Transformers library and supports both CPU and GPU deployment.

  • Built on Llama-3.1-8B architecture
  • Supports 128K context length for comprehensive evaluation
  • Implements structured evaluation outputs
  • Provides qualitative critiques with reasoning

Core Capabilities

  • Absolute scoring evaluations (1-5 scale ratings)
  • Binary classification tasks
  • Pairwise preference analysis
  • Multi-language support for major global languages
  • RAG hallucination detection
  • Structured evaluation outputs with reasoning

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its ability to match the performance of much larger models while maintaining a smaller parameter count of 8B. It's specifically designed for evaluation tasks and achieves state-of-the-art results on multiple benchmarks.

Q: What are the recommended use cases?

The model is ideal for evaluation tasks including response quality assessment, harmlessness evaluation, logical consistency checking, and RAG hallucination detection. It can be used for both absolute scoring and comparative analysis of responses.

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.