Llama-3-Instruct-8B-SPPO-Iter3

Llama-3-Instruct-8B-SPPO-Iter3

UCLA-AGI

An 8B parameter LLM based on Llama-3, fine-tuned using Self-Play Preference Optimization over 3 iterations. Shows strong performance on instruction-following tasks with 68.28% accuracy on IFEval.

PropertyValue
Parameter Count8.03B
LicenseApache-2.0
Base Modelmeta-llama/Meta-Llama-3-8B-Instruct
Research PaperSelf-Play Preference Optimization

What is Llama-3-Instruct-8B-SPPO-Iter3?

This is an advanced language model developed by UCLA-AGI using Self-Play Preference Optimization (SPPO) methodology. It represents the third iteration of improvements on the Meta-Llama-3-8B-Instruct base model, trained using the UltraFeedback dataset for enhanced instruction-following capabilities.

Implementation Details

The model utilizes a sophisticated training approach with specific hyperparameters including a learning rate of 5e-07, RMSProp optimizer, and linear learning rate scheduling. Training was conducted across 8 devices using DeepSpeed ZeRO-3 optimization.

  • Trained on synthetic datasets derived from openbmb/UltraFeedback
  • Implements three-iteration SPPO methodology
  • Uses BF16 tensor type for efficient computation

Core Capabilities

  • Achieves 68.28% accuracy on IFEval (0-Shot)
  • Shows 29.74% normalized accuracy on BBH (3-Shot)
  • Demonstrates consistent improvement over previous iterations with 39.85% win rate on AlpacaEval
  • Performs well on multiple benchmarks including arc_challenge (65.19%) and hellaswag (80.86%)

Frequently Asked Questions

Q: What makes this model unique?

The model's uniqueness lies in its iterative SPPO training approach, showing progressive improvements across three iterations, particularly in instruction-following tasks and general language understanding.

Q: What are the recommended use cases?

This model is particularly well-suited for instruction-following tasks, general text generation, and applications requiring strong language understanding capabilities in English. It performs especially well in scenarios requiring precise following of instructions.

Socials
PromptLayer
Company
All services online
Location IconPromptLayer is located in the heart of New York City
PromptLayer © 2026