notus-7b-v1

Maintained By
argilla

| Property | Value |
|---|---|
| Parameter Count | 7.24B |
| Model Type | GPT-like, DPO fine-tuned |
| Base Model | zephyr-7b-sft-full |
| License | MIT |
| Training Data | UltraFeedback (binarized preferences) |

What is notus-7b-v1?

Notus-7b-v1 is a language model built on the Zephyr architecture and fine-tuned with Direct Preference Optimization (DPO) on a curated version of the UltraFeedback dataset. It is a strong performer in the 7B parameter class, outperforming Zephyr-7B-beta and roughly matching Claude 2 on chat benchmarks such as AlpacaEval.

Implementation Details

The model takes a data-first approach: preference pairs are derived from UltraFeedback's per-response preference ratings (binarized into chosen/rejected pairs) rather than the dataset's overall critique scores. It was trained on 8× A100 40GB GPUs and uses the same prompt template as Zephyr-7B-beta, so existing Zephyr prompts and tooling work unchanged.

  • Achieves 91.42% win rate on AlpacaEval
  • Scores 7.30 on MT-Bench
  • Outperforms base model on multiple academic benchmarks
  • Trained and run in BF16 precision for efficient inference
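Because Notus reuses Zephyr-7B-beta's prompt template, prompts can be assembled as below. This is a sketch of the Zephyr-style chat format with `<|system|>`, `<|user|>`, and `<|assistant|>` turns; in practice, `tokenizer.apply_chat_template` from the transformers library produces the same string automatically.

```python
def build_zephyr_prompt(system: str, user: str) -> str:
    """Build a single-turn prompt in the Zephyr/Notus chat format.

    Each turn opens with a role tag on its own line and closes with </s>;
    the prompt ends with an open <|assistant|> turn for the model to fill.
    """
    return (
        f"<|system|>\n{system}</s>\n"
        f"<|user|>\n{user}</s>\n"
        f"<|assistant|>\n"
    )

prompt = build_zephyr_prompt(
    "You are a helpful assistant.",
    "What is Direct Preference Optimization?",
)
print(prompt)
```

The resulting string can be passed to any completion endpoint serving the model; for multi-turn chat, additional `<|user|>`/`<|assistant|>` turns are appended in the same pattern.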

Core Capabilities

  • Strong performance in conversational AI tasks
  • Enhanced reasoning capabilities (64.59% on ARC challenge)
  • Improved truthfulness compared to base model
  • Superior performance on general knowledge tasks (63.03% on MMLU)

Frequently Asked Questions

Q: What makes this model unique?

The model's uniqueness lies in its data-centric approach to fine-tuning, using binarized preferences from UltraFeedback instead of traditional critique scores, resulting in significantly improved performance across multiple benchmarks.

Q: What are the recommended use cases?

The model excels in chat-like applications and assistant-style interactions, making it ideal for conversational AI, general knowledge tasks, and reasoning challenges. It's particularly well-suited for applications requiring both accuracy and natural conversation flow.
