# OpenOrcaxOpenChat-Preview2-13B
| Property | Value |
|---|---|
| Base Model | Llama2-13B |
| License | Llama2 |
| Training Data | OpenOrca Dataset (GPT-4 augmented) |
| Primary Paper | Orca: Progressive Learning from Complex Explanation Traces of GPT-4 |
## What is OpenOrcaxOpenChat-Preview2-13B?
OpenOrcaxOpenChat-Preview2-13B is an open-source language model fine-tuned on a curated subset of the OpenOrca dataset. It reproduces and surpasses the benchmark results of the original Microsoft Research Orca paper while using less than 10% of the original compute, demonstrating that careful data curation and efficient training can substitute for raw scale.
## Implementation Details
The model was trained using 8x A100-80G GPUs for 46 hours, completing 5 epochs of full fine-tuning. It utilizes OpenChat's MultiPack algorithm, achieving 99.85% bin-packing efficiency on the dataset. The training was completed with a commodity cost of approximately $600, making it significantly more efficient than the original Orca implementation.
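MultiPack's actual packing algorithm is more sophisticated, but a simple first-fit-decreasing sketch illustrates what "bin-packing efficiency" measures: the fraction of each context window filled with real sequence tokens rather than padding. The sequence lengths and the 2048-token window capacity below are illustrative assumptions, not values from the training run.

```python
def pack_sequences(lengths, capacity):
    """First-fit-decreasing: place each sequence in the first bin with room."""
    bins = []
    for length in sorted(lengths, reverse=True):
        for b in bins:
            if sum(b) + length <= capacity:
                b.append(length)
                break
        else:
            bins.append([length])  # open a new bin (a fresh context window)
    return bins

def packing_efficiency(bins, capacity):
    """Fraction of total bin capacity occupied by real tokens (rest is padding)."""
    return sum(sum(b) for b in bins) / (len(bins) * capacity)

# Illustrative token counts packed into hypothetical 2048-token windows
bins = pack_sequences([1800, 1200, 900, 700, 400, 300, 200], capacity=2048)
```

With near-perfect packing (99.85% reported), almost no compute is wasted on padding tokens, which is a large part of how the training cost stayed near $600.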
- Surpasses original Orca performance (103% of its scores) with less than 20% of the dataset size
- Achieves the #1 position on both the HuggingFaceH4 and GPT4ALL leaderboards for 13B models
- Uses the OpenChat Llama2 V1 prompt template
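A minimal sketch of how a prompt might be assembled for the OpenChat Llama2 V1 template mentioned above. The exact layout (the `User:`/`Assistant:` prefixes and the `<|end_of_turn|>` separator) is an assumption based on OpenChat's conventions; consult the model's tokenizer configuration for the authoritative format.

```python
END_OF_TURN = "<|end_of_turn|>"  # assumed turn-separator token

def format_openchat_v1(messages):
    """Assemble a prompt from (role, text) turns.

    A trailing 'Assistant:' cues the model to generate its reply.
    """
    parts = []
    for role, text in messages:
        prefix = "User:" if role == "user" else "Assistant:"
        parts.append(f"{prefix} {text}{END_OF_TURN}")
    parts.append("Assistant:")
    return "".join(parts)

prompt = format_openchat_v1([("user", "Explain bin packing in one sentence.")])
```

The formatted string would then be tokenized and passed to the model as-is; using a different template than the one the model was fine-tuned with typically degrades response quality.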
## Core Capabilities
- Strong performance on BigBench-Hard (0.488 average)
- Impressive AGIEval results (0.447 average)
- Exceeds performance of larger models like falcon-40b-instruct
- Specialized in complex reasoning and expert-level responses
## Frequently Asked Questions
Q: What makes this model unique?
This model achieves state-of-the-art performance for its size class while using significantly fewer computational resources than competitors. It demonstrates that efficient training methods and careful dataset curation can lead to superior results without requiring massive computational resources.
Q: What are the recommended use cases?
The model excels at complex reasoning tasks, mathematical problems, and expert-level responses across various domains. It's particularly well-suited for applications requiring strong analytical capabilities and detailed, step-by-step reasoning.