OpenOrca-Platypus2-13B-GGML

Maintained By
TheBloke

Property      Value
Model Type    GGML Quantized LLM
Base Model    OpenOrca-Platypus2-13B
License       Llama 2
Paper         Platypus Paper

What is OpenOrca-Platypus2-13B-GGML?

OpenOrca-Platypus2-13B-GGML is a quantized version of the OpenOrca-Platypus2-13B model, packaged in the GGML format for efficient CPU and GPU inference. The base model is a merge of Platypus2-13B and OpenOrcaxOpenChat-Preview2-13B, combining the former's STEM and logical reasoning strengths with the latter's general instruction-following abilities.

Implementation Details

The model is available in multiple quantization formats ranging from 2-bit to 8-bit precision, offering different tradeoffs between model size, performance, and resource usage. The quantized versions range from 5.74GB (q2_K) to 13.83GB (q8_0) in size.
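
Running a quantized file typically means invoking llama.cpp's GGML-era `main` binary. The sketch below builds such an invocation in Python; the model file name is illustrative (substitute whichever quantization you downloaded), and the flags shown (`-m`, `-p`, `-n`, `--n-gpu-layers`) are standard llama.cpp options for model path, prompt, token budget, and GPU offload:

```python
# Assemble a llama.cpp command line for a GGML quantized file.
# The file name below is an assumption -- replace it with the
# quantization you actually downloaded (q2_K ... q8_0).
model_file = "openorca-platypus2-13b.ggmlv3.q4_K_M.bin"

cmd = [
    "./main",                  # llama.cpp main binary (GGML-era build)
    "-m", model_file,          # path to the quantized model file
    "-p", "### Instruction:\n\nExplain entropy.\n\n### Response:\n",
    "-n", "256",               # maximum tokens to generate
    "--n-gpu-layers", "32",    # offload layers to GPU if built with CUDA
]
print(" ".join(cmd))
```

Note that GGML files require an older llama.cpp build; current releases expect the successor GGUF format.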

  • Multiple quantization options (q2_K through q8_0)
  • Optimized for both CPU and GPU inference
  • Inherits the 4096-token context window of base Llama 2
  • Uses Alpaca-InstructOnly prompt format
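The Alpaca-InstructOnly template can be sketched as a small helper; the exact `### Instruction:` / `### Response:` header strings are assumed from the model card and should be verified against it before use:

```python
def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the Alpaca-InstructOnly template.

    Header strings are taken from the model card's stated prompt
    format; confirm them against the card before relying on this.
    """
    return (
        "### Instruction:\n"
        "\n"
        f"{instruction}\n"
        "\n"
        "### Response:\n"
    )

print(build_prompt("List three prime numbers."))
```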

Core Capabilities

  • Strong performance on MMLU (59.5%), ARC (62.88%), and HellaSwag (83.19%)
  • Enhanced STEM and logical reasoning capabilities
  • Efficient resource usage through various quantization options
  • Supports both CPU and GPU acceleration

Frequently Asked Questions

Q: What makes this model unique?

This model combines the STEM and logical reasoning capabilities of Platypus2 with the general instruction-following abilities of OpenOrcaxOpenChat, all while being optimized for efficient deployment through GGML quantization.

Q: What are the recommended use cases?

The model excels in STEM-related tasks, logical reasoning, and general instruction-following scenarios. Different quantization options allow deployment on various hardware configurations, from resource-constrained environments to high-performance systems.
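
The size/hardware tradeoff can be made concrete with a small helper that picks the largest quantization fitting a RAM budget. Only the two endpoint sizes (q2_K at 5.74GB, q8_0 at 13.83GB) are quoted above; the table should be filled in from the actual download page before serious use:

```python
from typing import Optional

# File sizes in GB from the model card; only the endpoints are quoted
# there -- add the remaining quantizations from the download page.
QUANT_SIZES_GB = {
    "q2_K": 5.74,
    "q8_0": 13.83,
}

def pick_quant(ram_budget_gb: float, headroom_gb: float = 2.0) -> Optional[str]:
    """Return the largest quantization whose file fits in the budget,
    leaving headroom for the KV cache and runtime overhead."""
    usable = ram_budget_gb - headroom_gb
    fitting = {q: s for q, s in QUANT_SIZES_GB.items() if s <= usable}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)

print(pick_quant(16.0))  # "q8_0" -- full-precision-ish quality fits
print(pick_quant(8.0))   # "q2_K" -- constrained hardware
```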
