# OpenOrca-Platypus2-13B-GGML
| Property | Value |
|---|---|
| Model Type | GGML Quantized LLM |
| Base Model | OpenOrca-Platypus2-13B |
| License | Llama 2 |
| Paper | Platypus Paper |
## What is OpenOrca-Platypus2-13B-GGML?

OpenOrca-Platypus2-13B-GGML is a quantized version of OpenOrca-Platypus2-13B in the GGML format, optimized for CPU and GPU inference. The base model is a merge of Platypus2-13B and OpenOrcaxOpenChat-Preview2-13B, combining Platypus2's STEM strengths with OpenOrca's general instruction-following ability.
## Implementation Details

The model is available in multiple quantization formats ranging from 2-bit to 8-bit precision, offering different tradeoffs between model size, output quality, and resource usage. File sizes range from 5.74GB (q2_K) to 13.83GB (q8_0).
- Multiple quantization options (q2_K through q8_0)
- Optimized for both CPU and GPU inference
- Supports the same 4096-token context window as base Llama 2
- Uses Alpaca-InstructOnly prompt format
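The Alpaca-InstructOnly format mentioned above wraps a bare instruction between an `### Instruction:` header and an empty `### Response:` marker that the model completes. A minimal sketch in Python (exact whitespace should be checked against the upstream model card):

```python
def build_prompt(instruction: str) -> str:
    """Format a user instruction in the Alpaca-InstructOnly style:
    an ### Instruction: block followed by an empty ### Response:
    marker that the model fills in."""
    return (
        "### Instruction:\n\n"
        f"{instruction}\n\n"
        "### Response:\n"
    )

prompt = build_prompt("Explain quantization in one sentence.")
```

Generation then proceeds from the end of the returned string, so the model's first token continues the `### Response:` section.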
## Core Capabilities
- Strong performance on MMLU (59.5%), ARC (62.88%), and HellaSwag (83.19%)
- Enhanced STEM and logical reasoning capabilities
- Efficient resource usage through various quantization options
- Supports both CPU and GPU acceleration
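One way to see what these quantization levels mean in practice is to back-calculate the effective bits stored per weight from the published file sizes and the roughly 13 billion parameters of the base model (a rough estimate treating sizes as decimal gigabytes, not an official figure; note that k-quant formats like q2_K mix precisions and store block scales, so the average lands above the nominal 2 bits):

```python
PARAMS = 13e9  # approximate parameter count of the 13B base model

def effective_bits_per_weight(file_size_gb: float, params: float = PARAMS) -> float:
    """Back-calculate the average bits stored per weight from a GGML file size."""
    return file_size_gb * 1e9 * 8 / params

# Published sizes for the two extremes of the quantization range
bpw_q2 = effective_bits_per_weight(5.74)   # q2_K, smallest file
bpw_q8 = effective_bits_per_weight(13.83)  # q8_0, largest file
```

This puts q2_K at roughly 3.5 effective bits per weight and q8_0 at roughly 8.5, which is why file size alone understates how aggressive the low-bit variants are.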
## Frequently Asked Questions

**Q: What makes this model unique?**
This model combines the STEM and logical reasoning capabilities of Platypus2 with the general instruction-following abilities of OpenOrcaxOpenChat, all while being optimized for efficient deployment through GGML quantization.
**Q: What are the recommended use cases?**
The model excels in STEM-related tasks, logical reasoning, and general instruction-following scenarios. Different quantization options allow deployment on various hardware configurations, from resource-constrained environments to high-performance systems.