# OpenOrca-Platypus2-13B-GGML
| Property | Value |
|---|---|
| Model Type | GGML Quantized LLM |
| Base Model | OpenOrca-Platypus2-13B |
| License | Llama 2 |
| Paper | Platypus Paper |
## What is OpenOrca-Platypus2-13B-GGML?

OpenOrca-Platypus2-13B-GGML is a quantized version of OpenOrca-Platypus2-13B in the GGML format, optimized for CPU and GPU inference. The base model is a merge of Platypus2-13B and OpenOrcaxOpenChat-Preview2-13B, combining Platypus2's STEM strengths with OpenOrca's general instruction-following ability.
## Implementation Details

The model is available in multiple quantization formats ranging from 2-bit to 8-bit precision, offering different tradeoffs between model size, output quality, and resource usage. File sizes range from 5.74GB (q2_K) to 13.83GB (q8_0).
- Multiple quantization options (q2_K through q8_0)
- Optimized for both CPU and GPU inference
- Supports the same 4096-token context window as base Llama 2
- Uses Alpaca-InstructOnly prompt format
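The Alpaca-InstructOnly format mentioned above wraps a bare instruction between an `### Instruction:` header and an empty `### Response:` marker that the model completes. A minimal sketch in Python (exact whitespace should be checked against the upstream model card):

```python
def build_prompt(instruction: str) -> str:
    """Format a user instruction in the Alpaca-InstructOnly style:
    an ### Instruction: block followed by an empty ### Response:
    marker that the model fills in."""
    return (
        "### Instruction:\n\n"
        f"{instruction}\n\n"
        "### Response:\n"
    )

prompt = build_prompt("Explain quantization in one sentence.")
```

Generation then proceeds from the end of the returned string, so the model's first token continues the `### Response:` section.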
## Core Capabilities
- Strong performance on MMLU (59.5%), ARC (62.88%), and HellaSwag (83.19%)
- Enhanced STEM and logical reasoning capabilities
- Efficient resource usage through various quantization options
- Supports both CPU and GPU acceleration
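One way to see what these quantization levels mean in practice is to back-calculate the effective bits stored per weight from the published file sizes and the roughly 13 billion parameters of the base model (a rough estimate treating sizes as decimal gigabytes, not an official figure; note that k-quant formats like q2_K mix precisions and store block scales, so the average lands above the nominal 2 bits):

```python
PARAMS = 13e9  # approximate parameter count of the 13B base model

def effective_bits_per_weight(file_size_gb: float, params: float = PARAMS) -> float:
    """Back-calculate the average bits stored per weight from a GGML file size."""
    return file_size_gb * 1e9 * 8 / params

# Published sizes for the two extremes of the quantization range
bpw_q2 = effective_bits_per_weight(5.74)   # q2_K, smallest file
bpw_q8 = effective_bits_per_weight(13.83)  # q8_0, largest file
```

This puts q2_K at roughly 3.5 effective bits per weight and q8_0 at roughly 8.5, which is why file size alone understates how aggressive the low-bit variants are.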
## Frequently Asked Questions

**Q: What makes this model unique?**
This model combines the STEM and logical reasoning capabilities of Platypus2 with the general instruction-following abilities of OpenOrcaxOpenChat, all while being optimized for efficient deployment through GGML quantization.
**Q: What are the recommended use cases?**
The model excels in STEM-related tasks, logical reasoning, and general instruction-following scenarios. Different quantization options allow deployment on various hardware configurations, from resource-constrained environments to high-performance systems.