# OLMo-2-1124-13B-Instruct-GGUF
| Property | Value |
|---|---|
| Parameter Count | 13.7B |
| License | Apache 2.0 |
| Format | GGUF |
| Language | English |
## What is OLMo-2-1124-13B-Instruct-GGUF?
OLMo-2-1124-13B-Instruct-GGUF is a comprehensive set of quantized versions of Allen AI's OLMo 2 13B Instruct model, packaged in the GGUF format for efficient deployment. It offers multiple quantization options ranging from full F16 precision (27.44GB) down to the highly compressed IQ2_S variant (4.59GB), making it adaptable to a wide range of hardware configurations and performance requirements.
## Implementation Details
The model uses a specialized prompt format, `<|endoftext|><|system|>{system_prompt}<|user|>{prompt}<|assistant|>`, and offers multiple quantization levels generated with llama.cpp's imatrix option. Each variant balances model size against output quality, with specific optimizations for different hardware architectures, including ARM and AVX CPU inference.
- Multiple quantization options from Q8_0 to IQ2_S
- Specialized versions for ARM processors with different optimization levels
- Embed/output weights variants for enhanced performance
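The prompt template above can be assembled with a small helper. This is a minimal sketch: the `build_olmo_prompt` function name is illustrative, and exact whitespace handling may differ from what a chat-aware runtime (e.g. llama.cpp's chat template support) inserts automatically:

```python
def build_olmo_prompt(system_prompt: str, user_prompt: str) -> str:
    """Assemble the OLMo 2 Instruct prompt string from the template above.

    Note: the template is taken verbatim from this card; whitespace and
    newline placement around the special tokens is an assumption.
    """
    return (
        f"<|endoftext|><|system|>{system_prompt}"
        f"<|user|>{user_prompt}<|assistant|>"
    )


# Example usage: the assembled string is passed as the raw prompt to a
# GGUF runtime such as llama.cpp.
prompt = build_olmo_prompt("You are a helpful assistant.", "What is GGUF?")
```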
## Core Capabilities
- High-quality text generation with configurable precision levels
- Optimized performance on various hardware configurations
- Support for both CPU and GPU inference
- Flexible deployment options based on available system resources
## Frequently Asked Questions
**Q: What makes this model unique?**
The model offers an extensive range of quantization options with specific optimizations for different hardware architectures, making it highly versatile for various deployment scenarios. The implementation includes special considerations for embed/output weights and ARM-specific optimizations.
**Q: What are the recommended use cases?**
For users prioritizing quality, the Q6_K_L (11.51GB) variant is recommended. For balanced performance, Q4_K_M (8.35GB) is suggested as the default choice. For systems with limited resources, IQ3_XS (5.80GB) offers a good compromise between size and performance.