MythoMax-L2-13B-GGML

TheBloke

MythoMax-L2-13B-GGML is a GGML-quantized variant of MythoMax L2 13B, optimized for CPU+GPU inference with various quantization options from 2-bit to 8-bit precision.

  • Base Model: Gryphe/MythoMax-L2-13b
  • Model Type: LLaMA architecture
  • License: LLaMA 2
  • Quantization Options: 2-bit to 8-bit

What is MythoMax-L2-13B-GGML?

MythoMax-L2-13B-GGML is a quantized version of the MythoMax L2 13B model, specifically optimized for CPU and GPU inference using the GGML format. It is a merge of the MythoLogic-L2 and Huginn models, built with an experimental tensor-type merge technique to improve performance in both roleplay and story-writing tasks.

Implementation Details

The model offers multiple quantization levels ranging from 2-bit to 8-bit precision, with file sizes varying from 5.51GB to 13.79GB. Each quantization level provides different trade-offs between model size, RAM usage, and inference quality. The implementation uses various k-quant methods for optimal performance across different use cases.

  • Supports multiple quantization formats (q2_K through q8_0)
  • Optimized tensor distribution for enhanced coherency
  • Customized prompt template for optimal interaction
  • Compatible with llama.cpp and various UI implementations

Core Capabilities

  • Advanced roleplay and character interaction
  • High-quality story writing and narrative generation
  • Efficient CPU+GPU inference with various RAM/VRAM configurations
  • Support for context lengths up to 4096 tokens with RoPE scaling
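For readers unfamiliar with RoPE scaling, the sketch below illustrates the generic linear-scaling idea (compressing position indices so a longer context maps onto the trained position range, as exposed by llama.cpp's rope-scaling options). The dimensions, scale factor, and function names are illustrative, not the model's actual configuration.

```python
def rope_frequencies(dim: int, base: float = 10000.0) -> list[float]:
    """Per-pair rotary frequencies: theta_i = base^(-2i/dim)."""
    return [base ** (-2 * i / dim) for i in range(dim // 2)]

def rope_angle(pos: int, freq: float, scale: float = 1.0) -> float:
    """Linear RoPE scaling: divide the position by `scale` so that a
    longer context reuses the position range the model was trained on."""
    return (pos / scale) * freq

freqs = rope_frequencies(dim=8)
# With scale = 2, position 8190 yields the same rotation angle as
# position 4095 unscaled, so an 8k context fits a 4k-trained range.
assert rope_angle(8190, freqs[0], scale=2.0) == rope_angle(4095, freqs[0])
```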

Frequently Asked Questions

Q: What makes this model unique?

The model uses a tensor-type merge technique in which each of 363 tensors receives its own merge ratio, resulting in strong performance in both comprehension and generation tasks. It effectively combines MythoLogic-L2's robust understanding with Huginn's writing capabilities.
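The per-tensor idea can be sketched as follows: instead of one global blend ratio, each tensor is interpolated with its own ratio. The tensor names, toy 1-D weights, and ratios below are hypothetical; the actual MythoMax merge applies hand-tuned ratios across 363 real weight tensors.

```python
def merge_tensors(model_a: dict, model_b: dict, ratios: dict) -> dict:
    """Linearly interpolate matching tensors: r*A + (1-r)*B, per tensor."""
    merged = {}
    for name, a in model_a.items():
        b = model_b[name]
        r = ratios[name]  # this tensor's individual merge ratio
        merged[name] = [r * x + (1 - r) * y for x, y in zip(a, b)]
    return merged

# Toy 1-D "tensors" standing in for real weight matrices.
a = {"attn.0": [1.0, 1.0], "mlp.0": [0.0, 0.0]}
b = {"attn.0": [0.0, 0.0], "mlp.0": [1.0, 1.0]}
ratios = {"attn.0": 0.9, "mlp.0": 0.3}  # favor A's attention, B's MLP
out = merge_tensors(a, b, ratios)
```

Varying the ratio by tensor type (attention vs. MLP, early vs. late layers) is what lets a merge keep one parent's comprehension while inheriting the other's writing style.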

Q: What are the recommended use cases?

The model excels in roleplay scenarios and creative writing tasks. It's particularly well-suited for applications requiring both strong comprehension and coherent output generation, with various quantization options allowing deployment across different hardware configurations.
