# MPT-7B-Storywriter-GGML
| Property | Value |
|---|---|
| Parameter Count | 6.7B |
| License | Apache 2.0 |
| Context Length | 65,536 tokens |
| Architecture | Modified decoder-only transformer |
| Base Model | MPT-7B |
## What is MPT-7B-Storywriter-GGML?
MPT-7B-Storywriter-GGML is a GGML-quantized version of MosaicML's story-focused language model, optimized for CPU inference. It is designed for reading and writing fictional stories with exceptionally long contexts: it was trained with a 65,536-token context length and, thanks to ALiBi, can extrapolate beyond it. The model has been converted to several quantization levels (4-bit, 5-bit, and 8-bit) to accommodate different hardware capabilities and performance requirements.
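The relationship between quantization level and file size can be sketched with back-of-envelope arithmetic. The block layouts below (20 bytes per 32-weight block for q4_0, 36 bytes for q8_0) are assumptions based on early GGML formats, not figures from this card; real files also contain some non-quantized tensors, so exact sizes differ slightly.

```python
# Rough file-size estimate for GGML block quantization.
# Assumed layouts (early GGML): q4_0 stores 32 weights as 32 x 4-bit
# values (16 bytes) plus one fp32 scale (4 bytes) = 20 bytes/block;
# q8_0 stores 32 int8 values (32 bytes) plus one fp32 scale = 36 bytes/block.

def ggml_size_gb(n_params: float, bytes_per_block: int, block_size: int = 32) -> float:
    """Approximate quantized file size in GB (1 GB = 1e9 bytes)."""
    bits_per_weight = bytes_per_block * 8 / block_size
    return n_params * bits_per_weight / 8 / 1e9

N = 6.7e9  # MPT-7B parameter count
print(round(ggml_size_gb(N, 20), 2))  # q4_0: 5 bits/weight -> ~4.19 GB
print(round(ggml_size_gb(N, 36), 2))  # q8_0: 9 bits/weight -> ~7.54 GB
```

Both estimates land close to the 4.21GB and 7.58GB endpoints listed for the released files, which suggests the per-block overhead dominates the size difference between quantization levels.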
## Implementation Details
The model incorporates several notable architectural features: FlashAttention for efficient attention computation, ALiBi (Attention with Linear Biases) in place of learned positional embeddings, and modifications such as QK LayerNorm. It uses the EleutherAI/gpt-neox-20b tokenizer and has been fine-tuned on a curated fiction subset of the books3 dataset.
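ALiBi is what enables the long-context behavior: instead of positional embeddings, it adds a per-head linear penalty proportional to the query-key distance, which extrapolates naturally past the training length. A minimal sketch of the standard scheme (for head counts that are powers of two, as with MPT-7B's 32 heads):

```python
def alibi_slopes(n_heads: int) -> list[float]:
    """Per-head ALiBi slopes: a geometric sequence starting at 2^(-8/n_heads).
    This closed form applies when n_heads is a power of two."""
    start = 2 ** (-8 / n_heads)
    return [start ** (i + 1) for i in range(n_heads)]

def alibi_bias(slope: float, seq_len: int) -> list[list[float]]:
    """Additive attention bias for one head: -slope * (i - j) for each
    query position i and key position j <= i (causal attention)."""
    return [[-slope * (i - j) for j in range(i + 1)] for i in range(seq_len)]

slopes = alibi_slopes(32)        # 32 heads, as in MPT-7B
print(alibi_bias(slopes[0], 3))  # small causal bias triangle for head 0
```

Because the bias is a fixed linear function of distance, nothing in it is tied to a maximum sequence length, which is why the model can be run past its 65k training context.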
- Multiple quantization options ranging from 4.21GB to 7.58GB file sizes
- Compatible with KoboldCpp, ctransformers, GPT4All-UI, and rustformers' llm
- Supports extrapolation beyond training context length through ALiBi
- 32 layers with 32 attention heads and 4096 dimensional embeddings
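The listed shape (32 layers, d_model 4096) can be sanity-checked against the 6.7B parameter count. The sketch below assumes details not stated in this card: a 4x MLP expansion, no bias terms, and a padded vocabulary of ~50,432 (the gpt-neox-20b tokenizer's vocabulary rounded up).

```python
# Rough parameter count from the listed architecture.
# Assumptions (hypothetical, not from the card): 4x MLP expansion,
# no biases, tied input/output embeddings, vocab padded to 50432.
d_model, n_layers, vocab = 4096, 32, 50432

embed = vocab * d_model            # token embeddings (shared with output head)
attn = 4 * d_model * d_model       # Wq, Wk, Wv, Wo projections
mlp = 2 * d_model * (4 * d_model)  # up- and down-projection
total = embed + n_layers * (attn + mlp)

print(f"{total / 1e9:.2f}B")  # ~6.65B, within ~1% of the quoted 6.7B
```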
## Core Capabilities
- Long-form story generation with coherent narratives
- Demonstrated context extrapolation up to 84k tokens, beyond the 65k training length
- Efficient CPU inference with various quantization options
- Creative writing and story continuation
- Memory-efficient operation with different RAM requirements based on quantization
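One reason RAM requirements vary so much in practice is the KV cache, which grows linearly with context length. A back-of-envelope estimate, assuming a half-precision (2-byte) cache (the actual runtime may use fp32, doubling these figures):

```python
def kv_cache_gb(seq_len: int, n_layers: int = 32, d_model: int = 4096,
                bytes_per_val: int = 2) -> float:
    """Approximate KV-cache size: two tensors (K and V), each of shape
    [n_layers, seq_len, d_model], at bytes_per_val bytes per value."""
    return 2 * n_layers * seq_len * d_model * bytes_per_val / 1e9

print(round(kv_cache_gb(2048), 2))   # modest 2k context: ~1.07 GB
print(round(kv_cache_gb(65536), 2))  # full 65k context: ~34.36 GB
```

At the full 65k context the cache can dwarf the quantized weights themselves, so the model file size alone understates the RAM needed for very long stories.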
## Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its extraordinary context length capability and specific optimization for story writing tasks, combined with efficient GGML quantization for CPU deployment.
Q: What are the recommended use cases?
The model excels at creative writing tasks, story continuation, and handling long-form narrative content. It's particularly suitable for applications requiring extended context understanding and generation.