MPT-7B-StoryWriter-4bit-128g
| Property | Value |
|---|---|
| Parameter Count | 6.7B (the packed 4-bit checkpoint reports ~1.07B) |
| License | Apache-2.0 |
| Context Length | 65k+ tokens |
| Architecture | Modified decoder-only transformer |
| Paper | ALiBi ("Train Short, Test Long", Press et al.) |
What is mpt-7b-storywriter-4bit-128g?
This is a 4-bit quantized build of the MPT-7B-StoryWriter model, packaged specifically for use with KoboldAI. The underlying model is designed for reading and writing fictional stories with very long contexts: it can handle up to 65,536 tokens and, using ALiBi, can potentially extrapolate beyond that length at inference time.
Implementation Details
The model uses a modified decoder-only transformer architecture with several key optimizations: FlashAttention for faster attention computation, ALiBi (Attention with Linear Biases) in place of learned positional embeddings, and no bias terms in its linear layers. The weights have additionally been quantized to 4-bit precision, which sharply reduces memory requirements while largely preserving generation quality.
- 32 layers with 32 attention heads
- Model dimension (d_model) of 4096
- Vocabulary size of 50,432
- Supports sequence lengths up to 65,536 tokens
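As a rough illustration of how these settings surface in code, the sketch below loads the original, unquantized `mosaicml/mpt-7b-storywriter` checkpoint with Hugging Face `transformers`. The 4-bit 128g build described here is intended to be loaded through KoboldAI's own tooling, so treat this only as a sketch of the architecture's configuration, not as the canonical way to run the quantized model.

```python
# Sketch: loading the base (unquantized) MPT-7B-StoryWriter checkpoint with
# Hugging Face transformers. The 4-bit KoboldAI build is loaded through
# KoboldAI itself; this only illustrates the settings listed above.
import torch
import transformers

name = "mosaicml/mpt-7b-storywriter"

# The MPT config exposes max_seq_len; because positions are handled by ALiBi,
# it can be raised beyond the 65,536 tokens used in training (quality may
# degrade the further you extrapolate).
config = transformers.AutoConfig.from_pretrained(name, trust_remote_code=True)
config.max_seq_len = 65536  # or higher, e.g. 83968, to extrapolate

model = transformers.AutoModelForCausalLM.from_pretrained(
    name,
    config=config,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
# MPT models reuse the GPT-NeoX-20B tokenizer.
tokenizer = transformers.AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
```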
Core Capabilities
- Long-form story generation and continuation (see the sketch after this list)
- Extended context understanding (65k+ tokens)
- Memory-efficient operation through 4-bit quantization
- Commercial usage permitted under Apache-2.0 license
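As a quick illustration of long-form continuation, the sketch below builds on the loading example above (it reuses that `model` and `tokenizer`); the prompt file name is hypothetical.

```python
# Sketch: continuing a long story prompt with the model/tokenizer loaded in
# the previous snippet. With ALiBi the prompt can be far longer than typical
# 2k/4k-context models allow, subject to available memory.
import torch

prompt = open("my_story_so_far.txt").read()  # hypothetical long prompt file

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.8,
        top_p=0.95,
    )

# Decode only the newly generated tokens, not the echoed prompt.
continuation = tokenizer.decode(
    output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(continuation)
```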
Frequently Asked Questions
Q: What makes this model unique?
The model's ability to handle extremely long contexts (65k+ tokens) through ALiBi, combined with 4-bit quantization for efficient deployment, makes it particularly well suited to long-form story generation. It can maintain narrative consistency across very long sequences, as demonstrated by its ability to ingest the full text of "The Great Gatsby" and generate a coherent continuation.
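For readers curious how ALiBi makes this possible, the snippet below sketches the core idea from the cited paper: each attention head adds a fixed, head-specific linear penalty based on the distance between query and key positions, so no learned positional embedding is needed, and longer sequences at inference simply extend the same ramp. This is a simplified illustration, not the model's actual implementation.

```python
import torch

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    """Simplified ALiBi bias: one slope per head, linear distance penalty.

    Returns a (n_heads, seq_len, seq_len) tensor that is added to attention
    scores before softmax. Slopes follow the geometric sequence from the
    ALiBi paper (for 8 heads: 1/2, 1/4, ..., 1/256)."""
    slopes = torch.tensor([2.0 ** (-8.0 * (h + 1) / n_heads) for h in range(n_heads)])
    pos = torch.arange(seq_len)
    # Distance i - j between query position i and key position j
    # (clamped to >= 0, matching a causal mask).
    distance = (pos[:, None] - pos[None, :]).clamp(min=0)
    # Per-head linear penalty on distance; no learned positional embedding.
    return -slopes[:, None, None] * distance[None, :, :]

# Longer sequences at inference just extend the same linear ramp, which is
# why ALiBi models can extrapolate past their training length.
bias = alibi_bias(n_heads=32, seq_len=8)  # shape (32, 8, 8)
```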
Q: What are the recommended use cases?
This model is optimized for creative writing, particularly long-form fiction. It excels at understanding and generating narrative content, making it well suited to story continuation, writing assistance, and other extended generation tasks.