RWKV 7B World Novel 128k
| Property | Value |
|---|---|
| Model Size | 7B parameters |
| Context Length | 128,000 tokens |
| License | Apache-2.0 |
| Training Hardware | 4x A800 GPUs |
| Training Duration | 40 hours |
What is rwkv-7B-world-novel-128k?
The rwkv-7B-world-novel-128k is the first 128k-context model built on the RWKV architecture. Released on August 10, 2023, it specializes in novel writing and handles multiple languages efficiently through its RWKV World tokenizer.
Implementation Details
The model was trained on a mixed dataset of instruction data, Chinese web novels, and traditional wuxia literature. Training covered 1.3B tokens on 4 A800 GPUs over 40 hours. The model runs in roughly 16GB of VRAM in FP16 mode or roughly 8GB in FP16i8 (int8-quantized) mode.
- Implements the RWKV World tokenizer for an approximately 1:1 word-to-token ratio across languages
- Supports a 128k context window for long-form content processing
- Trained using the TrainChatGalRWKV repository
- Optimized for both creative and precise outputs through temperature adjustment
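As a rough illustration of the FP16 vs. FP16i8 trade-off mentioned above, the sketch below loads the weights with the community `rwkv` pip package. The checkpoint filename is an assumption (use your local path), and the strategy strings follow that package's conventions rather than anything specific to this model.

```python
# Minimal loading sketch using the `rwkv` pip package (pip install rwkv).
# The checkpoint name below is illustrative, not the official filename.
from rwkv.model import RWKV

# FP16 weights on GPU: needs roughly 16GB of VRAM for the 7B model.
model = RWKV(model="RWKV-7B-world-novel-128k", strategy="cuda fp16")

# Int8-quantized weights ("fp16i8"): fits in roughly 8GB of VRAM instead.
# model = RWKV(model="RWKV-7B-world-novel-128k", strategy="cuda fp16i8")
```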
Core Capabilities
- Long-context summarization (tested up to 85k tokens)
- Multi-language support with efficient tokenization
- Novel and creative writing generation
- Flexible temperature settings (0.1-0.2 for precise answers, 1-2.x for creative content)
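To make the temperature guidance concrete, here is a hedged generation sketch using the `rwkv` package's PIPELINE helper with the standard World-vocabulary tokenizer; the checkpoint path and prompt are illustrative, and the sampling values simply mirror the ranges listed above.

```python
import os
os.environ["RWKV_JIT_ON"] = "1"  # enable the package's JIT kernels (set before import)

from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# Checkpoint path is illustrative; see the loading sketch above for strategy options.
model = RWKV(model="RWKV-7B-world-novel-128k", strategy="cuda fp16")
pipeline = PIPELINE(model, "rwkv_vocab_v20230424")  # RWKV World tokenizer vocabulary

# Low temperature (~0.1-0.2) for precise, factual answers.
precise_args = PIPELINE_ARGS(temperature=0.2, top_p=0.5)

# Higher temperature (1.0-2.x) for creative, novel-style continuations.
creative_args = PIPELINE_ARGS(temperature=1.5, top_p=0.7)

prompt = "Write the opening scene of a wuxia novel set on a rain-soaked mountain pass."
print(pipeline.generate(prompt, token_count=500, args=creative_args))
```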
Frequently Asked Questions
Q: What makes this model unique?
This model stands out as the first RWKV-architecture model to reach a 128k context window, combined with an efficient, roughly 1:1 tokenization ratio across multiple languages. That combination makes it well suited to generating and processing long-form content.
Q: What are the recommended use cases?
The model excels at novel writing, long-form summarization, and other creative writing tasks, and it is particularly effective for projects that require extensive context understanding and multi-language support.