RWKV 7B World Novel 128k
| Property | Value |
|---|---|
| Model Size | 7B parameters |
| Context Length | 128,000 tokens |
| License | Apache-2.0 |
| Training Hardware | 4x A800 GPUs |
| Training Duration | 40 hours |
What is rwkv-7B-world-novel-128k?
The rwkv-7B-world-novel-128k is the first 128k-context model built on the RWKV architecture. Released on August 10, 2023, it specializes in novel writing and handles multiple languages efficiently through its RWKV World tokenizer.
Implementation Details
The model was trained on a mixed dataset of instruction data, Chinese web novels, and traditional wuxia literature. Training covered 1.3B tokens on 4 A800 GPUs over 40 hours. The model runs in roughly 16GB of VRAM in FP16 mode or roughly 8GB in FP16i8 (int8-quantized) mode.
- Implements the RWKV World tokenizer for an approximately 1:1 word-to-token ratio across languages
- Supports a 128k context window for long-form content processing
- Trained using the TrainChatGalRWKV repository
- Optimized for both creative and precise outputs through temperature adjustment
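As a rough illustration of the FP16 vs. FP16i8 trade-off mentioned above, the sketch below loads the weights with the community `rwkv` pip package. The checkpoint filename is an assumption (use your local path), and the strategy strings follow that package's conventions rather than anything specific to this model.

```python
# Minimal loading sketch using the `rwkv` pip package (pip install rwkv).
# The checkpoint name below is illustrative, not the official filename.
from rwkv.model import RWKV

# FP16 weights on GPU: needs roughly 16GB of VRAM for the 7B model.
model = RWKV(model="RWKV-7B-world-novel-128k", strategy="cuda fp16")

# Int8-quantized weights ("fp16i8"): fits in roughly 8GB of VRAM instead.
# model = RWKV(model="RWKV-7B-world-novel-128k", strategy="cuda fp16i8")
```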
Core Capabilities
- Long-context summarization (tested up to 85k tokens)
- Multi-language support with efficient tokenization
- Novel and creative writing generation
- Flexible temperature settings (0.1-0.2 for precise answers, 1-2.x for creative content)
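To make the temperature guidance concrete, here is a hedged generation sketch using the `rwkv` package's PIPELINE helper with the standard World-vocabulary tokenizer; the checkpoint path and prompt are illustrative, and the sampling values simply mirror the ranges listed above.

```python
import os
os.environ["RWKV_JIT_ON"] = "1"  # enable the package's JIT kernels (set before import)

from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# Checkpoint path is illustrative; see the loading sketch above for strategy options.
model = RWKV(model="RWKV-7B-world-novel-128k", strategy="cuda fp16")
pipeline = PIPELINE(model, "rwkv_vocab_v20230424")  # RWKV World tokenizer vocabulary

# Low temperature (~0.1-0.2) for precise, factual answers.
precise_args = PIPELINE_ARGS(temperature=0.2, top_p=0.5)

# Higher temperature (1.0-2.x) for creative, novel-style continuations.
creative_args = PIPELINE_ARGS(temperature=1.5, top_p=0.7)

prompt = "Write the opening scene of a wuxia novel set on a rain-soaked mountain pass."
print(pipeline.generate(prompt, token_count=500, args=creative_args))
```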
Frequently Asked Questions
Q: What makes this model unique?
This model stands out as the first RWKV-architecture model to reach a 128k context window, combined with an efficient, roughly 1:1 tokenization ratio across multiple languages. That combination makes it well suited to generating and processing long-form content.
Q: What are the recommended use cases?
The model excels at novel writing, long-form summarization, and other creative writing tasks, and it is particularly effective for projects that require extensive context understanding and multi-language support.