# RWKV-6 World
| Property | Value |
|---|---|
| License | Apache 2.0 |
| Research Paper | arXiv:2404.05892 |
| Supported Languages | 12 (including English, Chinese, French, Spanish, etc.) |
| Training Data Size | 1.42T tokens (v2.1) |
## What is RWKV-6 World?
RWKV-6 World is a language model trained on a diverse collection of datasets including SlimPajama, The Pile, StarCoder, and OSCAR. It is a significant evolution of the RWKV architecture; the 7B-parameter v3 model scores 54.2% on the MMLU benchmark.
## Implementation Details
The model is implemented in PyTorch and requires the `rwkv` pip package (version 0.8.24 or later) for inference. Its training data is composed of roughly 70% English, 15% multilingual content, and 15% code, which makes it versatile across chat, QA, and coding applications.
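As a rough illustration (assuming the 70/15/15 split applies to the full 1.42T-token v2.1 corpus, which the card does not state explicitly), the approximate token budget per slice works out as follows:

```python
TOTAL_TOKENS = 1.42e12  # v2.1 training corpus size from the table above

# Stated training composition of RWKV-6 World
composition = {"english": 0.70, "multilingual": 0.15, "code": 0.15}

# Approximate token count per slice (assumption: split applies to the whole corpus)
tokens = {name: share * TOTAL_TOKENS for name, share in composition.items()}
for name, count in tokens.items():
    print(f"{name:12s} ~{count / 1e12:.3f}T tokens")
```

This yields roughly 0.994T English tokens and about 0.213T tokens each for the multilingual and code slices.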
- Trained on multiple high-quality datasets, including Wikipedia and ChatGPT data
- Supports 12 languages, including major Asian and European languages
- Uses the specialized `rwkv_vocab_v20230424` tokenizer
- Optimized for both chat and question-answering (QA) tasks
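A minimal inference sketch using the `rwkv` pip package (version 0.8.24 or later), not a definitive implementation. The checkpoint filename, strategy string, and sampling settings below are assumptions; substitute the actual downloaded weights. Imports are deferred inside the function so the sketch reads cleanly even without the package installed.

```python
def generate_reply(prompt: str, model_path: str = "RWKV-x060-World-7B.pth") -> str:
    """Load an RWKV-6 World checkpoint and generate a completion.

    `model_path` is a placeholder -- point it at a real checkpoint file.
    Requires: pip install "rwkv>=0.8.24" torch
    """
    import os
    os.environ.setdefault("RWKV_JIT_ON", "1")   # enable TorchScript JIT
    os.environ.setdefault("RWKV_CUDA_ON", "0")  # "1" to build the CUDA kernel

    from rwkv.model import RWKV
    from rwkv.utils import PIPELINE, PIPELINE_ARGS

    model = RWKV(model=model_path, strategy="cuda fp16")  # or "cpu fp32"
    pipeline = PIPELINE(model, "rwkv_vocab_v20230424")    # the World tokenizer
    args = PIPELINE_ARGS(temperature=1.0, top_p=0.7)
    return pipeline.generate(prompt, token_count=128, args=args)
```

The `strategy` string controls device placement and precision; `"cpu fp32"` trades speed for broad compatibility, while `"cuda fp16"` needs a GPU with enough memory for the 7B weights.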
## Core Capabilities
- Multi-language text generation and understanding
- Code generation and analysis
- Question-answering capabilities
- Chat-based interactions with formatted prompting
- High MMLU performance (54.2% for the 7B v3 model)
## Frequently Asked Questions
Q: What makes this model unique?
RWKV-6 World stands out for an efficient architecture that combines transformer-level quality with RNN-like inference characteristics, while supporting 12 languages and performing strongly on standard benchmarks.
Q: What are the recommended use cases?
The model is well-suited to chat applications, question-answering systems, and code generation. Using the recommended prompt formats for chat and QA scenarios improves results in both general conversation and specialized tasks.
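The prompt formats mentioned above can be sketched as small helpers. The `User:`/`Assistant:` and `Question:`/`Answer:` templates below follow the conventions commonly documented for RWKV World models; verify them against the model repository before relying on them.

```python
def chat_prompt(user_message: str) -> str:
    """Chat-style prompt commonly used with RWKV World models."""
    return f"User: {user_message.strip()}\n\nAssistant:"


def qa_prompt(question: str) -> str:
    """Question-answering prompt commonly used with RWKV World models."""
    return f"Question: {question.strip()}\n\nAnswer:"


print(chat_prompt("What is RWKV?"))
```

The formatted string is then passed to the generation pipeline, which completes the text after `Assistant:` or `Answer:`.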