WHAM (World and Human Action Model)

Property	Value
Developer	Microsoft Research
Parameters	200M and 1.6B versions
Training Data	500,000 Bleeding Edge games
License	Microsoft Research License
Architecture	Decoder-only transformer with VQ-GAN

What is WHAM?

WHAM is a generative AI model developed by Microsoft Research's Game Intelligence group in collaboration with TaiX and Ninja Theory. It generates gameplay sequences for the game Bleeding Edge, producing both the visual frames and the controller actions. The generated sequences stay consistent over time and reflect the 3D structure of the game environment.

Implementation Details

The model architecture consists of two main components: an encoder-decoder VQ-GAN for handling game visuals and a transformer backbone for next-token prediction. It was trained on approximately 500,000 Bleeding Edge games, equivalent to over 7 years of continuous human gameplay, using 98 H100 GPUs over 5 days.
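To make the role of the VQ-GAN more concrete, the following is a minimal sketch of the vector-quantization step that such a model relies on: each latent vector from the image encoder is replaced by the index of its nearest codebook entry, and those indices are the discrete tokens a transformer can predict. This is not WHAM's actual code. The function names `quantize` and `dequantize`, the toy 2-dimensional codebook, and the use of squared Euclidean distance are assumptions chosen for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(latents: Sequence[Vector], codebook: Sequence[Vector]) -> List[int]:
    """Map each latent vector to the index of its nearest codebook entry.

    The returned indices play the role of discrete image tokens.
    """
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(v, codebook[k]))
        for v in latents
    ]


def dequantize(indices: Sequence[int], codebook: Sequence[Vector]) -> List[List[float]]:
    """Look the token indices back up in the codebook (input to a decoder)."""
    return [list(codebook[i]) for i in indices]


# Toy example: a 3-entry codebook and three encoder outputs (all hypothetical).
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
latents = [[0.1, -0.2], [3.5, 4.2], [0.9, 1.3]]

tokens = quantize(latents, codebook)
print(tokens)                        # [0, 2, 1]
print(dequantize(tokens, codebook))  # [[0.0, 0.0], [4.0, 4.0], [1.0, 1.0]]
```

In a real VQ-GAN the latents come from a learned convolutional encoder and the codebook is trained jointly with the decoder; the lookup itself is the only part shown here.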

Context length: 10 observation-action pairs (5,560 tokens)
Image resolution: 300 × 180 pixels
Training data: 1 billion observation-action pairs at 10 Hz
Available versions: 200M parameters (3.7 GB) and 1.6B parameters (18.9 GB)
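A few derived quantities follow directly from the figures quoted in this section. The short calculation below uses only those figures; the GPU-hour number is an upper bound that assumes all 98 GPUs were in use for the full 5 days, which the source does not state explicitly.

```python
# Figures quoted in this section.
CONTEXT_PAIRS = 10      # observation-action pairs per context window
CONTEXT_TOKENS = 5560   # tokens per context window
SAMPLE_RATE_HZ = 10     # observation-action pairs per second of gameplay
NUM_GPUS = 98           # H100 GPUs used for training
TRAINING_DAYS = 5       # wall-clock training time in days

# Tokens spent on one observation-action pair (image tokens plus action tokens).
tokens_per_pair = CONTEXT_TOKENS // CONTEXT_PAIRS

# Seconds of gameplay covered by one full context window.
context_seconds = CONTEXT_PAIRS / SAMPLE_RATE_HZ

# Upper bound on compute, assuming every GPU ran for the whole period.
gpu_hours = NUM_GPUS * TRAINING_DAYS * 24

print(tokens_per_pair)   # 556
print(context_seconds)   # 1.0
print(gpu_hours)         # 11760
```

In other words, the model conditions on about one second of gameplay at a time, and each step in that second costs 556 tokens.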

Core Capabilities

World Modeling: Predicts visuals based on starting state and action sequence
Behavior Policy: Generates controller actions based on visual input
Full Generation: Creates both visuals and controller actions simultaneously
Consistent and persistent game sequence generation
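The three generation modes listed above differ only in which parts of the interleaved observation-action sequence are supplied by the caller and which are predicted. The sketch below illustrates that idea with a stand-in predictor. It is not WHAM's API: `Mode`, `rollout`, `toy_predict`, and the string "tokens" are all hypothetical names invented for this example, and a real model would sample token blocks from a transformer instead.

```python
from enum import Enum
from typing import Callable, List, Optional, Sequence, Tuple

Token = Tuple[str, str]  # (kind, value), where kind is "obs" or "act"
Predictor = Callable[[List[Token], str], str]


class Mode(Enum):
    WORLD_MODEL = "world_model"  # actions given, visuals predicted
    POLICY = "policy"            # visuals given, actions predicted
    FULL = "full"                # both visuals and actions predicted


def rollout(
    mode: Mode,
    start_obs: str,
    n_steps: int,
    predict: Predictor,
    actions: Optional[Sequence[str]] = None,
    observations: Optional[Sequence[str]] = None,
) -> List[Token]:
    """Build an interleaved obs/act sequence, predicting whatever is not given."""
    if mode is Mode.WORLD_MODEL and (actions is None or len(actions) < n_steps):
        raise ValueError("world-model mode needs one action per step")
    if mode is Mode.POLICY and (observations is None or len(observations) < n_steps):
        raise ValueError("policy mode needs one observation per step")

    seq: List[Token] = [("obs", start_obs)]
    for t in range(n_steps):
        # Action for this step: supplied in world-model mode, predicted otherwise.
        act = actions[t] if mode is Mode.WORLD_MODEL else predict(seq, "act")
        seq.append(("act", act))
        # Next observation: supplied in policy mode, predicted otherwise.
        obs = observations[t] if mode is Mode.POLICY else predict(seq, "obs")
        seq.append(("obs", obs))
    return seq


def toy_predict(seq: List[Token], kind: str) -> str:
    """Stand-in for a model call; labels each prediction with its position."""
    return f"predicted_{kind}_{len(seq)}"


wm = rollout(Mode.WORLD_MODEL, "frame0", 2, toy_predict, actions=["jump", "left"])
print(wm)
# [('obs', 'frame0'), ('act', 'jump'), ('obs', 'predicted_obs_2'),
#  ('act', 'left'), ('obs', 'predicted_obs_4')]
```

Calling `rollout` with `Mode.POLICY` and a list of observations fills in only the action slots, and `Mode.FULL` predicts every slot after the starting observation.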

Frequently Asked Questions

Q: What makes this model unique?

WHAM generates both the visuals and the controller actions of a gameplay sequence, while keeping the sequence consistent and physically plausible within the game environment. It is one of the first models to demonstrate effective world modeling for complex 3D game environments.

Q: What are the recommended use cases?

The model is intended for academic research and for exploration in game development. Within the context of Bleeding Edge, it is particularly useful for studying gameplay patterns, testing game scenarios, and supporting creative iteration.
