wham

Maintained By
microsoft

WHAM (World and Human Action Model)

Property | Value
Developer | Microsoft Research
Parameters | 200M and 1.6B versions
Training Data | 500,000 Bleeding Edge games
License | Microsoft Research License
Architecture | Decoder-only transformer with VQ-GAN

What is WHAM?

WHAM is a generative model developed by the Game Intelligence group at Microsoft Research in collaboration with TaiX and Ninja Theory. It generates gameplay sequences for the game Bleeding Edge, producing both the visuals and the controller actions. The generated sequences stay consistent over time and reflect the 3D structure of the game environment.

Implementation Details

The model architecture consists of two main components: an encoder-decoder VQ-GAN for handling game visuals and a transformer backbone for next-token prediction. It was trained on approximately 500,000 Bleeding Edge games, equivalent to over 7 years of continuous human gameplay, using 98 H100 GPUs over 5 days.
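To make the role of the VQ-GAN more concrete, here is a minimal sketch of vector quantization, the step that turns continuous image features into discrete tokens a transformer can predict. The codebook, its size, and the two-dimensional "patch features" are toy values chosen for illustration; they are assumptions and do not reflect WHAM's actual encoder or codebook.

```python
def quantize(vectors, codebook):
    """Map each vector to the index of its nearest codebook entry (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda k: sq_dist(v, codebook[k]))
        for v in vectors
    ]


def dequantize(indices, codebook):
    """Look up the codebook vector for each token index."""
    return [codebook[i] for i in indices]


# Hypothetical toy codebook with four entries (a real VQ-GAN learns its codebook).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# Hypothetical encoder outputs for three image patches.
patch_features = [(0.1, 0.2), (0.9, 0.8), (0.2, 0.9)]

tokens = quantize(patch_features, codebook)
print(tokens)                        # [0, 3, 2]
print(dequantize(tokens, codebook))  # [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

In a setup like this, the transformer only ever sees the integer tokens; the decoder half of the VQ-GAN turns predicted tokens back into pixels.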

  • Context length: 10 observation-action pairs (5560 tokens)
  • Image resolution: 300 x 180 pixels
  • Training data: 1 billion observation-action pairs at 10Hz
  • Available versions: 200M parameters (3.7GB) and 1.6B parameters (18.9GB)
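A few derived quantities follow directly from the figures listed above. The short sketch below works them out; the function names are hypothetical, and the context duration assumes that the context window is sampled at the same 10Hz rate as the training data, which the source does not state explicitly.

```python
# Figures taken from the implementation details above.
CONTEXT_PAIRS = 10       # observation-action pairs per context window
CONTEXT_TOKENS = 5560    # total tokens per context window
FRAME_RATE_HZ = 10       # sampling rate of the training pairs
NUM_GPUS = 98            # H100 GPUs used for training
TRAINING_DAYS = 5        # wall-clock training time in days


def tokens_per_pair(context_tokens: int, pairs: int) -> int:
    """Tokens spent on one observation-action pair."""
    if context_tokens % pairs != 0:
        raise ValueError("context does not divide evenly into pairs")
    return context_tokens // pairs


def context_seconds(pairs: int, rate_hz: int) -> float:
    """Gameplay time covered by one context window (assumes 10Hz sampling)."""
    return pairs / rate_hz


def gpu_days(num_gpus: int, days: int) -> int:
    """Total compute budget expressed in GPU-days."""
    return num_gpus * days


print(tokens_per_pair(CONTEXT_TOKENS, CONTEXT_PAIRS))  # 556
print(context_seconds(CONTEXT_PAIRS, FRAME_RATE_HZ))   # 1.0
print(gpu_days(NUM_GPUS, TRAINING_DAYS))               # 490
```

Under that assumption, each context window covers about one second of gameplay at 556 tokens per observation-action pair.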

Core Capabilities

  • World Modeling: Predicts visuals based on starting state and action sequence
  • Behavior Policy: Generates controller actions based on visual input
  • Full Generation: Creates both visuals and controller actions simultaneously
  • Consistency and Persistence: Generates game sequences that remain consistent and persistent over time
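The three generation modes above differ only in which parts of the sequence are supplied and which are predicted. The sketch below spells that out for a short rollout. It is an illustration under stated assumptions, not WHAM's actual interface: the `Mode` enum, the `plan_rollout` function, and the observation-then-action ordering within each step are all hypothetical.

```python
from enum import Enum


class Mode(Enum):
    WORLD_MODEL = "world_model"          # actions supplied, visuals predicted
    BEHAVIOR_POLICY = "behavior_policy"  # visuals supplied, actions predicted
    FULL_GENERATION = "full_generation"  # visuals and actions both predicted


def plan_rollout(mode, prompt_pairs, new_steps):
    """Return (step, slot, source) triples, where source is 'given' or 'predicted'.

    The first `prompt_pairs` steps are the starting context and are always given.
    """
    plan = []
    for step in range(prompt_pairs + new_steps):
        for slot in ("observation", "action"):
            if step < prompt_pairs or (
                mode is Mode.WORLD_MODEL and slot == "action"
            ) or (
                mode is Mode.BEHAVIOR_POLICY and slot == "observation"
            ):
                source = "given"
            else:
                source = "predicted"
            plan.append((step, slot, source))
    return plan


for mode in Mode:
    print(mode.value)
    for step, slot, source in plan_rollout(mode, prompt_pairs=1, new_steps=2):
        print(f"  step {step}: {slot:<11} {source}")
```

Running it shows that world modeling fills in only the observation slots after the prompt, the behavior policy fills in only the action slots, and full generation fills in both.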

Frequently Asked Questions

Q: What makes this model unique?

WHAM generates both the visuals and the controller actions of a gameplay sequence while keeping them consistent with the game environment and its physics. It is one of the first models to demonstrate effective world modeling for complex 3D game environments.

Q: What are the recommended use cases?

The model is specifically designed for academic research purposes and game development exploration. It's particularly useful for studying gameplay patterns, testing game scenarios, and creative iteration in game development within the context of Bleeding Edge.
