fast

physical-intelligence

FAST is an efficient action tokenizer for robotics that converts robot action sequences into discrete tokens for vision-language-action models.

Property      Value
Author        physical-intelligence
Model Type    Action Tokenizer
Source        HuggingFace

What is FAST?

FAST (Frequency-space Action Sequence Tokenization), introduced in the paper "FAST: Efficient Action Tokenization for Vision-Language-Action Models", is a tokenizer designed specifically for robotics applications. It converts sequences of robot actions into dense, discrete tokens that can be used to train autoregressive Vision-Language-Action (VLA) models. The release also includes FAST+, a universal action tokenizer trained on 1 million real robot action sequences.

Implementation Details

FAST is implemented as a HuggingFace AutoProcessor, so it can be loaded through the transformers library and integrated into existing workflows. It operates on 1-second action "chunks" that have been pre-normalized to the range [-1, 1] using quantile normalization. Both encoding and decoding support batched inference.
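The pre-normalization step described above can be sketched with NumPy. This is an illustrative sketch, not the authors' preprocessing code: the function names are hypothetical, the 1st/99th percentile bounds are an assumed choice, and the final clipping is added here only so that outliers stay inside [-1, 1].

```python
import numpy as np

def quantile_normalize(actions, low_q=0.01, high_q=0.99):
    """Scale each action dimension so its low/high quantiles map to -1/+1.

    `actions` has shape (num_chunks, time_steps, action_dim).
    The quantile levels are assumptions chosen for illustration.
    """
    actions = np.asarray(actions, dtype=np.float64)
    # Pool all chunks and time steps, keeping one column per action dimension.
    flat = actions.reshape(-1, actions.shape[-1])
    q_low = np.quantile(flat, low_q, axis=0)
    q_high = np.quantile(flat, high_q, axis=0)
    # Guard against constant dimensions, where the quantiles coincide.
    scale = np.maximum(q_high - q_low, 1e-8)
    normalized = 2.0 * (actions - q_low) / scale - 1.0
    # Clip outliers beyond the chosen quantiles (a choice made in this sketch).
    return np.clip(normalized, -1.0, 1.0), (q_low, q_high)

def quantile_denormalize(normalized, stats):
    """Invert quantile_normalize (exact only for values that were not clipped)."""
    q_low, q_high = stats
    scale = np.maximum(q_high - q_low, 1e-8)
    return (np.asarray(normalized) + 1.0) / 2.0 * scale + q_low

# Dummy data: 256 chunks, 50 time steps (1 second at an assumed 50 Hz), 14 dims.
rng = np.random.default_rng(0)
chunks = rng.normal(loc=3.0, scale=5.0, size=(256, 50, 14))
normalized, stats = quantile_normalize(chunks)
```

The statistics returned alongside the normalized data would need to be stored with the dataset, since decoded actions have to be mapped back to the robot's original units before execution.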

  • Simple installation through pip (transformers and scipy packages)
  • Supports batch processing of action sequences
  • Automatic dimension handling during decoding
  • Custom tokenizer training capabilities
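A minimal encode/decode sketch of the workflow listed above follows. It assumes the tokenizer is published on the HuggingFace hub as "physical-intelligence/fast" and loaded with trust_remote_code=True, and that the processor is callable on a batch of chunks and exposes decode(); check the model card for the exact interface. The helper names (load_fast_tokenizer, roundtrip, max_reconstruction_error, demo) are hypothetical.

```python
import numpy as np

def load_fast_tokenizer(repo_id="physical-intelligence/fast"):
    """Load the processor from the hub (requires transformers, scipy, network)."""
    # Imported lazily so the helpers below work without transformers installed.
    from transformers import AutoProcessor
    # Assumption: the processor class ships with the hub repo, so remote code
    # must be trusted explicitly.
    return AutoProcessor.from_pretrained(repo_id, trust_remote_code=True)

def roundtrip(tokenizer, action_chunks):
    """Encode a batch of normalized chunks to tokens, then decode them back."""
    tokens = tokenizer(action_chunks)   # assumed: one token sequence per chunk
    # Assumed: decode() reuses the dimensions of the batch just encoded.
    decoded = tokenizer.decode(tokens)
    return tokens, decoded

def max_reconstruction_error(original, decoded):
    """Largest absolute difference between original and decoded actions."""
    return float(np.max(np.abs(np.asarray(original) - np.asarray(decoded))))

def demo():
    """Run a round trip on dummy data; call manually when the hub is reachable."""
    tokenizer = load_fast_tokenizer()
    chunks = np.random.uniform(-1.0, 1.0, size=(4, 50, 14))
    tokens, decoded = roundtrip(tokenizer, chunks)
    print("tokens per chunk:", [len(t) for t in tokens])
    print("max error:", max_reconstruction_error(chunks, decoded))
```

Because tokenization is lossy compression, the decoded actions should be close to the inputs but not identical; measuring the reconstruction error on real data is a quick sanity check before training a model on the tokens.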

Core Capabilities

  • Efficient conversion of continuous action sequences to discrete tokens
  • Universal tokenization across different robot setups
  • Quick training of custom tokenizers on specific datasets
  • Seamless integration with HuggingFace ecosystem
  • Support for variable-length action sequences
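Training a custom tokenizer on a specific dataset might look like the sketch below. It assumes the loaded processor exposes a fit() method that returns a new tokenizer and that the result supports save_pretrained(); both should be verified against the model card. The wrapper fit_custom_tokenizer and its input checks are hypothetical additions for illustration.

```python
import numpy as np

def fit_custom_tokenizer(base_tokenizer, action_chunks, save_dir=None):
    """Fit a dataset-specific tokenizer starting from a loaded FAST processor.

    `action_chunks` has shape (num_chunks, time_steps, action_dim) and should
    already be normalized to [-1, 1].
    """
    action_chunks = np.asarray(action_chunks, dtype=np.float32)
    if action_chunks.ndim != 3:
        raise ValueError("expected shape (num_chunks, time_steps, action_dim)")
    if np.abs(action_chunks).max() > 1.0:
        raise ValueError("chunks should be normalized to [-1, 1] before fitting")
    # Assumption: fit() returns a tokenizer trained on the given chunks.
    custom = base_tokenizer.fit(action_chunks)
    if save_dir is not None:
        # Assumption: the fitted tokenizer can be saved like other processors.
        custom.save_pretrained(save_dir)
    return custom
```

A saved tokenizer could then be reloaded from the local directory in the same way as the hub version, so the rest of the training pipeline stays unchanged.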

Frequently Asked Questions

Q: What makes this model unique?

FAST+ is unique in its ability to handle a wide range of robot setups, action dimensions, and control frequencies through a universal tokenization approach. It's been trained on 1M real robot action sequences, making it robust and versatile.

Q: What are the recommended use cases?

The model is ideal for robotics applications requiring discrete representation of continuous action sequences, particularly in vision-language-action systems. It's especially useful for training autoregressive models and standardizing robot action data across different platforms.
