π0.5: a Vision-Language-Action Model with Open-World Generalization

A robot policy co-trained on web data and data from many robots, able to clean a kitchen it has never seen.

Physical Intelligence · arXiv 2025 · Model Architectures

A free, interactive, animated visual explainer of π0.5: a Vision-Language-Action Model with Open-World Generalization — every exhibit computed from the real formulas, with verbatim quotes from the source.

Questions

What is π0.5: a Vision-Language-Action Model with Open-World Generalization? A robot policy co-trained on web data and data from many robots, able to clean a kitchen it has never seen.
Who published π0.5: a Vision-Language-Action Model with Open-World Generalization, and where? Physical Intelligence, on arXiv in 2025 (arXiv:2504.16054).
Where can I find a visual explainer of π0.5: a Vision-Language-Action Model with Open-World Generalization? Right here: a free, interactive, animated walkthrough of the whole paper, with exhibits computed from the real formulas and verbatim quotes from the source.
