🤖 FlagOS-Robo is built on FlagOS, a unified, open-source AI system software stack that supports a wide range of AI chips. It serves as an integrated training and inference framework for AI models used in robots 🤖, i.e., Embodied Intelligence, and can be deployed across diverse scenarios, from edge to cloud. Because it is portable across chip models, it enables efficient training, inference, and deployment of both Vision-Language Models (VLMs) and Vision-Language-Action (VLA) models. Here, VLMs typically act as the brain 🧠 for task planning, while VLA models act as the cerebellum that outputs actions for robot control 🦾.
FlagOS-Robo provides a powerful computational foundation and systematic support for cutting-edge research and industrial applications in embodied intelligence, accelerating innovation and real-world deployment of intelligent agents.
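The brain/cerebellum split described above can be sketched as a simple control loop. This is a minimal, hypothetical illustration: the `VLMPlanner` and `VLAController` classes, their method names, and the trivial planning logic are all stand-ins invented here, not FlagOS-Robo APIs; a real deployment would back them with models such as a RoboBrain VLM and a Pi-0 VLA served through FlagScale.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Observation:
    image: bytes      # camera frame (placeholder)
    instruction: str  # natural-language task

class VLMPlanner:
    """Brain: decomposes a high-level instruction into subtasks."""
    def plan(self, obs: Observation) -> List[str]:
        # A real VLM would reason jointly over the image and the
        # instruction; here we just split the instruction on commas.
        return [f"subtask: {part.strip()}" for part in obs.instruction.split(",")]

class VLAController:
    """Cerebellum: maps a subtask plus an observation to a low-level action."""
    def act(self, obs: Observation, subtask: str) -> dict:
        # A real VLA model would emit continuous control,
        # e.g. 7-DoF arm deltas; zeros are a placeholder.
        return {"subtask": subtask, "action": [0.0] * 7}

def control_loop(obs: Observation) -> List[dict]:
    """Brain plans subtasks; cerebellum turns each into an action."""
    brain, cerebellum = VLMPlanner(), VLAController()
    return [cerebellum.act(obs, task) for task in brain.plan(obs)]

if __name__ == "__main__":
    obs = Observation(image=b"", instruction="pick up the cup, place it on the shelf")
    for step in control_loop(obs):
        print(step["subtask"])
```

In practice the planner runs at low frequency (often in the cloud) while the controller runs at high frequency on the edge, which is the edge-cloud coordination pattern FlagOS-Robo targets.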
- FlagScale, as the user entrypoint, supports training and inference for robot-related AI models, including Pi-0, Pi-0.5, RoboBrain2, and RoboBrainX0. RoboBrain2.5 and RoboBrainX0.5 will be released soon.
- FlagOS-Robo supports RoboOS-based cross-embodiment collaboration, ensuring compatibility across different data formats, efficient edge-cloud coordination, and real-machine evaluation.
| Models | Type | Checkpoint | Train | Inference | Serve | Evaluate |
|---|---|---|---|---|---|---|
| PI0 | VLA | Huggingface | ✅ Guide | ✅ Guide | ✅ Guide | ❌ |
| PI0.5 | VLA | Huggingface | ✅ Guide | ✅ Guide | ✅ Guide | ❌ |
| RoboBrain-2.0 | VLM | Huggingface | ✅ Guide | ✅ Guide | ✅ Guide | ✅ Guide |
| RoboBrain-2.5 | VLM | ❌ | ✅ Guide | ✅ Guide | ✅ Guide | ✅ Guide |
| RoboBrain-X0 | VLA | Huggingface | ✅ Guide | ❌ | ✅ Guide | ❌ |