DeepMind’s Gemini Robotics On-Device brings advanced AI to local robots

Google DeepMind has introduced an efficient, on-device robotics model designed for general-purpose dexterity and rapid task adaptation.
This new offering, known as Gemini Robotics On-Device, is an optimized version of the Gemini Robotics VLA (vision-language-action) model, which was initially launched in March to integrate Gemini 2.0’s multimodal reasoning into physical applications.
Gemini Robotics On-Device is specifically engineered to operate locally on robotic devices, ensuring robust performance in environments with limited or no network connectivity and providing the low-latency inference that latency-sensitive applications require.
The model demonstrates strong general-purpose dexterity and task generalization capabilities.
DeepMind is also releasing a Gemini Robotics SDK to facilitate developer evaluation of Gemini Robotics On-Device.
This SDK enables testing within the MuJoCo physics simulator and allows for quick adaptation to new domains with as few as 50 to 100 demonstrations. Developers can gain access to the SDK by enrolling in the trusted tester program.
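The SDK's public interface isn't detailed here, so the sketch below only illustrates what a MuJoCo-based evaluation loop generally looks like: load a scene, query a policy for actions, and step the physics. The scene XML and the `query_policy` stand-in are assumptions for illustration, not part of the actual Gemini Robotics SDK.

```python
# Minimal sketch of a MuJoCo evaluation loop. The real Gemini Robotics SDK
# API is not shown here; query_policy and the scene XML are placeholders.
import numpy as np
import mujoco

SCENE_XML = """
<mujoco>
  <worldbody>
    <body name="arm" pos="0 0 0.1">
      <joint name="hinge" type="hinge" axis="0 0 1"/>
      <geom type="capsule" size="0.02" fromto="0 0 0 0.2 0 0"/>
    </body>
  </worldbody>
  <actuator>
    <motor joint="hinge" ctrlrange="-1 1"/>
  </actuator>
</mujoco>
"""

def query_policy(observation: np.ndarray, instruction: str) -> np.ndarray:
    """Placeholder for an on-device VLA policy call; returns one action per actuator."""
    return np.zeros(1)

model = mujoco.MjModel.from_xml_string(SCENE_XML)
data = mujoco.MjData(model)

for step in range(1000):
    obs = np.concatenate([data.qpos, data.qvel])   # proprioceptive observation
    action = query_policy(obs, "rotate the arm")   # would come from the VLA model
    data.ctrl[:] = np.clip(action, -1.0, 1.0)      # apply actuator commands
    mujoco.mj_step(model, data)                    # advance the physics
```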
Model capabilities and performance
Gemini Robotics On-Device serves as a foundational robotics model for bi-arm robots, designed with minimal computational resource requirements. It expands upon the task generalization and dexterity capabilities of the broader Gemini Robotics platform, offering:
- Rapid experimentation with dexterous manipulation.
- Adaptability to new tasks through fine-tuning for improved performance.
- Optimization for local operation with low-latency inference.
The model exhibits robust visual, semantic, and behavioral generalization across various testing scenarios. It can follow natural language instructions and execute highly dexterous tasks, such as unzipping bags or folding clothes, directly on the robot.
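To make the low-latency, fully local operation concrete, the minimal sketch below shows the general shape of an on-robot control loop; every function in it (`load_local_model`, `read_camera`, `read_joint_state`, `send_joint_command`) is a hypothetical stand-in rather than the Gemini Robotics On-Device API.

```python
# Shape of a local, low-latency control loop. All helpers are hypothetical
# stand-ins, not the Gemini Robotics On-Device interface.
import time
import numpy as np

CONTROL_HZ = 10  # assumed control rate, for illustration only

def load_local_model():
    """Stand-in for loading an on-device checkpoint."""
    return lambda image, state, instruction: np.zeros(14)  # 14-DoF bi-arm action

def read_camera() -> np.ndarray:
    return np.zeros((224, 224, 3), dtype=np.uint8)  # placeholder RGB frame

def read_joint_state() -> np.ndarray:
    return np.zeros(14)  # placeholder proprioception

def send_joint_command(action: np.ndarray) -> None:
    pass  # would forward to the robot's low-level controller

model = load_local_model()
instruction = "unzip the bag"

for _ in range(100):
    start = time.monotonic()
    # Inference runs locally, so there is no network round-trip in the loop.
    action = model(read_camera(), read_joint_state(), instruction)
    send_joint_command(action)
    time.sleep(max(0.0, 1.0 / CONTROL_HZ - (time.monotonic() - start)))
```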
Evaluations indicate strong generalization performance in entirely local operation, outperforming other on-device alternatives on challenging out-of-distribution tasks and complex multi-step instructions.
For developers requiring state-of-the-art results without on-device limitations, the flagship Gemini Robotics model remains available.
Adaptability and generalization across embodiments
Gemini Robotics On-Device is the first VLA model from DeepMind available for fine-tuning. While many tasks can be performed out-of-the-box, developers have the option to adapt the model for enhanced performance in specific applications.
The model demonstrates rapid adaptation to new tasks, requiring fewer than 100 demonstrations, showcasing its ability to generalize foundational knowledge.
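The SDK's fine-tuning interface isn't documented here, so the following is only a generic behavior-cloning sketch of what adapting a pretrained policy to roughly 50 to 100 demonstrations can look like; `PretrainedPolicy`, the dimensions, and the demonstration format are assumptions, not the actual Gemini Robotics SDK.

```python
# Minimal behavior-cloning fine-tuning sketch on a small demonstration set.
# PretrainedPolicy and the (observation, action) demo format are placeholders,
# not the actual Gemini Robotics SDK interface.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

OBS_DIM, ACT_DIM, NUM_DEMOS, STEPS_PER_DEMO = 64, 14, 100, 50

class PretrainedPolicy(nn.Module):
    """Stand-in for a pretrained policy head being adapted to a new task."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(OBS_DIM, 256), nn.ReLU(), nn.Linear(256, ACT_DIM))
    def forward(self, obs):
        return self.net(obs)

# Roughly 100 demonstrations, each a short sequence of (observation, action) pairs.
observations = torch.randn(NUM_DEMOS * STEPS_PER_DEMO, OBS_DIM)
actions = torch.randn(NUM_DEMOS * STEPS_PER_DEMO, ACT_DIM)
loader = DataLoader(TensorDataset(observations, actions), batch_size=64, shuffle=True)

policy = PretrainedPolicy()
optimizer = torch.optim.AdamW(policy.parameters(), lr=1e-4)

for epoch in range(10):
    for obs, act in loader:
        loss = nn.functional.mse_loss(policy(obs), act)  # imitate demonstrated actions
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```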
The model has been tested on seven dexterous manipulation tasks, including zipping a lunch-box, drawing a card, and pouring salad dressing, where it outperformed existing on-device VLA models after fine-tuning.
Furthermore, DeepMind has successfully adapted Gemini Robotics On-Device to various robot embodiments. Although initially trained for ALOHA robots, it has been adapted to a bi-arm Franka FR3 robot and Apptronik’s Apollo humanoid robot.
On the Franka, it performs general-purpose instruction following, handling unseen objects and scenes, and executing precision-demanding tasks such as folding a dress or assembling an industrial belt.
On the Apollo humanoid, the same generalist model can follow natural language instructions and manipulate diverse objects, including previously unseen ones, in a generalized manner.
Responsible development and safety
DeepMind emphasizes responsible development for all Gemini Robotics models, adhering to its AI Principles and employing a holistic safety approach encompassing both semantic and physical safety.
Semantic and content safety are managed through the Live API, while models interface with low-level safety-critical controllers for action execution.
DeepMind recommends evaluating the end-to-end system using its semantic safety benchmark and conducting red-teaming exercises at all levels to identify potential safety vulnerabilities.
The company’s Responsible Development & Innovation (ReDI) team analyzes the real-world impact of Gemini Robotics models to maximize societal benefits and minimize risks.
The Responsibility & Safety Council (RSC) reviews these assessments, providing feedback for model development to further enhance benefits and mitigate risks.
Gemini Robotics On-Device is initially being released to a select group of trusted testers to gather feedback and gain a deeper understanding of its usage and safety profile.
Accelerating robotics innovation
Gemini Robotics On-Device represents a significant step towards making powerful robotics models more accessible and adaptable. DeepMind’s on-device solution aims to help the robotics community address critical latency and connectivity challenges.
The Gemini Robotics SDK is expected to further accelerate innovation by allowing developers to tailor the model to their specific needs. Interested parties can sign up for model and SDK access through the trusted tester program.
DeepMind looks forward to seeing what the robotics community builds with these new tools as it continues to advance the integration of AI into the physical world.