Control Robots Remotely with Apple Vision Pro; NVIDIA: "Human-Machine Integration Is Not Difficult"
Jensen Huang said, "The next wave of AI is robots, and one of the most exciting developments is humanoid robots." Today, Project GR00T has taken an important step forward.
Yesterday, during his SIGGRAPH 2024 keynote, NVIDIA founder Jensen Huang discussed the company's general-purpose humanoid robot model, "Project GR00T," which has received a series of functional updates.
Yuke Zhu, an assistant professor at the University of Texas at Austin and a senior research scientist at NVIDIA, posted a video demonstrating how NVIDIA integrated RoboCasa and MimicGen, large-scale frameworks for household-robot simulation and demonstration data generation, into the NVIDIA Omniverse platform and the Isaac robot development platform.
The video covers NVIDIA's three computing platforms (NVIDIA AI, Omniverse, and Jetson Thor) and how they are used to simplify and accelerate developer workflows. With the combined capabilities of these platforms, we are poised to enter an era of humanoid robots driven by physical AI.
Among the highlights is that developers can use Apple Vision Pro to remotely control humanoid robots to perform tasks.
Meanwhile, another senior research scientist at NVIDIA, Jim Fan, stated that the updates to Project GR00T are exhilarating. NVIDIA uses a systematic approach to scale robot data, addressing some of the most challenging problems in the robotics field.
The idea is simple: humans collect demonstration data on real robots, and NVIDIA scales this data a thousandfold or more in simulations. With GPU-accelerated simulations, people can now use computational power to replace the time-consuming, labor-intensive, and costly process of human data collection.
He mentioned that, not long ago, he believed teleoperation was fundamentally unscalable because, in the world of atoms, we are always constrained by the limit of 24 hours per robot per day. The new synthetic data pipeline used in GR00T breaks this limitation in the digital world.
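To make the scaling idea concrete, here is a minimal sketch, not NVIDIA's actual pipeline: a handful of recorded trajectories are multiplied by re-targeting each one to randomized object placements in simulation, MimicGen-style. All function names, array shapes, and numbers below are illustrative assumptions.

```python
# Minimal sketch of MimicGen-style data scaling (illustrative, not NVIDIA's API):
# re-target each human demonstration to many randomized object poses.
import numpy as np

rng = np.random.default_rng(0)

def retarget(demo_xyz: np.ndarray, old_obj: np.ndarray, new_obj: np.ndarray) -> np.ndarray:
    """Shift a trajectory so the motion relative to the object is preserved
    (translation-only here, for simplicity)."""
    return demo_xyz + (new_obj - old_obj)

# Five teleoperated demos, each a 200-step sequence of end-effector positions.
human_demos = [rng.normal(size=(200, 3)) for _ in range(5)]
object_pose = np.zeros(3)  # object location during the original recordings

synthetic = []
for demo in human_demos:
    for _ in range(1000):  # scale each demo a thousandfold
        new_pose = rng.uniform(-0.2, 0.2, size=3)  # randomized object placement
        synthetic.append(retarget(demo, object_pose, new_pose))

print(f"{len(human_demos)} human demos -> {len(synthetic)} synthetic trajectories")
```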
Regarding NVIDIA's latest advancements in humanoid robotics, one commenter observed that Apple Vision Pro has found its coolest use case.
NVIDIA begins to lead the next wave: physical AI.
NVIDIA also detailed the technical process of accelerating humanoid robots in a blog post. Here is the full content:
To accelerate the development of humanoid robots worldwide, NVIDIA announced a set of services, models, and computing platforms for leading robot manufacturers, AI model developers, and software makers globally, enabling them to develop, train, and build the next generation of humanoid robots.
This suite includes new NVIDIA NIM microservices and frameworks for robot simulation and learning, the NVIDIA OSMO orchestration service for running multi-stage robot workloads, and an AI- and simulation-enabled teleoperation workflow that allows developers to train robots with minimal human demonstration data.
Jensen Huang stated, "The next wave of AI is robots, and one of the most exciting developments is humanoid robots. We are advancing the entire NVIDIA robot stack, opening it up to humanoid robot developers and companies worldwide, enabling them to use the platforms, accelerated libraries, and AI models that best meet their needs."
Accelerating development with NVIDIA NIM and OSMO.
NIM microservices offer pre-built containers powered by NVIDIA inference software, reducing deployment time from weeks to minutes.
Two new AI microservices will allow roboticists to enhance generative physical AI simulation workflows in NVIDIA Isaac Sim.
The MimicGen NIM microservice generates synthetic motion data from teleoperated recordings captured with spatial computing devices such as Apple Vision Pro. The RoboCasa NIM microservice generates robot tasks and simulation-ready environments in OpenUSD.
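As a rough illustration of what "pre-built containers behind a microservice interface" means in practice, the sketch below posts a generation request to a locally deployed MimicGen NIM over HTTP. The endpoint path, port, and JSON fields are assumptions made purely for illustration; the actual interface is defined by NVIDIA's documentation.

```python
# Hypothetical client call to a locally deployed MimicGen NIM microservice.
# The URL and request/response schema below are assumptions, not NVIDIA's API.
import requests

MIMICGEN_URL = "http://localhost:8000/v1/generate"  # hypothetical local deployment

payload = {
    "source_demos": "demos/vision_pro_teleop.hdf5",  # hypothetical recorded demos
    "num_trajectories": 1000,                        # synthetic variants to generate
}

resp = requests.post(MIMICGEN_URL, json=payload, timeout=600)
resp.raise_for_status()
print("synthetic dataset written to:", resp.json().get("output_path"))
```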
The cloud-native managed service NVIDIA OSMO is now available, allowing users to coordinate and scale complex robot development workflows across distributed computing resources, whether on-premises or in the cloud. OSMO significantly simplifies robot training and simulation workflows, reducing deployment and development cycles from months to under a week.
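To picture what coordinating "multi-stage workloads across distributed resources" involves, here is a purely illustrative sketch of a workflow expressed as data, with each stage mapped to a different compute target. The stage names, tasks, and resource labels are invented; OSMO's actual workflow specification format is documented by NVIDIA.

```python
# Illustrative only: a multi-stage robot workflow expressed as plain data.
# Stage names, tasks, and resource labels are invented for this sketch.
workflow = {
    "stages": [
        {"name": "generate", "task": "isaac_sim_synthetic_data", "resource": "cloud-gpu-pool"},
        {"name": "train",    "task": "gr00t_model_training",     "resource": "dgx-cluster"},
        {"name": "deploy",   "task": "onboard_inference",        "resource": "jetson-thor"},
    ]
}

# An orchestrator's job, in miniature: dispatch each stage to its resource in order.
for stage in workflow["stages"]:
    print(f"stage {stage['name']!r}: run {stage['task']} on {stage['resource']}")
```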
Providing advanced data capture workflows for humanoid robot developers.
Training the foundation models behind humanoid robots requires vast amounts of data. One way to obtain human demonstration data is teleoperation, but this method is becoming increasingly expensive and time-consuming.
The NVIDIA AI and Omniverse teleoperation reference workflow, showcased at the SIGGRAPH computer graphics conference, allows researchers and AI developers to generate large amounts of synthetic motion and perception data from a minimal number of remotely captured human demonstrations.
First, developers use Apple Vision Pro to capture a small number of teleoperated demonstrations. Then, they simulate the recordings in NVIDIA Isaac Sim and use the MimicGen NIM microservice to generate synthetic datasets from them.
Developers use both real and synthetic data to train the Project GR00T humanoid robot foundation model, saving significant time and cost. They then use the RoboCasa NIM microservice in Isaac Lab, a robot learning framework, to generate experiences for retraining the robot model. Throughout the workflow, NVIDIA OSMO seamlessly assigns computing tasks to the appropriate resources, saving developers weeks of administrative work.
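Read end to end, the reference workflow is a simple pipeline from a few real demonstrations to a retrained model. The sketch below traces that data flow with placeholder functions; every name is a stand-in invented for illustration, not a real NVIDIA API.

```python
# End-to-end sketch of the reference workflow; all functions are placeholders.

def capture_teleop_demos(device: str, n: int) -> list[str]:
    """Record a small number of human demonstrations (e.g., with Apple Vision Pro)."""
    return [f"{device}_demo_{i}" for i in range(n)]

def mimicgen_expand(demos: list[str], factor: int) -> list[str]:
    """Stand-in for the MimicGen NIM step: multiply demos in simulation."""
    return [f"{d}_variant_{k}" for d in demos for k in range(factor)]

def robocasa_environments(n: int) -> list[str]:
    """Stand-in for the RoboCasa NIM step: generate OpenUSD training scenes."""
    return [f"kitchen_scene_{i}.usd" for i in range(n)]

real = capture_teleop_demos("vision_pro", n=10)   # costly human data, kept small
synthetic = mimicgen_expand(real, factor=1000)    # cheap simulated data, scaled up
scenes = robocasa_environments(n=50)              # varied environments for retraining

print(f"training GR00T on {len(real)} real + {len(synthetic)} synthetic demos "
      f"across {len(scenes)} simulated environments")
```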
Expanding access to NVIDIA humanoid robot developer technologies.
NVIDIA offers three computing platforms to simplify humanoid robot development: the NVIDIA AI supercomputer for training models; NVIDIA Isaac Sim, built on Omniverse, for robots to learn and refine skills in a simulated world; and the NVIDIA Jetson Thor humanoid robot computer for running models. Developers can access and use all or part of these platforms according to their specific needs.
Through the new NVIDIA Humanoid Robot Developer Program, developers can gain early access to new products and the latest versions of NVIDIA Isaac Sim, NVIDIA Isaac Lab, Jetson Thor, and the Project GR00T general-purpose humanoid robot foundation model.
1x, Boston Dynamics, ByteDance, Field AI, Figure, Fourier, Galbot, LimX Dynamics, Mentee, Neura Robotics, RobotEra, and Skild AI are among the first companies to join the early access program.
Developers can now join the NVIDIA Humanoid Robot Developer Program to access NVIDIA OSMO and Isaac Lab and soon gain access to NVIDIA NIM microservices.
Blog link: https://nvidianews.nvidia.com/news/nvidia-accelerates-worldwide-humanoid-robotics-development