Additionally, textured bumps on the sole enhance traction for stability across various surfaces. What’s more, the robot features new actuators: the A2 actuator provides 50 newton-meters of torque with a 48-degree range of motion, while the L1 and L4 actuators each deliver 150 newton-meters of torque with 195 and 135 degrees of range, respectively. Furthermore, the symmetrical leg design, enabled by software adaptability, also simplifies the robot’s manufacturing process. And perhaps most impressive are Figure 02’s remarkably lifelike hands, which now feature rubberized grip bumps, varying finger lengths, and even fully opposable thumbs that demonstrate remarkable dexterity and speed.
Importantly, these hexagonal 3D-printed structures enhance safety by eliminating pinch points during human-robot interaction. The video also revealed Figure 02’s head, featuring a display for battery life and lights for communicating intent, while the neck now features a finished mesh material for more human-like mobility. Meanwhile, NVIDIA showcased its own breakthroughs in humanoid task execution, using Apple’s Vision Pro headset to obtain more realistic training data. This approach aims to bridge the gap between simulation and the real world so robots can perform everyday tasks. To that end, NVIDIA recently updated Project GR00T, its AI platform for developing humanoid robots.
But the major obstacle in this field has been the scarcity of high-quality training data. NVIDIA’s solution combines human-generated data with synthetic data to create a more robust dataset for training robots. In fact, NVIDIA’s head of embodied AI shared further insights into this novel use of the Apple Vision Pro, explaining that by wearing the headset, humans can control robots from a first-person perspective to perform complex, dexterous tasks. The Vision Pro captures human hand poses and translates those motions to the robot’s hands in real time, creating an immersive experience akin to controlling a robot avatar.
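To make the idea concrete, here is a minimal, hypothetical sketch of the kind of hand-pose retargeting such a teleoperation pipeline performs, assuming the headset streams 21 hand keypoints per frame in the common wrist-plus-fingertips landmark layout. The function name, keypoint indices, and distance thresholds are illustrative stand-ins, not NVIDIA’s actual implementation.

```python
import numpy as np

# Hypothetical sketch: map tracked hand keypoints to robot finger commands.
# Assumes a (21, 3) array of keypoint positions in meters per frame, in the
# common 21-landmark hand layout (wrist = index 0, fingertips = 4/8/12/16/20).
# Thresholds and the mapping below are illustrative, not NVIDIA's pipeline.

WRIST = 0
FINGERTIPS = [4, 8, 12, 16, 20]  # thumb, index, middle, ring, pinky

def retarget_hand(keypoints: np.ndarray,
                  open_dist: float = 0.18,
                  closed_dist: float = 0.05) -> np.ndarray:
    """Convert fingertip-to-wrist distances into normalized joint commands.

    Returns a (5,) array in [0, 1], where 0 is fully open and 1 is fully
    closed, which a downstream controller would map onto the robot hand.
    """
    dists = np.linalg.norm(keypoints[FINGERTIPS] - keypoints[WRIST], axis=1)
    closure = (open_dist - dists) / (open_dist - closed_dist)
    return np.clip(closure, 0.0, 1.0)

# Example: a flat, open hand (all fingertips ~18 cm from the wrist) maps to 0.
open_hand = np.zeros((21, 3))
open_hand[FINGERTIPS, 0] = 0.18
print(retarget_hand(open_hand))  # -> [0. 0. 0. 0. 0.]
```

Running this per frame and streaming the output to the hand controller is what makes the avatar-like control feel immediate.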
And while teleoperation is still slow and labor-intensive, the data it yields is invaluable. NVIDIA multiplies this data using its RoboCasa simulation framework, expanding the dataset by a factor of a thousand or more. Additionally, NVIDIA’s MimicGen system generates new actions based on the original human data, filtering out unsuccessful attempts to ensure quality. This approach is a game-changer because it trades expensive human data collection for GPU-accelerated simulation: teleoperation is inherently limited by how many hours human operators can physically work, but the new GR00T synthetic data pipeline overcomes that bottleneck in the digital world, significantly expanding the volume and variety of available training data.
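As a toy illustration of that generate-then-filter idea (not RoboCasa’s or MimicGen’s actual APIs), the sketch below varies the scene around one human demonstration, adapts the demonstrated trajectory to the new scene, and keeps only the variants that still complete the task; the data types, the planar setup, and the success check are all simplified stand-ins for a full physics simulation.

```python
import random
from dataclasses import dataclass

# Toy sketch of MimicGen-style data multiplication: vary the scene around
# one human demo, adapt the trajectory, and keep only successful variants.
# The Demo type, the 2D setup, and the success check are stand-ins for a
# full physics simulation.

@dataclass
class Demo:
    object_pos: tuple   # (x, y) object location in the original demo
    waypoints: list     # demonstrated end-effector (x, y) waypoints

def multiply_demo(demo: Demo, n_variants: int = 1000, tol: float = 0.02):
    kept = []
    for seed in range(n_variants):
        rng = random.Random(seed)
        # Scene variation: move the object to a new random position.
        dx, dy = rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1)
        obj = (demo.object_pos[0] + dx, demo.object_pos[1] + dy)
        # Trajectory adaptation: shift waypoints toward the new object,
        # with execution noise so that some variants genuinely fail.
        wps = [(x + dx + rng.gauss(0, 0.01), y + dy + rng.gauss(0, 0.01))
               for x, y in demo.waypoints]
        # Quality filter: keep the variant only if the final waypoint
        # actually reaches the relocated object.
        if abs(wps[-1][0] - obj[0]) < tol and abs(wps[-1][1] - obj[1]) < tol:
            kept.append((obj, wps))
    return kept

seed_demo = Demo(object_pos=(0.5, 0.3), waypoints=[(0.0, 0.0), (0.5, 0.3)])
print(len(multiply_demo(seed_demo)), "successful variants from 1 demo")
```

The key design point is the filter at the end: generation is cheap in simulation, so the pipeline can afford to discard failures and keep only verified successes.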
But at this year’s SIGGRAPH conference, NVIDIA CEO Jensen Huang discussed what he calls the three-computer problem in robotics, explaining that developing robotics AI requires separate computers for training the AI, simulating it, and running it on the actual robot. This multi-stage process ensures that AI models are thoroughly designed, tested, and optimized before they are deployed in real-world scenarios. And NVIDIA is advancing robotics through open-source initiatives too: RoboCasa is now fully open-source and available at robocasa.ai, and MimicGen is also open-source, currently supporting robotic arms, with future versions in development for humanoids and five-fingered hands.
But that’s just the beginning, because to further accelerate humanoid development, NVIDIA has made three powerful computing platforms available: NVIDIA AI supercomputers for training models, NVIDIA Isaac Sim, built on Omniverse, for refining skills in simulated environments, and NVIDIA Jetson Thor robot computers for running those models, letting developers use any or all of these platforms based on their specific project needs. On top of this, NVIDIA has also introduced the Humanoid Robot Developer Program, granting developers early access to the latest advances across NVIDIA Isaac Sim, NVIDIA Isaac Lab, Jetson Thor, and the Project GR00T general-purpose humanoid foundation models.
Plus, prominent organizations like Boston Dynamics, ByteDance Research, Field AI, Figure, Fourier, LimX Dynamics, Neura Robotics, RobotEra, and Skild AI have joined the early-access program, allowing them to leverage NVIDIA’s state-of-the-art technology to push the boundaries of robotics. And it’s all accessible to developers via NVIDIA OSMO, Isaac Lab, and soon NVIDIA NIM microservices, ensuring they have the industry’s leading tools at their disposal. And finally, NVIDIA’s biggest breakthrough could be its work on Universal Scene Description, also known as OpenUSD, which now incorporates generative AI and NIM microservices.
Incredibly, NVIDIA’s generative AI for OpenUSD, the first of its kind, understands OpenUSD-based language, geometry, materials, physics, and spaces. And by combining OpenUSD with NVIDIA’s accelerated development frameworks on the Omniverse platform, anyone can now visualize and simulate almost anything they can imagine, first in NVIDIA simulations and then in the real world. These new NVIDIA microservices feature AI models that generate OpenUSD language, answer queries, create OpenUSD Python code, apply materials to 3D objects, and comprehend 3D space and physics. These tools expedite digital-twin development, robotics, industrial design, and many other simulation workflows.
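For a sense of what “OpenUSD Python code” means in practice, here is a short scene-authoring snippet of the kind such a service might produce, written against the open-source pxr modules; the file name and prim paths are arbitrary examples, not output from NVIDIA’s actual models.

```python
from pxr import Usd, UsdGeom, Gf

# Author a minimal OpenUSD scene: a 2-unit cube under a /World transform.
stage = Usd.Stage.CreateNew("example_scene.usda")
world = UsdGeom.Xform.Define(stage, "/World")
stage.SetDefaultPrim(world.GetPrim())

cube = UsdGeom.Cube.Define(stage, "/World/Cube")
cube.GetSizeAttr().Set(2.0)
UsdGeom.XformCommonAPI(cube.GetPrim()).SetTranslate(Gf.Vec3d(0.0, 1.0, 0.0))

stage.GetRootLayer().Save()
```

Generating, validating, and querying exactly this kind of layered scene description is what the microservices below are built around.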
Plus, developer tools enable streaming of large, RTX ray-traced datasets to the Apple Vision Pro. NVIDIA’s generative AI models for OpenUSD also integrate AI copilots and agents into USD workflows, enhancing capabilities in 3D worlds and promoting USD adoption across the manufacturing, automotive, and robotics sectors. Specifically, the preview microservices include USD Code NIM, which generates OpenUSD Python code from text prompts and answers general OpenUSD questions; USD Search NIM, which searches large libraries of OpenUSD, 3D, and image data using natural-language or image inputs; and USD Validate NIM, which checks file compatibility against OpenUSD versions and generates RTX-rendered, path-traced images.
Upcoming microservices include USD Layout NIM, which creates OpenUSD-based scenes from text prompts using spatial intelligence; USD SmartMaterial NIM, which applies realistic materials to CAD objects; fVDB Mesh Generation NIM, which converts point-cloud data to OpenUSD meshes; fVDB Physics Super-Res NIM, which upscales frames for high-quality physics simulations; and fVDB NeRF-XL NIM, which generates large-scale neural radiance fields in OpenUSD.