Transcript
On top of this, the GR2 is equipped with six array-type tactile sensors, giving the robot’s hands an unprecedented level of sensitivity and precision when it comes to real-world object manipulation. This could enable the execution of tasks requiring fine motor skills, such as delicately assembling parts in a factory. But the most important part of the GR2’s performance is likely its FSA 2.0 actuators, cutting-edge components that deliver peak torques exceeding 380 newton meters for powerful and efficient movement. Furthermore, the incorporation of a dual encoder system effectively doubles the robot’s control accuracy, allowing for even smoother and more precise actions.
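To make the dual-encoder idea concrete, here is a minimal sketch of how readings from a motor-side and an output-side encoder might be fused. The gear ratio, counts per revolution, blend weight, and function name are illustrative assumptions, not Fourier’s actual control scheme.

```python
# A minimal sketch (not Fourier's published firmware) of why two
# encoders help: the motor-side encoder gives fine resolution through
# the gearbox, while the output-side encoder gives an absolute,
# backlash-free reference of the actual joint angle.
GEAR_RATIO = 50.0       # assumed reduction ratio, illustrative only
MOTOR_CPR = 4096        # assumed encoder counts per motor revolution

def fuse_encoders(motor_counts: int, output_angle_deg: float,
                  blend: float = 0.98) -> float:
    """Complementary-filter style fusion of the two readings."""
    # Fine estimate: motor shaft angle divided down by the reduction.
    fine_deg = (motor_counts / MOTOR_CPR) * 360.0 / GEAR_RATIO
    # Lean on the fine estimate for resolution, and on the absolute
    # output-side encoder to correct drift and backlash.
    return blend * fine_deg + (1.0 - blend) * output_angle_deg

print(fuse_encoders(motor_counts=204800, output_angle_deg=359.5))
```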
And for developers, Fourier Intelligence also introduced the Fourier Toolkit as a comprehensive suite of optimized tools to enable community-driven innovation, with this toolkit supporting a wide array of frameworks, including NVIDIA’s Isaac Lab, MuJoCo, and the Robot Operating System (ROS), among others. Plus, NVIDIA researchers just revealed their newest AI that not only performs unbelievably complex tasks, but also explains how it works in its own words, like this. And now for the twist, because unlike its predecessors that relied on separate AI models, Masked Mimic instead takes a unified, multimodal approach with a key advantage.
This leg-up is the addition of a dataset containing kinematic motion trajectories, textual descriptions, and object interactions, which gives this AI a wide range of abilities without needing task-specific models or custom reward functions. The secret to this performance lies partly in the system’s two-stage architecture. To start, a full-body motion tracking controller is trained using reinforcement learning, allowing the model to generate actions that guide a physically simulated character through desired motion paths. This controller can navigate variable terrains and interact with objects, laying the foundation for more complex behaviors.
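The transcript does not spell out the reward used in this first stage; a common choice in physics-based character animation is a DeepMimic-style exponential tracking reward, sketched here with illustrative weights and scales.

```python
import numpy as np

# DeepMimic-style exponential tracking reward, a common pattern for
# stage-one motion-tracking controllers like this; the exact reward
# Masked Mimic uses is not given in the transcript, so the weights
# and error scales below are assumptions.
def tracking_reward(q, q_ref, v, v_ref, w_pose=0.65, w_vel=0.35):
    pose_err = np.sum((q - q_ref) ** 2)   # squared joint-angle error
    vel_err = np.sum((v - v_ref) ** 2)    # squared joint-velocity error
    return (w_pose * np.exp(-2.0 * pose_err)
            + w_vel * np.exp(-0.1 * vel_err))

# Perfect tracking yields the maximum reward of w_pose + w_vel = 1.0.
q = q_ref = np.zeros(23); v = v_ref = np.zeros(23)
print(tracking_reward(q, q_ref, v, v_ref))  # 1.0
```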
Then, in the second stage, this controller is distilled into a more flexible, partially constrained version, with the model refined to process multimodal inputs, including its movements, incoming text commands, and object interactions. This results in Masked Mimic even being able to reconstruct entire motions from partial observations. Additionally, at the heart of Masked Mimic is a variational autoencoder with a conditional prior. Put simply, it’s an architecture that processes inputs from different modalities as separate tokens, mapping partial constraints to a distribution of possible solutions. The result is a model capable of reconstructing various types of motion, even those it wasn’t explicitly trained on.
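As a rough illustration of that idea, here is a hypothetical PyTorch sketch of a conditional prior that attends over per-constraint tokens and masks out whatever modalities are missing. The class name, layer sizes, and pooling are assumptions for illustration, not the paper’s actual implementation.

```python
import torch
import torch.nn as nn

class MaskedConditionalPrior(nn.Module):
    """Hypothetical sketch: map whichever constraints are present
    (pose keyframes, text embeddings, object boxes) to a latent
    Gaussian; missing modalities are simply masked out."""
    def __init__(self, token_dim=64, latent_dim=32, n_heads=4):
        super().__init__()
        self.encoder = nn.TransformerEncoderLayer(
            d_model=token_dim, nhead=n_heads, batch_first=True)
        self.to_mu = nn.Linear(token_dim, latent_dim)
        self.to_logvar = nn.Linear(token_dim, latent_dim)

    def forward(self, tokens, missing):
        # tokens:  (B, T, token_dim), one token per constraint
        # missing: (B, T) bool, True where a constraint is absent
        h = self.encoder(tokens, src_key_padding_mask=missing)
        keep = (~missing).unsqueeze(-1).float()
        pooled = (h * keep).sum(1) / keep.sum(1).clamp(min=1.0)
        mu, logvar = self.to_mu(pooled), self.to_logvar(pooled)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return z, mu, logvar
```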
But Masked Mimic excels in complex motion tracking, too, from cartwheels to full-body movements. This addresses the challenge of creating natural body movements in VR from limited input data: the system generates fluid, natural movement for the entire body based on just the head or hand positions, for instance. Furthermore, the system is able to handle difficult terrains, maintaining consistent character balance and movement styles across various environments. It can adapt to slopes or rocky terrain while preserving the essence of the original motion, showcasing its generalizability in new scenarios. Plus, Masked Mimic also supports text-to-motion control, allowing users to instruct the AI to perform specific tasks through natural language commands (the sparse-input case is sketched below).
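Continuing the hypothetical sketch above, the VR case would amount to marking only the head and hand tokens as observed; the stub encoder and all dimensions here are made up for illustration.

```python
def encode_pose(label: str):
    # Hypothetical stub: a real system would embed a 6-DOF tracker
    # pose into the shared token space instead of returning noise.
    return torch.randn(64)

prior = MaskedConditionalPrior()
B, T, D = 1, 24, 64                      # batch, constraint tokens, dims
tokens = torch.zeros(B, T, D)
tokens[0, 0] = encode_pose("headset")    # head position/orientation
tokens[0, 1] = encode_pose("left hand")
tokens[0, 2] = encode_pose("right hand")
missing = torch.ones(B, T, dtype=torch.bool)
missing[0, :3] = False                   # only three constraints observed
z, mu, logvar = prior(tokens, missing)   # sample a plausible full body
```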
From balancing on one foot to attempting a handstand, the system can interpret and execute a wide range of instructions. Moreover, it interacts naturally with objects, using high-level control commands and generating appropriate interactions based on object bounding boxes. And in a holographic breakthrough, Voxon Photonics finally offers true 3D holograms with the VX2, but there’s more. Hi, how are you? I’m doing great, thanks. Just here, ready to help out. How about you? My name is Inby. Nice to meet you, Inby. What’s on your mind? It’s actually Inby. My name is Inby. Got it, Inby. Nice.
Thanks for the correction. What would you like to chat about? Hmm. My job is a clinical research coordinator in oncology. That’s really fascinating work. Clinical research in oncology is super important. What do you enjoy most about it? I think the part that I enjoy the most is interacting with the patients with cancer. Sounds like you find value in those connections. Hey, Jeannie, what can you see? I see a person with dark hair, wearing a striped cardigan and a white shirt, smiling, in front of a backdrop featuring a logo or brand name.
Hey, Jeannie, what do you see now? The image appears to show a digital display or hologram face featuring a person with vibrant green hair. Can you believe that? That’s you. It’s like I’m making my holographic debut. It’s a bit wild, but I don’t care. Um, can we talk in Korean? This is made possible with Voxon’s unique VLED technology, which organises millions of points of light floating in 3D space to power the hologram. But the magic happens by combining graphics processing, high-speed LED matrix rendering and a clever exploitation of what’s known as the persistence of vision phenomenon.
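To get a feel for why the high-speed rendering matters, here is some back-of-envelope arithmetic for a swept-volume display; the volume rate and slice count are illustrative assumptions, not Voxon’s published VX2 specifications.

```python
# Illustrative numbers only, not Voxon's published VX2 specs: to
# exploit persistence of vision, every 2-D slice of the volume must
# be redrawn faster than the eye can notice.
volume_rate_hz = 15          # full 3-D volumes drawn per second (assumed)
slices_per_volume = 200      # depth layers swept per volume (assumed)

slice_rate_hz = volume_rate_hz * slices_per_volume
slice_budget_us = 1_000_000 / slice_rate_hz
print(f"LED matrix must redraw {slice_rate_hz} slices/s")     # 3000
print(f"about {slice_budget_us:.0f} microseconds per slice")  # ~333
```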
And the applications for this technology are practically universal. For instance, in the medical field, the VX2 is already being used to visualise complex anatomical structures, allowing surgeons to plan procedures with unprecedented precision. This provides a highly detailed 3D view of a patient’s organs floating in front of the surgeon, with the ability to manipulate the image in real time. And in the realm of scientific research, the VX2 is also proving its worth: researchers use it to visualise complex astronomical data, literally walking around a 3D model of a galaxy cluster to transform their understanding of cosmic structures.
And in gaming, companies are scrambling to develop titles that take advantage of the VX2’s unique capabilities. This extends past graphics towards a complete redesign of the gaming experience from the ground up, where games could exist in real 3D space. But education might be transformed by these holograms, as students can visually engage with complex concepts in ways that would have previously been impossible, like exploring molecular structures or dissecting virtual frogs. But despite its impressive capabilities, the VX2 is not without challenges, as content creation for the platform requires a bit of optimisation, and some applications may benefit from larger versions of the device.
However, Voxon has made the system developer-friendly with software development kits and APIs to encourage innovation. And finally, researchers have created a reversible robotic hand that addresses long-standing challenges in object manipulation and robot mobility by allowing the hand to grasp objects from both sides, plus detach from its arm. Then, it can crawl like a spider to reach items outside its typical workspace. Furthermore, the four fingers have four degrees of freedom each, with researchers using genetic algorithms and comprehensive grasp taxonomies to determine the optimal number and placement of fingers for both grasping and crawling capabilities.
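The transcript doesn’t detail the researchers’ search procedure, so here is a minimal genetic-algorithm sketch of the kind of design optimisation described above; the genome, fitness terms, and weights are all hypothetical.

```python
import random

# Hypothetical genome: (number of fingers, angular spacing in degrees).
# The fitness terms stand in for the taxonomy-based grasp and crawl
# scores the researchers actually used, which the transcript omits.
def fitness(genome):
    n_fingers, spacing = genome
    grasp_score = 1.0 - abs(n_fingers - 4) * 0.2    # favour ~4 fingers
    crawl_score = 1.0 - abs(spacing - 90.0) / 180.0  # favour even spacing
    return 0.5 * grasp_score + 0.5 * crawl_score

def evolve(pop_size=20, generations=50):
    pop = [(random.randint(2, 6), random.uniform(30.0, 180.0))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]               # elitist selection
        children = [(random.choice(parents)[0],      # crossover of genes
                     random.choice(parents)[1]
                     + random.gauss(0.0, 5.0))       # spacing mutation
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=fitness)

print(evolve())  # tends toward (4, ~90): four evenly spaced fingers
```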
Plus, simulations using MuJoCo helped refine the hand’s locomotion, which was then successfully tested in real-world scenarios, where the hand demonstrated its ability to crawl while even carrying objects, showcasing a seamless transition between manipulation and locomotion. In practical tests, the hand proved capable of grasping multiple objects using both sides independently. When faced with out-of-reach items, it detached from the arm, crawled forward to grasp the object, and returned to reconnect. This unique robotic hand design could transform dexterous tasks in multiple industries, but there’s still a lot of fine-tuning to do as development continues.
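For readers unfamiliar with MuJoCo, here is a minimal sketch of the simulation loop such a locomotion test rests on; the model XML is a stand-in placeholder, not the researchers’ actual hand model.

```python
import mujoco

# Stand-in model: a floating "palm" box over a ground plane. The real
# project would load the researchers' articulated hand model instead.
XML = """
<mujoco>
  <worldbody>
    <geom type="plane" size="1 1 0.1"/>
    <body name="palm" pos="0 0 0.1">
      <joint type="free"/>
      <geom type="box" size="0.05 0.05 0.01"/>
    </body>
  </worldbody>
</mujoco>
"""

model = mujoco.MjModel.from_xml_string(XML)
data = mujoco.MjData(model)
for step in range(500):   # one simulated second at the 2 ms default step
    # A real test would write a crawling gait pattern into data.ctrl
    # here before stepping the physics.
    mujoco.mj_step(model, data)
print("palm height after settling:", data.qpos[2])
```

As the transcript describes, gaits refined in simulation like this were then transferred to the physical hand for real-world trials.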