Summary
Transcript
In fact, by combining a proprietary harmonic joint module with advanced reinforcement and imitation learning techniques, EngineAI is setting a new standard in robotic mobility. The robot’s end-to-end neural network models enhance its task execution, and its x86 platform, pairing an Intel N97 with Nvidia’s Jetson Orin, gives the PM01 a high-performance dual-chip architecture with enhanced visual perception thanks to its built-in depth camera. As for power, the robot packs a 10,000 milliamp-hour battery that can be quick-released and swapped out. As for mobility, each leg packs six degrees of freedom, with three in the hip, one in the knee, and two in the ankle, giving the robot a walking speed of up to two meters per second while operating for approximately two hours on a single charge.
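As a rough sanity check on those battery figures (a back-of-the-envelope sketch based only on the numbers quoted above, not a manufacturer spec): a 10,000 mAh pack lasting about two hours implies an average current draw of roughly 5 amps.

```python
# Back-of-the-envelope estimate of the average current draw implied by
# the quoted specs. Real draw varies with load, gait, and pack voltage.

battery_capacity_ah = 10.0  # 10,000 mAh = 10 Ah (from the transcript)
runtime_hours = 2.0         # approximate runtime on a single charge

avg_current_a = battery_capacity_ah / runtime_hours
print(f"Implied average current draw: {avg_current_a:.1f} A")
```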
Plus, the robot’s full-aluminum exoskeleton is built to endure tough environments, with integrated sealing and heat-dissipation technology ensuring consistent performance. The robot even features a screen on its chest, surround-sound 3D speakers, and a multi-array microphone for maximum interactivity. And at just $12,000, EngineAI is aiming for affordability in both home and work environments, with a dedicated team of over 30 researchers using Nvidia’s Isaac Gym to run extensive virtual testing before real-world deployment. But Leju Robotics has also just unveiled another brand-new humanoid robot designed to excel in intelligence and adaptability.
Weighing approximately 45 kilograms and featuring 26 degrees of freedom across its body, the Kuavo robot boasts a maximum walking speed of 4.6 kilometers per hour, can perform continuous jumps of over 20 centimeters, and can navigate varied terrain including sand, grass, and uneven landscapes. At its core, the robot uses self-developed integrated joints capable of delivering peak torque exceeding 300 newton-meters, with these high-torque-density joints engineered for dynamic, high-precision tasks that enable more demanding movements while maintaining maximum control and power.
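For a sense of what a 20-centimeter jump demands (a simple point-mass ballistic sketch, ignoring air resistance and leg dynamics): the takeoff velocity follows from v = sqrt(2·g·h), which works out to just under 2 m/s.

```python
import math

# Minimum takeoff velocity for the quoted jump height, using simple
# projectile kinematics (point-mass model, no air resistance).

g = 9.81              # gravitational acceleration, m/s^2
jump_height_m = 0.20  # "over 20 centimeters" per the transcript

takeoff_velocity = math.sqrt(2 * g * jump_height_m)
print(f"Minimum takeoff velocity: {takeoff_velocity:.2f} m/s")
```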
On top of this, the robot is also equipped with advanced multimodal perception systems, including a depth camera for 360-degree visual awareness. It also supports secondary development with customizable Kaihong sensors, making it highly adaptable for specialized applications. As for embodied intelligence, the robot is powered by Huawei’s Pangu embodied-intelligence large language model, significantly enhancing its cognitive abilities and generalization skills. This AI framework allows the robot to perform complex tasks such as object recognition, interactive question answering, high-fiving, handing over items, and executing multi-step task plans with over 10 sequential actions.
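The multi-step task plans described above can be pictured as an ordered list of primitive actions executed in sequence. A minimal sketch of that idea (the action names and the `execute_plan` helper here are illustrative, not Leju’s or Huawei’s actual API):

```python
from typing import Callable

# Hypothetical sketch of a sequential task plan: each step is a named
# primitive the robot executes in order, stopping if any step fails.

def execute_plan(plan: list[str], act: Callable[[str], bool]) -> int:
    """Run steps in order; return the number completed successfully."""
    done = 0
    for step in plan:
        if not act(step):
            break
        done += 1
    return done

# Example plan with more than 10 sequential actions, as described.
plan = ["locate cup", "approach table", "reach", "grasp", "lift",
        "turn", "walk to sink", "lower", "release", "open tap", "rinse"]

completed = execute_plan(plan, act=lambda step: True)  # stub actuator
print(f"Completed {completed}/{len(plan)} steps")
```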
Plus, the Pangu model can even generate training simulations, accelerating the robot’s learning curve for challenging real-world scenarios. Because of this, Leju Robotics is currently targeting both industrial and domestic applications with its Kuavo robot. In household settings, for example, the robot can handle tasks like washing, watering plants, arranging flowers, and drying clothes, while in industrial environments it has been tested on operations such as electrical inspections and tin dipping, with future applications expected to include roles like exhibition-hall guidance and customer interaction. But Leju’s huge advantage lies in Huawei’s provision of the robot’s advanced software, including the Pangu AI model and the HarmonyOS (Hongmeng) operating system, with Leju focusing mainly on hardware innovation.
And all the while, OpenAI may be about to develop its own robots, though the timeline for this potential project and its intended use cases have yet to be clarified, leaving many to speculate about OpenAI’s long-term ambitions in the space. Nevertheless, humanoid robots using large language models have demonstrated immense potential across a variety of industries, from optimizing warehouse logistics and enhancing industrial workflows to providing in-home assistance, with the humanoid robot market projected to grow to over $66 billion by 2032. In fact, the company shut down its robotics division back in 2021, but earlier this year OpenAI revived its robotics research group after the three-year pause, having made multiple peripheral investments in its competitors in the meantime.
Interestingly, OpenAI itself has invested strategically in robotics companies like Figure AI and 1X. However, other sources suggest that humanoid robots aren’t currently a top priority for OpenAI compared to its ongoing work on cutting-edge AI models, with OpenAI recently launching its newest models, o3 and o3-mini, both designed to excel at complex reasoning tasks. But OpenAI’s o3 might soon be outshined by a group of AI researchers from Alibaba’s Qwen team, who’ve just unveiled QVQ-72B-Preview, an experimental open-source model that’s pushing the boundaries of visual reasoning. Designed to analyze images and solve problems step by step, this model doesn’t just process visual inputs.
Instead, it thinks through them, reflecting on instructions and delivering answers with confidence scores, a hallmark of advanced reasoning systems. Furthermore, QVQ-72B-Preview builds on Qwen’s existing 72B vision-language model but introduces enhanced reasoning capabilities, with Qwen claiming this is the first open-source model of its kind, though the team hasn’t yet explained how it relates to their recently released QwQ reasoning model. Nevertheless, to test its capabilities, Qwen’s AI researchers subjected their model to four benchmarks: MMMU for college-level visual understanding, MathVista for reasoning over mathematical graphs, MathVision for tackling competition-level math problems, and OlympiadBench for Olympiad-level math and physics challenges in both Chinese and English.
As for results, QVQ outperformed its predecessor across the board while achieving accuracy on par with leading closed-source models from OpenAI and Anthropic. Despite these achievements, though, QVQ-72B-Preview is not without its flaws: it reportedly still sometimes switches languages unexpectedly or gets stuck in circular reasoning loops. Anyways, like and subscribe and check out these bonus clips. Thanks for watching!