Transcript
This hierarchical control system enables the robot to understand and respond to complex service scenarios while continuously learning and improving its operations. But the D7's flexibility and speed specifications are equally impressive. Powered by a battery exceeding one kilowatt-hour, it can operate continuously for over eight hours. Its 360-degree omnidirectional movement and maximum speed of two meters per second allow for efficient navigation, even on slopes of up to 10 degrees. The bionic arms can lift 10 kilograms with an end-point precision of 0.1 millimeters. Additionally, Pudu Robotics' strategy extends beyond the D7, encompassing a vision for the future of service robotics.
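As a quick sanity check on those figures, the quoted battery capacity and runtime imply a fairly modest average power draw (treating the "over one kilowatt-hour" and "over eight hours" claims as lower bounds):

```python
# Lower-bound figures quoted for the D7 (both stated as "over")
battery_capacity_wh = 1000  # 1 kWh = 1000 Wh
runtime_h = 8

# Average draw implied by running the battery flat over the runtime
avg_power_w = battery_capacity_wh / runtime_h
print(avg_power_w)  # 125.0
```

That's roughly the draw of a bright desk lamp, which is plausible for a wheeled service robot that spends much of its time idling between tasks.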
The company is developing three distinct categories of robots: specialized robots for specific tasks, semi-humanoid robots for adaptable applications, and fully humanoid robots for complex human interactions. This approach aims to address various needs within the service robotics sector, enhancing operational efficiency and customer experience across multiple industries. As for data collection, the robot leverages anonymized data collection and ISO and IEC certifications to ensure trust and compliance. And maybe powering these humanoids soon, Meta just announced its newest Llama 3.2 release, with two high-capacity AI models as its very first to support image-based reasoning, plus a sneak peek of AI-powered prototype glasses.
But first, these new Llama 3.2 models have received a massive vision understanding upgrade, giving them a new level of access to all kinds of visual data, including charts, graphs, and images. This finally allows for applications like image captioning to describe scenes and extract a new range of details. And aside from its high-capacity models, Llama 3.2 also introduced its lightweight 1B and 3B models for on-device applications, with these smaller models handling tasks like multilingual text generation and tool calling, while still being privacy-focused, so that data remains on the user’s device.
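The tool-calling flow these small on-device models support can be sketched as a simple dispatch loop: the model emits a structured call, and the device runs the named tool locally. The JSON format and the `get_time` tool here are hypothetical stand-ins, not Llama's actual schema:

```python
import json

# Hypothetical on-device tool; stubbed so the example is self-contained
def get_time(timezone: str) -> str:
    return f"12:00 in {timezone}"

TOOLS = {"get_time": get_time}

def dispatch(model_output: str) -> str:
    """Parse a (hypothetical) JSON tool call emitted by the model and
    run the named tool locally -- no data leaves the device."""
    call = json.loads(model_output)
    tool = TOOLS[call["name"]]
    return tool(**call["arguments"])

# Pretend the model emitted this tool call
result = dispatch('{"name": "get_time", "arguments": {"timezone": "UTC"}}')
print(result)  # 12:00 in UTC
```

The privacy benefit mentioned above falls out of this structure: both the model inference and the tool execution happen on the device itself.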
The local processing of these models offers near-instantaneous responses, and Meta reports that the Llama 3.2 vision models perform competitively against leading foundation models like Claude 3 Haiku and GPT-4o mini in image recognition and visual understanding tasks. The 3B model is said to outperform comparable models like Google's Gemma 2 2.6B and Phi-3.5-mini in various language tasks, while the 1B model remains competitive in its class. Now for the twist: Llama 3.2's vision capabilities involve a novel approach to integrating image processing into the existing language model architecture. Meta's team employed adapter weights to connect a pre-trained image encoder with the language model, using cross-attention layers to align image and language representations.
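To make the adapter idea concrete, here is a minimal single-head sketch in NumPy of text tokens cross-attending over image-encoder patch features. All dimensions and weight matrices are hypothetical stand-ins (Meta's real adapter uses multi-head cross-attention layers interleaved with the frozen language model blocks):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(text_hidden, image_feats, Wq, Wk, Wv):
    """Text tokens (queries) attend over image patches (keys/values)."""
    Q = text_hidden @ Wq                      # (text_tokens, d)
    K = image_feats @ Wk                      # (patches, d)
    V = image_feats @ Wv                      # (patches, d_text)
    scores = Q @ K.T / np.sqrt(Q.shape[-1])   # scaled dot-product
    weights = softmax(scores)                 # per-token attention over patches
    attended = weights @ V
    # Residual connection: the original text-only pathway stays intact,
    # which is how text capabilities are preserved when images are absent
    return text_hidden + attended

rng = np.random.default_rng(0)
d_text, d_img, d = 64, 96, 64
text = rng.standard_normal((8, d_text))      # 8 text tokens
image = rng.standard_normal((16, d_img))     # 16 image patches
Wq = rng.standard_normal((d_text, d)) * 0.1  # adapter weights (trained);
Wk = rng.standard_normal((d_img, d)) * 0.1   # encoder and LM stay frozen
Wv = rng.standard_normal((d_img, d_text)) * 0.1
out = cross_attention(text, image, Wq, Wk, Wv)
print(out.shape)  # (8, 64)
```

The key design point the transcript describes is visible here: only the adapter weights are new, so zeroing out the attended term recovers the unmodified text-only model.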
This method allowed the preservation of text-only capabilities while adding visual reasoning skills, and for the lightweight models, Meta utilised pruning and knowledge distillation techniques to create efficient, smaller versions that retain much of the performance of larger models. Moving forward, the company says it's focused on making Llama 3.2 accessible to developers and researchers, working directly with major mobile chip manufacturers like Qualcomm, MediaTek and Arm to ensure compatibility across devices. Additionally, Meta has developed reference implementations for inference, tool use, and retrieval-augmented generation, along with introducing the Llama Stack distribution for simplified deployment across various environments.
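As a rough illustration of the distillation side (Meta hasn't published its exact recipe), the classic objective matches the small student model's temperature-softened output distribution to the larger teacher's:

```python
import numpy as np

def soft_targets(logits, T):
    """Temperature-softened softmax: higher T exposes more of the
    teacher's knowledge about relative class similarities."""
    z = logits / T
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2
    (the standard Hinton-style formulation; illustrative only)."""
    p = soft_targets(teacher_logits, T)
    q = soft_targets(student_logits, T)
    return float((T ** 2) * np.sum(p * (np.log(p) - np.log(q))))

teacher = np.array([[2.0, 1.0, 0.1]])
student_same = teacher.copy()
student_off = np.array([[0.1, 1.0, 2.0]])
print(distillation_loss(student_same, teacher))       # 0.0: perfect match
print(distillation_loss(student_off, teacher) > 0.0)  # True: mismatch penalised
```

Minimising this loss while training the pruned student is what lets a 1B or 3B model recover much of a larger model's behaviour.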
And that’s only the beginning, as Meta also just revealed its AI-powered prototype device named Orion, a pair of ultra-light augmented reality glasses to compete with Apple’s Vision Pro. Weighing less than 100 grams, the Orion glasses are wireless with a wide 70-degree field of view, which Meta claims is currently the largest available in a glasses-like form factor. Orion’s core technology runs on 10 customised silicon chips, designed to efficiently handle AI-driven tasks such as hand and eye tracking, simultaneous localisation and mapping, plus graphics processing. Furthermore, Orion’s display takes a novel approach, using miniature projectors in the frame to shoot light into waveguides embedded with nanoscale 3D structures.
These structures diffract light to create holographic images at various depths, allowing digital content to blend with the real world. Meta also says it’s using innovative materials in Orion’s construction: the lenses are made of silicon carbide, chosen for its optical properties and high refractive index, while the frames are constructed from magnesium, a material known for its heat dissipation capabilities. In terms of user interaction, Orion features voice control plus hand and eye tracking for navigation. Additionally, Meta has developed an electromyography wristband that detects neuromuscular signals, allowing for subtle hand gesture controls. Despite these advancements though, Orion is still just a prototype, with Meta working to reduce costs and scale its manufacturing processes for a release in the coming years.
And finally, Tencent Robotics X-Lab has introduced its latest innovation, which it calls The Five, a cutting-edge residential robot designed to revolutionise human-robot coexistence. In fact, this general-purpose robot presents an interesting leap forward in terms of domestic robotics design. The Five boasts an array of advanced features, seamlessly integrating core capabilities from its predecessors. Its most striking attribute is the novel hybrid design, combining four straight legs with wheels. This ingenious configuration allows The Five to navigate diverse terrains with unprecedented versatility, maintaining the obstacle-crossing prowess of legged robots, while harnessing the efficiency of wheeled locomotion.
Extensive lab testing has demonstrated The Five’s capabilities. The robot exhibits an impressive degree of adaptability, able to travel through typical household environments and even move objects as it goes. The robot’s design equips it to handle complex tasks and engage in natural interactions with humans, and one of The Five’s standout features is its expansive tactile skin, which enhances its sensitivity to the surrounding environment. This advancement, coupled with multi-fingered dexterous hands, enables the robot to perform intricate manipulations with precision and care, bringing a whole new level of dexterity to tasks among home robots.
And its designers say safety remains a top priority in The Five’s design, as Tencent Robotics X-Lab has implemented advanced algorithms and sensors to ensure safe physical interactions between humans and the robot, paving the way for seamless integration into daily life. Overall, Tencent still has much work ahead to perfect a general-purpose android for home use. But as competition heats up around the world, prices for androids will continue to come down, allowing for personal robots to replace personal computers around 2030. Anyways, make sure to like and subscribe to AI News to keep up with the latest developments in the space.
And thanks for watching.