Summary
Transcript
In fact, the forearm houses essential hardware for hand movement, including springs and pivots, components that contribute to the robot's stability and precision, though several engineering challenges mean they haven't quite reached human-level dexterity just yet. Nevertheless, the hands of the Optimus Gen 3 are already capable of performing various manufacturing tasks inside Tesla's factories, and the company expects the robot to play an increasingly significant role in automating production in 2025. Looking further ahead, Tesla plans to begin selling these robots next year at a price expected to fall somewhere between $20,000 and $30,000, a release that would mark a pivotal step toward automating everyday tasks in the real world.
And in Japan, another robotics company, MELTIN, also presented extremely impressive robotic dexterity with the MELTANT-α and its successor, the MELTANT-β. Known for its cable-driven actuation mechanism, the MELTANT-α showcased the company's core technology, while the MELTANT-β builds on that foundation with practical enhancements for demanding environments, including protection against dust and sparks. Other key improvements in the MELTANT-β include increased mobility, enhanced haptics for more realistic touch feedback, and greater gripping strength. The model also features omnidirectional movement and improved operability through VR interfaces, resulting in a robotic hand capable of performing complex tool-based tasks.
And it's already doing human jobs, with the first application of the MELTANT-β being to address labor shortages in the construction industry. In fact, the company says the ability to operate the robot from a safe location lowers entry barriers and attracts new recruits. Elderly specialists can also operate these robots remotely, allowing them to keep contributing their expertise and training new workers. Moreover, the MELTANT-β's avatar-switching capability optimizes resource usage, letting operators work across multiple sites with more flexibility and productivity than ever before. And AI isn't just doing robot tasks but desktop tasks too, as OpenAI just revealed plans to unveil a new AI assistant called Operator in January 2025, initially launching as a research preview with an API for developers.
Incredibly, Operator is being designed as a general-purpose computer assistant, though its primary focus for now is browser-based tasks, an objective that aligns with the AI industry's broader push to automate increasingly complex workflows. In fact, OpenAI's CEO Sam Altman has emphasized that AI agents represent the next major phase of AI growth, a strategic shift driven by slower progress in traditional language model development. The development of such AI assistants is a growing trend across major AI labs, with Anthropic having already launched an assistant capable of reading screen content and executing real-time actions.
To match, Microsoft has integrated similar automation features into its Copilot platform, and Google is reportedly developing Project Jarvis, a Chrome-based AI assistant expected to launch with its new Gemini language model at the end of 2024. While there's still no standard definition of agentic AI systems, they essentially operate as smaller programs or prompts that each manage an individual subtask. These systems then coordinate with other assistants, either within a single language model or across different AI systems, with the goal of automating entire workflows through multi-party collaboration. And in a move toward realizing this future, OpenAI has released Swarm on GitHub, an open-source framework that lets developers create and manage systems of multiple assistants.
This illustrates how assistants can transfer control and execute task steps using specific tools, plus it demos the practical application of OpenAI’s assistant concept. And by automating routine tasks for humans, these agentic AI systems are about to reshape how work is conducted across various sectors, as AI agents become more integrated into daily operations, and the potential for increased efficiency and innovation grows. And in another groundbreaking leap for robotics, researchers at TU Wien have developed a wash basin cleaning robot that mimics natural human movements by adapting flexibly to various shapes and surfaces.
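The handoff idea described above can be sketched in plain Python. This is a minimal, self-contained illustration of the pattern, not the actual Swarm API; all names here (`Agent`, `triage_agent`, the routing logic) are invented for this example, and the keyword-matching dispatch stands in for the LLM's tool-choice step.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Agent:
    name: str
    instructions: str
    # Tools the agent may call; a tool returning an Agent signals a handoff.
    tools: dict[str, Callable] = field(default_factory=dict)


def run(agent: Agent, task: str) -> str:
    """Dispatch a task, following handoffs until an agent produces an answer."""
    while True:
        # Toy routing: pick the first tool whose name appears in the task.
        for tool_name, tool in agent.tools.items():
            if tool_name in task:
                result = tool(task)
                if isinstance(result, Agent):  # control transfers to another agent
                    agent = result
                    break
                return result
        else:
            return f"{agent.name}: no tool matched '{task}'"


math_agent = Agent(
    name="math",
    instructions="Handle arithmetic.",
    tools={"add": lambda t: "math: 2 + 2 = 4"},
)

triage_agent = Agent(
    name="triage",
    instructions="Route tasks to specialists.",
    tools={"add": lambda t: math_agent},  # returns an Agent: a handoff, not an answer
)

print(run(triage_agent, "add 2 and 2"))  # → math: 2 + 2 = 4
```

The key design point is that a handoff is just a tool call whose return value is another agent, so control transfer composes with ordinary tool use.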
That's because programming robots for tasks like cleaning uneven surfaces has traditionally involved laboriously coding every motion. But TU Wien's team took a different path, with striking results: by observing a human using a sensor-equipped sponge, the robot learned the nuances of cleaning through imitation. This imitation-learning approach allows the robot to adjust its technique to the basin's contours, applying the right amount of force at the right angle. And while capturing the geometric shape is simple, the real challenge lies in teaching the robot the appropriate movements and pressures for each scenario.
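The core idea of learning force profiles from demonstration can be sketched very simply: record pairs of surface geometry and applied force from a human demo, then interpolate for geometries the robot hasn't seen. This is an illustrative toy, not TU Wien's method, and the demonstration values below are invented.

```python
import bisect

# Hypothetical demonstration data: surface inclination (degrees) -> force (N),
# as might be recorded from a sensor-equipped sponge during a human demo.
demo = [(0.0, 5.0), (15.0, 6.5), (30.0, 8.0), (45.0, 9.0), (60.0, 9.5)]


def predict_force(angle: float) -> float:
    """Linearly interpolate the demonstrated force for a new surface angle."""
    angles = [a for a, _ in demo]
    forces = [f for _, f in demo]
    if angle <= angles[0]:
        return forces[0]
    if angle >= angles[-1]:
        return forces[-1]
    i = bisect.bisect_right(angles, angle)
    a0, a1 = angles[i - 1], angles[i]
    f0, f1 = forces[i - 1], forces[i]
    t = (angle - a0) / (a1 - a0)
    return f0 + t * (f1 - f0)


print(predict_force(22.5))  # halfway between 15° and 30° -> 7.25 N
```

A real system would replace this lookup with a learned model over full motion trajectories, but the generalization step (demonstrated examples in, unseen situations out) is the same.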
But imitation learning isn't just limited to cleaning, as it also holds potential for other surface-treatment tasks, including sanding and polishing. In fact, the robot's learning is powered by a neural network trained on motion primitives, enabling it to apply learned techniques to new scenarios too. Plus, the vision extends to a network of workshop robots sharing their experiences, a concept known as federated learning, in which each robot gains unique insights locally but shares foundational knowledge globally, enhancing their collective capabilities. As a result, TU Wien's washbasin-cleaning robot not only won the Best Application Paper Award at IROS this year, but also serves as an example of the intelligent-automation revolution that's fast approaching.
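The federated-learning vision mentioned above boils down to a simple mechanism: each robot trains on its own data locally, and only model parameters, never raw data, are averaged into a shared global model. A minimal sketch of that averaging step, with invented weight values:

```python
def federated_average(local_weights: list[list[float]]) -> list[float]:
    """Element-wise mean of each robot's locally trained weight vector."""
    n = len(local_weights)
    return [sum(ws) / n for ws in zip(*local_weights)]


# Three workshop robots, each with a locally updated 3-parameter model
robot_a = [0.9, 0.1, 0.4]
robot_b = [1.1, 0.3, 0.2]
robot_c = [1.0, 0.2, 0.3]

global_model = federated_average([robot_a, robot_b, robot_c])
print(global_model)  # approximately [1.0, 0.2, 0.3]
```

In practice the global model would then be sent back to each robot as the starting point for its next round of local training, so insights spread without any robot exposing its local sensor data.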
And finally, Microsoft Research demoed Magentic-One, an AI system that uses multiple specialized agents for complex tasks, in this case completing a real-world transaction on the internet. It works by having a main orchestrator oversee task planning, progress tracking, and problem solving, while four agents handle web browsing, file management, code writing, and code execution, a modular approach that allows easy updates and reduces resource needs. However, tests revealed inefficiencies and unexpected behaviors, like attempting unauthorized interactions with humans. Nonetheless, Microsoft is pushing ahead with developing AI for natural-language computer use.
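The orchestrator-plus-specialists structure described above can be sketched as a dispatch loop. This is a hedged illustration of the pattern, not Microsoft's implementation; the agent names, routing rules, and plan below are all invented.

```python
from typing import Callable

# Four stand-in specialist agents, mirroring the roles in the description
specialists: dict[str, Callable[[str], str]] = {
    "web":  lambda step: f"[web] fetched page for '{step}'",
    "file": lambda step: f"[file] saved '{step}'",
    "code": lambda step: f"[code] wrote script for '{step}'",
    "exec": lambda step: f"[exec] ran '{step}'",
}


def orchestrate(plan: list[tuple[str, str]]) -> list[str]:
    """Walk the plan, dispatch each step to a specialist, record progress."""
    ledger = []  # the orchestrator's record of completed steps
    for agent_name, step in plan:
        result = specialists[agent_name](step)
        ledger.append(result)
    return ledger


plan = [
    ("web", "find product page"),
    ("code", "generate checkout script"),
    ("exec", "run checkout script"),
]

for entry in orchestrate(plan):
    print(entry)
```

The modularity claim follows directly from this shape: swapping in a better web-browsing agent only changes one entry in the specialist table, leaving the orchestrator and the other agents untouched.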
Anyways, like and subscribe and check out these bonus clips. Thanks for watching!