Summary
➡ Researchers have unveiled Bidex, a bi-manual teleoperation system for training robots to replicate human dexterity. It combines motion-capture gloves with teacher arms, yielding a dexterous, low-cost, low-latency, and portable setup. The system costs $7,000, plus $5,000 for the robot hands.
➡ Google DeepMind has introduced CAT4D, a new system that transforms 2D video inputs into dynamic 4D scenes, with applications in video production, gaming, virtual reality, and interactive media. It can also disentangle camera motion from scene dynamics, allowing for more immersive exploration of scenes.
➡ Luma has upgraded its Dream Machine AI video model, making it a full-fledged creative platform. Users can describe their ideas in natural language or upload reference images to guide the platform’s output. The platform also offers features for personalization and collaboration, and it can generate consistent characters from a single image.
➡ OpenAI’s Sora video generation model has been leaked online by a group of artists who participated in its alpha testing. The artists claim that OpenAI used their feedback to refine Sora without offering meaningful compensation. The leak lets users generate 10-second 1080p video clips by tapping directly into OpenAI’s API. The group insists they’re not opposed to AI-generated art; they want fair compensation and ethical practices.
Transcript
And because Apollo is designed with modularity and adaptability in mind, it can be configured for stationary tasks as well as fully mobile tasks that require its legs. To power it all, the robot’s 4-hour runtime is supported by hot-swappable battery packs, letting it remain operational with minimal downtime, while for longer tasks, Apollo can also be tethered or plugged in for continuous operation. But safety remains a priority: the robot’s advanced force control and impact detection systems ensure safe interactions with human co-workers, and Apollo’s LED displays, located on its head, chest, and mouth, provide status updates for more intuitive communication.
But the heart of it all is the Apptronik software suite, which enables point-and-click control for tasks involving payloads of up to 55 pounds. It even allows customizable behavior through features like the perimeter zone, an outer safety zone where Apollo’s behavior can be adjusted dynamically when an object is detected. As for the future, Apptronik envisions broader applications too, including construction, retail, elder care, and much more, with the robot, which reportedly costs $50,000 today, becoming more general-purpose over time as it learns. But training these robots to replicate human dexterity has always required vast amounts of manually collected teleoperation data, until now, as researchers just unveiled Bidex, a new bi-manual teleoperation system designed to address this challenge.
It works by combining motion-capture gloves with teacher arms, positioning Bidex as an extremely dexterous, low-cost, low-latency, and portable solution for training robots with over 50 degrees of freedom. In fact, when compared to existing teleoperation systems like the Vision Pro and SteamVR, Bidex excels at handling complex tasks and produces higher-quality data at faster rates. It has even been demonstrated operating a mobile bi-manual robot in real-world environments, showcasing its capabilities for tasks requiring precision and adaptability. As for cost, the teleoperation system runs $7,000, plus $5,000 for the robot hands, and Bidex is compatible with various robot arms, with its reproducibility making it a promising tool for advancing robotic learning and general-purpose policy training.
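To make the glove-to-robot pipeline concrete, here is a minimal, hypothetical sketch of the kind of retargeting loop a system like Bidex runs: stream joint angles from the motion-capture gloves, map them onto the robot hand’s joint limits, command the hand, and log the pair for later policy training. Every class and function name here is an illustrative stand-in, not Bidex’s actual code.

```python
import time
import numpy as np

# Hypothetical sketch of a glove-to-robot retargeting loop -- all names are
# illustrative stand-ins, not Bidex's actual API or hardware interfaces.

class GloveReader:
    """Stand-in for a motion-capture glove streaming finger joint angles (radians)."""
    def read_joint_angles(self) -> np.ndarray:
        return np.zeros(16)  # e.g., 16 finger joints per hand

class RobotHand:
    """Stand-in for a dexterous robot hand with per-joint position limits."""
    def __init__(self, lower: np.ndarray, upper: np.ndarray):
        self.lower, self.upper = lower, upper
    def command_positions(self, q: np.ndarray) -> None:
        pass  # would send joint targets over the hand's control interface

def retarget(q_human: np.ndarray, hand: RobotHand) -> np.ndarray:
    """Map human joint angles (assumed to lie in [0, pi]) linearly into the
    robot hand's joint range, then clip to respect hardware limits."""
    scaled = hand.lower + (q_human / np.pi) * (hand.upper - hand.lower)
    return np.clip(scaled, hand.lower, hand.upper)

glove = GloveReader()
hand = RobotHand(lower=np.zeros(16), upper=np.full(16, 1.6))
demo_log = []  # (timestamp, human pose, robot command) tuples for policy training

for _ in range(1000):  # low-latency teleoperation loop at roughly 100 Hz
    q_human = glove.read_joint_angles()
    q_robot = retarget(q_human, hand)
    hand.command_positions(q_robot)
    demo_log.append((time.time(), q_human, q_robot))
    time.sleep(0.01)
```

Logging the human pose alongside the robot command is what turns a teleoperation session into training data for the downstream policy.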
But there’s another AI breakthrough you can use today for free in your browser, as Google DeepMind just introduced CAT4D, a new system that transforms 2D video inputs into dynamic 4D scenes. Its secret lies in a cutting-edge multi-view video diffusion model, allowing CAT4D to generate multi-view videos from new perspectives as well as reconstruct them into highly detailed, dynamic 3D environments. This opens up new possibilities in video production, gaming, virtual reality, and interactive media. But another standout feature is CAT4D’s ability to disentangle camera motion from scene dynamics.
By analyzing a simple monocular video input, CAT4D accurately creates multi-view video sequences while treating the dynamic 3D content as deforming 3D Gaussians. Importantly, these sequences allow for more immersive exploration of scenes, offering outputs that represent a fixed viewpoint over time, varying viewpoints at a single moment, or fully dynamic motion across both time and space. And to showcase its capabilities, DeepMind has provided an interactive browser-based viewer, enabling users to explore 4D scenes in real time and highlighting potential applications of CAT4D in industries like film, virtual reality, and beyond.
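One way to picture that disentanglement is as a grid of frames indexed by (viewpoint, time), where each slicing pattern yields one of the three output modes just described. The toy sketch below illustrates the idea with a random array standing in for rendered frames; it is a conceptual aid, not DeepMind’s code.

```python
import numpy as np

# Toy illustration of a (viewpoint, time) frame grid -- not DeepMind's code.
# A 4D scene can be viewed as a grid of rendered frames indexed by camera
# viewpoint and timestamp; each slicing pattern is one of the output modes.
n_views, n_times, H, W = 8, 24, 64, 64
grid = np.random.rand(n_views, n_times, H, W, 3)  # stand-in for rendered frames

# Mode 1: fixed viewpoint, varying time -- an ordinary video from one camera.
fixed_view_video = grid[3, :]            # shape (n_times, H, W, 3)

# Mode 2: varying viewpoint, fixed time -- a "bullet time" sweep of one moment.
frozen_moment_sweep = grid[:, 10]        # shape (n_views, H, W, 3)

# Mode 3: varying both -- a camera path through a scene that is also moving.
view_path = np.linspace(0, n_views - 1, n_times).astype(int)
dynamic_flythrough = grid[view_path, np.arange(n_times)]  # (n_times, H, W, 3)
```

Slicing one axis while holding the other fixed is exactly what separating camera motion from scene dynamics means in practice.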
And in the race to dominate AI video creation, Luma just released a huge upgrade to its Dream Machine AI video model, transforming it into a full-fledged creative platform, but with a twist. Instead of relying on precise technical prompts, users can simply describe their ideas in natural language or upload reference images to guide the platform’s output, making Dream Machine accessible to anyone. For even greater personalization, the platform offers features like multi-image prompting and single-image character references, allowing users to fine-tune their creations in incredible detail, from specific textures and colors to consistent character designs. Dream Machine also introduces new modes to enhance the creative process: a brainstorm mode lets users experiment with different stylistic influences for their imagery and videos, while boards enable teams to collaborate on multiple images and videos in a shared space.
Additionally, concept pills provide pre-designed stylistic presets, making it easier to apply a unified visual aesthetic across projects. But one of the platform’s standout features is its ability to generate consistent characters from a single image, finally allowing users to animate entire storylines with the same character. At the heart of Dream Machine’s evolution is its new image-generation foundation model, Luma Photon, which produces high-quality still images from text prompts and features embedded text generation; Luma claims Photon is eight times faster and more cost-efficient than comparable systems. Developers can also access Photon’s capabilities through the Luma API, which supports text-to-image, text-to-video, and image-to-video transformations, with Luma emphasizing user privacy as it scales the API into more products soon.
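For developers, a text-to-video request against the Luma API might look like the sketch below. The endpoint path, payload fields, and polling pattern shown are assumptions about the public Dream Machine API, so treat them as placeholders and defer to Luma’s official documentation.

```python
import os
import time
import requests

# Hedged sketch of a text-to-video request against the Luma API.
# The base URL, payload fields, and response fields below are assumptions;
# consult Luma's official documentation for the exact interface.
API_BASE = "https://api.lumalabs.ai/dream-machine/v1"  # assumed base URL
headers = {"Authorization": f"Bearer {os.environ['LUMA_API_KEY']}"}

# Kick off a generation from a natural-language prompt.
resp = requests.post(
    f"{API_BASE}/generations",
    headers=headers,
    json={"prompt": "a paper boat drifting down a rain-soaked street at dusk"},
)
resp.raise_for_status()
generation = resp.json()

# Poll until the job finishes (state and id field names are assumptions).
while generation.get("state") not in ("completed", "failed"):
    time.sleep(5)
    generation = requests.get(
        f"{API_BASE}/generations/{generation['id']}", headers=headers
    ).json()

print(generation)  # on success, expected to include a URL to the rendered video
```

The same create-then-poll pattern would apply to the image-to-video and text-to-image endpoints, swapping in a reference-image field where needed.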
Meanwhile, OpenAI’s cutting-edge Sora video generation model has been leaked online by a coalition of artists who participated in its alpha testing. These artists, who describe themselves as early testers, red teamers, and creative partners, allege that OpenAI relied on their feedback to refine Sora without offering meaningful compensation. OpenAI, in contrast, asserts that participation was voluntary and came with benefits such as free access, grants, and event invitations, with artists also being bound by confidentiality agreements. The group also took issue with OpenAI’s requirement to approve created videos before they could be shared publicly.
So in a dramatic protest, these artists posted an interface to Sora on Hugging Face, an AI model-sharing platform, under the username PR-Puppets, with the leak letting users generate 10-second 1080p video clips by tapping directly into OpenAI’s API. Furthermore, users who analyzed the leaked code found references to a “Sora turbo” variant, a faster and more capable version of the model, and the code suggests Sora includes features such as preset styles, including a “natural” mode, and advanced composition tools. The leak also hints at several undisclosed capabilities, but despite it all, the group insists they’re not opposed to AI-generated art, stating they simply want fair compensation and ethical practices.
They also cited open-source alternatives like CogVideoX and Mochi 1 as potential options for artists seeking tools without such oversight. However, they acknowledged these tools often require technical expertise that many artists lack, and while OpenAI has yet to comment directly on the leak, it maintains that its artist program was designed around voluntary participation.