
**Equipping AI Chatbots with Limbs: Overcoming the Challenge**

Robotics startup Covariant is experimenting with a ChatGPT-style chatbot that can control a robotic…

Peter Chen, CEO of the robotics software company Covariant, sits in front of a chatbot interface reminiscent of ChatGPT. “Display the backpack in front of you,” he requests. In response, a robotic arm swings into view, hovering over a bin filled with assorted items: an apple, a pair of socks, a packet of chips, and a second pair of socks.

The chatbot can discuss and manipulate the objects it sees. To demonstrate, Chen has the arm gently pick up the fruit; at WIRED’s suggestion, he then instructs it to move the item to a nearby bin.

This hands-on chatbot is a meaningful step toward giving robots the kind of general, adaptable capabilities that programs like ChatGPT have put on public display. The hope is that AI will eventually make robots far easier to program and capable of taking on a much wider range of tasks.

Chen emphasizes that foundation models, by which he means large, versatile machine-learning models tailored to a specific domain, are pivotal to the future of automation. The robot he demonstrated runs on a Covariant model called RFM-1, short for Robot Foundation Model. It has been trained on large amounts of text, similar to the data used to build ChatGPT, Google’s Gemini, and other chatbots, and additionally on motion data drawn from millions of real-world robot actions.
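
How exactly RFM-1 blends language and motion data is not spelled out here, but a common trick in this family of models is to quantize each continuous robot action into discrete “action tokens” so motion can sit in the same sequence as text. The sketch below is a minimal, hypothetical illustration of that idea, assuming a normalized action range and 256 bins; it is not Covariant’s actual implementation.

```python
# Hypothetical sketch: discretize continuous robot actions into integer
# "action tokens" so they can be interleaved with ordinary text tokens in one
# training sequence. This is NOT Covariant's published recipe for RFM-1.

import numpy as np

NUM_BINS = 256                        # how finely each action dimension is quantized
ACTION_LOW, ACTION_HIGH = -1.0, 1.0   # assumed normalized action range


def actions_to_tokens(actions: np.ndarray) -> np.ndarray:
    """Map continuous actions in [ACTION_LOW, ACTION_HIGH] to integer token ids."""
    clipped = np.clip(actions, ACTION_LOW, ACTION_HIGH)
    scaled = (clipped - ACTION_LOW) / (ACTION_HIGH - ACTION_LOW)  # -> [0, 1]
    return np.minimum((scaled * NUM_BINS).astype(int), NUM_BINS - 1)


def tokens_to_actions(tokens: np.ndarray) -> np.ndarray:
    """Map token ids back to approximate continuous actions (bin centers)."""
    scaled = (tokens + 0.5) / NUM_BINS
    return scaled * (ACTION_HIGH - ACTION_LOW) + ACTION_LOW


if __name__ == "__main__":
    # A made-up 7-DoF arm command: end-effector deltas plus a gripper value.
    action = np.array([0.12, -0.40, 0.05, 0.0, 0.9, -0.33, 1.0])
    tokens = actions_to_tokens(action)
    print("action tokens:", tokens)
    print("reconstructed:", np.round(tokens_to_actions(tokens), 3))
```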

Combining this supplementary data should yield a model that is fluent in both language and physical action, bridging the gap between the two. RFM-1 can not only drive a robot arm but also generate videos depicting various robotic tasks. Chen describes the model’s ability to output such diverse modalities, all of which matter in the multifaceted world of robotics, as “mind-blowing.”

The demo also suggests that the technology can generalize beyond its training data. With further training, a model of this kind could eventually control a humanoid robot, says Pieter Abbeel, Covariant’s cofounder and chief scientist and a pioneer of machine learning for robotics. Abbeel previously worked at OpenAI and, in 2010, led a project at UC Berkeley that trained a robot to fold towels, albeit slowly.

Founded in 2017, Covariant already sells software that uses machine learning to let robot arms handle items in warehouses, although the arms are specialized for particular tasks. Abbeel expects that models like RFM-1 will make robots significantly more adaptable to new tasks, and he likens Covariant’s approach to the way Tesla trains its self-driving systems.

Abbeel and his colleagues at Covariant are far from alone in hoping that large language models like ChatGPT could help transform robotics. Early results from projects such as RFM-1 are promising, but how much data will be needed to give robots much broader capabilities remains uncertain.

One challenge is that robot data is far scarcer than the text, images, and video readily available on the internet, notes Pulkit Agrawal, an AI and robotics researcher at MIT. Efforts to generate more training data are underway, including running robots in simulation and collecting videos of humans demonstrating tasks.
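
As a rough illustration of the simulation route, the sketch below rolls out a simple scripted policy in a toy, invented environment and records observation-action transitions as training data; real pipelines use full physics simulators and far more capable policies, so every name here is hypothetical.

```python
# Hypothetical sketch: generate robot training data by rolling out a scripted
# policy in a toy simulated environment and logging transitions.

import random
from dataclasses import dataclass


@dataclass
class Transition:
    obs: float       # e.g. gripper height above the target
    action: float    # commanded downward motion
    next_obs: float


class ToyReachEnv:
    """A one-dimensional stand-in for a simulator: move a gripper to height 0."""

    def reset(self) -> float:
        self.height = random.uniform(0.5, 1.0)
        return self.height

    def step(self, action: float) -> float:
        self.height = max(0.0, self.height - action)
        return self.height


def collect_dataset(num_episodes: int, steps_per_episode: int) -> list[Transition]:
    env = ToyReachEnv()
    dataset = []
    for _ in range(num_episodes):
        obs = env.reset()
        for _ in range(steps_per_episode):
            action = 0.1 * obs          # simple proportional "demonstration" policy
            next_obs = env.step(action)
            dataset.append(Transition(obs, action, next_obs))
            obs = next_obs
    return dataset


if __name__ == "__main__":
    data = collect_dataset(num_episodes=100, steps_per_episode=20)
    print(f"collected {len(data)} transitions, e.g. {data[0]}")
```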

Google DeepMind, a prominent player in AI research, has been developing its own AI models for robots, such as RT-2 and RT-X, to advance robot learning. Agrawal acknowledges the value of the extensive dataset Covariant has gathered from robot arms deployed with its customers, though he notes that it currently covers a relatively narrow set of warehouse tasks.

Covariant’s work could also help AI models develop a better understanding of real-world physics. Abbeel argues that RFM-1 exhibits a stronger grasp of real-world constraints than OpenAI’s Sora, citing how it handles things like human anatomy and basic physics, although there is still room for improvement.
