Jiqizhixin (Machine Heart) Report
Editor: Wu Xin
For the first time, a Chinese humanoid robot has teamed up with a large model to manipulate complex flexible materials, such as folding clothes.
Since the unveiling of Figure 01, which incorporates OpenAI's multi-modal large model, the progress of Chinese peers has been attracting attention.
Just yesterday, UBTECH, the "first humanoid robot stock in China", released the first demo of its humanoid robot Walker S after deep integration with Baidu's Wenxin large model, showing some interesting new features.
Now, powered by the capabilities of Baidu's Wenxin large model, Walker S looks like this.
Like Figure 01, Walker S does not move around, but stands behind a desk to complete a series of tasks. It can follow human commands and fold clothes.
After it completes the task, you can keep chatting with it. Ask, for example, "What should I wear with this black top?" and the robot still remembers that you are going on a business trip, so it recommends dark pants, which are more suitable for formal occasions.
It can also sort the various switches on the table into plates.
Even when it is disturbed, for example when a switch it has already placed is thrown back onto the table, or the socket it is about to grasp is moved away, Walker S can adjust its working state in real time and complete the placement task according to the new situation.
In February, Walker S already demonstrated multi-modal perception and motion control capabilities during practical training at a new energy vehicle factory.
This time, through in-depth integration with the Wenxin large model, Walker S's cognitive and control capabilities have reached a new level. It has not only gained advanced intention understanding and fine-grained task planning capabilities, but has also completed, for the first time, a complex flexible-material manipulation task such as folding clothes.
The Wenxin large model is Baidu's industrial-grade, knowledge-enhanced large model. It offers cross-modal and cross-language deep semantic understanding and generation, as well as knowledge reasoning, task planning and other capabilities. With these capabilities transferred to a humanoid robot, the robot can analyze and understand the material, shape, wrinkles and other attributes of clothing much as a human would, and deduce the best method and sequence for folding based on past experience. While actually folding the clothes, the robot analyzes changes in the state of the garment in real time and adjusts its action strategy accordingly.
In the sorting task with object interference, Walker S also made full use of the collaborative advantages of "AI large model + robot". First, the on-device multi-modal perception model obtains the spatial position and semantic information of each object, and this information is then handed to the large model for processing. The large model, with its strong task decomposition and logical reasoning capabilities, quickly works out the optimal task plan and execution path for Walker S. Walker S then maps this plan onto the actual control of its robotic arms and dexterous hands, and finally completes the entire set of complex tasks.
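The perceive, plan and act loop described above can be illustrated with a toy sketch. This is not UBTECH's or Baidu's implementation: every name here (`DetectedObject`, `perceive`, `plan`, `run`) is hypothetical, the "large model" is replaced by a trivial rule, and the scene is a plain dictionary. The only point it shows is that re-perceiving after each action lets a disturbance trigger replanning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectedObject:
    label: str      # semantic label from the (hypothetical) perception model
    location: str   # "table" or "plate"


def perceive(scene):
    """Stand-in for on-device multi-modal perception: snapshot the scene."""
    return [DetectedObject(label, loc) for label, loc in sorted(scene.items())]


def plan(objects):
    """Stand-in for the large model's task decomposition:
    one pick-and-place step per object still on the table."""
    return [("pick_and_place", obj.label) for obj in objects if obj.location == "table"]


def run(scene, disturbances=None, max_steps=20):
    """Perceive, plan, execute one step, then perceive again,
    so that a disturbance leads to a fresh plan."""
    disturbances = dict(disturbances or {})
    executed = []
    for step_index in range(max_steps):
        steps = plan(perceive(scene))
        if not steps:
            break  # nothing left on the table
        _, label = steps[0]
        scene[label] = "plate"  # stand-in for arm and dexterous-hand control
        executed.append(label)
        # A disturbance scheduled after this step puts an object back on the table.
        if step_index in disturbances:
            scene[disturbances.pop(step_index)] = "table"
    return executed
```

For instance, calling `run({"switch_a": "table", "switch_b": "table"}, disturbances={1: "switch_a"})` simulates someone throwing `switch_a` back onto the table after the second placement; the loop then places it a second time instead of stopping.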
This is also the first demonstration of such capabilities among Chinese peers, and both its innovative application and its implementation difficulty put it in the first echelon of the industry worldwide. "In many demonstrations, including Figure's cooperation with OpenAI and our cooperation with Baidu, end-to-end operation can now be achieved," UBTECH management told China Business News at last night's performance review and outlook meeting.
"We use Baidu's large model for task decomposition, natural language understanding, and logical sequencing, alongside the on-device multi-modal large model that the company built last year by training on open-source models. We believe that as competition in the humanoid robot market grows increasingly fierce, only strong alliances can achieve 1 + 1 > 2," UBTECH management said when explaining the cooperation. "Abroad, Tesla has its own large model capabilities, and there is the combination of OpenAI, NVIDIA and Figure. We can see that cooperation provides strong technical support for putting humanoid robots into practice."
However, comparing this demo with the Figure 01 video made with OpenAI, we found that there is still a gap between the empowered Walker S and Figure 01.
The most obvious difference is the speed of movement. In terms of instruction content, the instructions Walker S receives are usually relatively clear and specific, while Figure 01 can turn more abstract instructions into reasonable, feasible operations through common-sense reasoning.
In addition, Figure 01 can chat while working (in particular, explaining its own operations), has short-term memory, and can reasonably plan its current actions based on the content of earlier conversation.
As competition in generative AI becomes increasingly fierce, and the research focus extends from long text and multi-modality to embodied intelligence, we have reason to believe that future humanoid robots will no longer be limited to perceiving static data, but will be able to move freely and interact with the environment in a virtual or even real three-dimensional world. This also marks a major leap for AI, from simple machine learning to the execution of complex, human-like tasks.
In fact, the humanoid robot sector has been extremely hot over the past six months, with prototypes frequently unveiled in China and abroad and startups actively raising funds. In February, UBTECH released a video of Walker S on trial at NIO's new energy vehicle factory, where the robot smoothly completed seat belt inspection, vehicle logo application and other tasks. UBTECH's share price also surged 200% in two days in early March.
However, humanoid robots worldwide are still at the pilot stage, and scaling up volume will take time. After all, there is a big difference between a demo and a real application, and the latter must take into account a series of factors such as reliability, stability, and cost. UBTECH stated that combining large AI models with humanoid robots will greatly improve the robots' intelligence and their adaptability to multi-scenario tasks, and accelerate their industrialization. Founder Zhou Jian has also said publicly that he hopes the first batch of humanoid robots will have passed testing in factories by the end of this year, in preparation for a large-scale boom in humanoid robots in 2025. In addition, by the end of this year, UBTECH plans to launch its first-generation emotional companion humanoid robot for the home. The robot will be equipped with a large model, will be able to interact with users, and will form short-term and long-term memories.
Reference link
https://www.stcn.com/article/detail/1164967.html
THE END
For reprints, please contact this official account for authorization.
To submit articles or seek coverage: content@jiqizhixin.com