After 2 months, the humanoid robot Walker S can fold clothes
Jiqizhixin Report
Editor: Wu Xin
A Chinese humanoid robot, paired with a large model, has for the first time completed the manipulation of complex flexible materials, such as folding clothes.
Since the unveiling of Figure 01, which incorporates OpenAI's multimodal large model, the progress of Chinese peers has been attracting attention.
Just yesterday, UBTECH, known as "China's first humanoid robot stock", released the first demo of its humanoid robot Walker S after deep integration with Baidu's Wenxin large model, showing some interesting new capabilities.
This is what Walker S looks like now, empowered by Baidu's Wenxin large model.
Like Figure 01, Walker S does not move around, but stands behind a desk to complete a series of tasks. It can follow human commands and fold clothes.
After it completes the task, you can keep chatting with it. For example, you can ask, "What should I wear with this black top?" The robot still remembers that you are going on a business trip and recommends pairing it with dark pants, which are more suitable for formal occasions.
It can also place the various switches on the table onto plates.
Even when it is disturbed, for example when a switch it has already placed is thrown back onto the table, or the socket it is about to reach for is moved away, Walker S can adjust its working state in real time and complete the placement task according to the new situation.
In February, Walker S already demonstrated multi-modal perception and motion control capabilities during practical training at a new energy vehicle factory.
This time, through deep integration with the Wenxin large model, Walker S's cognitive and control capabilities have reached a new level. It has not only gained advanced intention understanding and fine-grained task planning capabilities, but has also, for the first time, completed a complex flexible material manipulation task such as folding clothes.
The Wenxin large model is Baidu's industrial-grade, knowledge-enhanced large model. It offers cross-modal and cross-language deep semantic understanding and generation, as well as knowledge reasoning, task planning and other capabilities. With these capabilities transferred to a humanoid robot, the robot can analyze and understand attributes of clothing such as material, shape and wrinkles much as a human would, and infer the best method and sequence for folding based on past experience. While actually folding the clothes, the robot analyzes changes in their state in real time and adjusts its action strategy accordingly.
In the object sorting task with interference, Walker S also made full use of the collaborative advantages of "AI large model + robot". First, the on-device multimodal perception model obtains the spatial position and semantic information of each object, and this information is then handed over to the large model for processing. The large model, with its strong task decomposition and logical reasoning capabilities, quickly finds the optimal task plan and execution path for Walker S. Walker S then maps this plan onto the actual control of its robotic arms and dexterous hands, and finally completes the entire set of complex tasks.
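The perceive, plan and execute loop described above can be sketched roughly as follows. This is only an illustrative toy, not UBTECH's or Baidu's implementation: every name here (DetectedObject, plan_tasks, TableWorld, run_sorting) is hypothetical, the planner is a simple nearest-first rule standing in for the large model, and the "world" is a simulation standing in for cameras and arm controllers.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DetectedObject:
    """Hypothetical perception output: semantic label plus table position."""
    label: str
    position: tuple   # (x, y) in metres, relative to the robot
    on_plate: bool    # True once the object sits on the target plate


def plan_tasks(objects):
    """Stand-in for the large-model planner (an assumption, not the real one):
    one pick-and-place step per object not yet on the plate, nearest first."""
    pending = [o for o in objects if not o.on_plate]
    pending.sort(key=lambda o: o.position[0] ** 2 + o.position[1] ** 2)
    return [("pick_and_place", o.label) for o in pending]


class TableWorld:
    """Toy simulation standing in for perception and arm control."""

    def __init__(self, objects):
        self.objects = {o.label: o for o in objects}

    def perceive(self):
        return list(self.objects.values())

    def execute(self, step):
        _, label = step
        self.objects[label] = replace(self.objects[label], on_plate=True)

    def disturb(self, label):
        # Simulates a person throwing a placed object back onto the table.
        self.objects[label] = replace(self.objects[label], on_plate=False)


def run_sorting(perceive, execute, max_cycles=20):
    """Re-perceive and re-plan before every step, so a disturbance is
    picked up by the next plan instead of breaking a fixed script."""
    log = []
    for _ in range(max_cycles):
        plan = plan_tasks(perceive())
        if not plan:
            break
        execute(plan[0])
        log.append(plan[0])
    return log
```

Because the plan is rebuilt from a fresh observation before each step, calling disturb on an already placed object simply makes it reappear in the next plan. That is one simple way to get the kind of real-time adjustment described in the demo, under the assumptions stated above.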
This is also the first demonstration of such capabilities among Chinese peers, and its innovative application and implementation difficulty place it in the first tier of the industry globally. "In many demonstrations, including Figure's cooperation with OpenAI and our cooperation with Baidu, end-to-end operation can now be achieved," UBTECH management told China Business News at last night's earnings review and outlook meeting.
"We use Baidu's large model for task decomposition, natural language understanding, and logical sequencing. Combined with the on-device multimodal large model that the company trained and built on open-source models last year, we believe that as competition in the humanoid robot market grows increasingly fierce, only strong alliances can achieve 1 + 1 > 2," UBTECH management said in explaining the cooperation. "Abroad, Tesla has its own large model capabilities, and there is also the combination of OpenAI, NVIDIA and Figure. We can see that cooperation can provide strong technical support for the deployment of humanoid robots."
However, comparing it with OpenAI's videos, we found that there is still a gap between the empowered Walker S and Figure 01.
The most obvious difference is the speed of movement. In terms of instruction content, the instructions Walker S receives are usually relatively clear and specific, while Figure 01 can use common-sense reasoning to turn more abstract instructions into reasonable, feasible concrete operations.
In addition, Figure 01 can chat while working (in particular, explaining its own operations), has short-term memory, and can reasonably plan its current actions based on earlier parts of the conversation.
As competition in generative AI becomes increasingly fierce, and the research focus extends from long text and multimodality to embodied intelligence, we have reason to believe that future humanoid robots will no longer be limited to perceiving static data, but will be able to move freely and interact with the environment in a virtual or even real three-dimensional world. This also marks a major leap for AI from simple machine learning to the execution of complex human-like tasks.
In fact, the humanoid robot sector has been extremely hot over the past six months, with prototypes frequently unveiled in China and abroad and startups actively raising funds. In February, UBTECH released a video of Walker S in trial training at NIO's new energy vehicle factory, where the robot smoothly completed seat belt inspection, vehicle logo application and other tasks. UBTECH's share price also surged 200% in two days in early March.
However, humanoid robots worldwide are still at the pilot stage, and scaling up production will take time. After all, there is a big difference between a demo and real-world application, and the latter must take into account a range of factors such as reliability, stability and cost. UBTECH stated that combining large AI models with humanoid robots will greatly improve the robots' intelligence and their adaptability to tasks across multiple scenarios, and will accelerate their industrialization. Founder Zhou Jian has also said publicly that he hopes to have the first batch of humanoid robots deployed in factories and tested by the end of this year, in preparation for the large-scale takeoff of humanoid robots in 2025. In addition, by the end of this year, UBTECH plans to launch its first-generation emotional companion humanoid robot for the home. The robot will be equipped with a large model, and will be able to interact with users and form short-term and long-term memories.
Reference link
https://www.stcn.com/article/detail/1164967.html