


GPT-4 teaches a robot to spin a pen, and the result is silky smooth!
Recently, GPT-4, which inspired mathematician Terence Tao in chat conversations, started teaching robots how to spin pens.
The project, called Agent Eureka, was jointly developed by NVIDIA, the University of Pennsylvania, the California Institute of Technology, and the University of Texas at Austin. The research combines the capabilities of GPT-4 with the strengths of reinforcement learning, allowing Eureka to design sophisticated reward functions.
The programming capabilities of GPT-4 give Eureka strong reward function design skills. In most tasks, the reward functions Eureka designs on its own outperform those written by human experts. This lets it complete tasks that are difficult for humans, including spinning pens, opening drawers, rolling walnuts in the palm, and even more complex tasks such as throwing and catching a ball or operating scissors.
Although all of this currently runs in a simulation environment, it is already very impressive.
The project has been open sourced, and the project address and paper address have been placed at the end of the article.
Below is a brief summary of the paper's core points.
The paper explores how to use large language models (LLMs) to design and optimize reward functions for reinforcement learning. This is an important topic because a good reward function can greatly improve the performance of a learned policy, but designing such a function is very difficult.
The researchers propose a new algorithm called EUREKA, which uses an LLM to generate and improve reward functions. In testing, EUREKA achieved human-level performance in 29 different reinforcement learning environments and surpassed reward functions designed by human experts in 83% of tasks.
EUREKA also solved complex manipulation tasks that manually designed reward functions had not been able to solve, such as getting a simulated "Shadow Hand" to spin a pen quickly.
In addition, EUREKA provides a new method for generating reward functions that are more effective and more consistent with human expectations.
EUREKA works in three main steps:
1. Environment as context: EUREKA uses the source code of the environment as context to generate an executable reward function.
2. Evolutionary search: EUREKA repeatedly proposes and improves reward functions through evolutionary search.
3. Reward reflection: EUREKA generates text summaries of reward quality from policy training statistics and uses them to improve reward functions automatically and in a targeted way.
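The three steps above can be illustrated with a toy loop. This is only a minimal sketch under strong assumptions, not the project's actual code: `propose_candidates` stands in for the LLM call, `train_and_evaluate` stands in for reinforcement learning training, a "reward function" is reduced to a dictionary of weights, and every name and number here (`TARGET`, `spin_bonus`, `drop_penalty`, and so on) is hypothetical.

```python
import random

# Toy optimum, invented for this demo; a real task has no such known target.
TARGET = {"spin_bonus": 2.0, "drop_penalty": 0.5}

def propose_candidates(env_source, reflection, parent, k, rng):
    """Hypothetical stand-in for the LLM call: perturb the parent's weights.
    A real system would put env_source and reflection into the model's prompt;
    this stub ignores both and mutates at random."""
    return [{name: w + rng.gauss(0.0, 0.3) for name, w in parent.items()}
            for _ in range(k)]

def train_and_evaluate(candidate):
    """Hypothetical stand-in for RL training: score a candidate and
    return per-term statistics alongside the overall fitness."""
    errors = {name: abs(candidate[name] - TARGET[name]) for name in candidate}
    fitness = -sum(e * e for e in errors.values())
    return fitness, errors

def reflect(errors):
    """Reward reflection: turn training statistics into a text summary."""
    return "; ".join(f"{name}: error {err:.2f}"
                     for name, err in sorted(errors.items()))

def eureka_style_search(env_source, iterations=20, k=8, seed=0):
    """Evolutionary search: sample k candidates per round, keep the best."""
    rng = random.Random(seed)
    best = {"spin_bonus": 0.0, "drop_penalty": 0.0}
    best_fitness, errors = train_and_evaluate(best)
    history = [best_fitness]
    for _ in range(iterations):
        reflection = reflect(errors)  # summary of the current best candidate
        candidates = propose_candidates(env_source, reflection, best, k, rng)
        for cand in candidates:
            fitness, cand_errors = train_and_evaluate(cand)
            if fitness > best_fitness:
                best, best_fitness, errors = cand, fitness, cand_errors
        history.append(best_fitness)
    return best, history

if __name__ == "__main__":
    best, history = eureka_style_search("def step(self, action): ...")
    print(best, history[-1])
```

Because only strictly better candidates replace the current best, the recorded fitness never decreases from one round to the next. In the real system the feedback text would steer the next proposals, whereas this stub only shows where that text would enter the loop.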
This research may have a far-reaching impact on reinforcement learning and reward function design, because it provides a new, efficient way to generate and improve reward functions automatically, and in many cases the method outperforms human experts.
Project address: https://www.php.cn/link/e6b738eca0e6792ba8a9cbcba6c1881d
Paper link: https://www.php.cn/link/ce128c3e8f0c0ae4b3e843dc7cbab0f7





