ProAgent: Intelligent agents led by OpenAI liberate manpower, released by Tsinghua University and other universities

Nov 14, 2023 pm 09:37 PM

  • Project address: https://github.com/OpenBMB/ProAgent
  • Paper address: https://github.com/OpenBMB/ProAgent/blob/main/paper/paper.pdf

Automation has been a major driving force throughout the history of human technology, helping people free themselves from complex, dangerous, and tedious work. From waterwheel irrigation in the early agricultural era to steam engines in the industrial era, humans have constantly pursued more advanced automation technologies to liberate themselves from heavy labor.

With the arrival of the information age, software has become the basis for information processing, storage and communication, and an inseparable part of human production and life. This gave rise to Robotic Process Automation (RPA) technology, which coordinates multiple pieces of software into a fixed workflow through manually written rules and executes it efficiently by interacting with the software the way a human user would.

Figure 1 Comparison of Robotic Process Automation (RPA) and Agentic Process Automation (APA)

RPA uses software robots, or "bots", to simulate and perform repetitive, rule-based tasks, freeing up human resources and improving efficiency. Its range of application is very wide: many enterprises (in banking, insurance, manufacturing, retail and other industries) use RPA robots to automate routine and tedious tasks such as data entry, data extraction, and data processing. By automating these tasks, RPA can significantly reduce error rates and run 24/7, thereby improving business reliability and responsiveness.

According to market research, the RPA market is growing rapidly. Gartner predicts that global RPA market revenue will reach US$3.3 billion in 2023, a growth rate of 17.5%. This indicates strong enterprise demand for, and acceptance of, RPA.

However, RPA can only replace simple, mechanical human work; complex processes still rely on manual labor:

  1. Writing an RPA workflow itself requires heavy human labor and is costly.
  2. Complex tasks are highly flexible and usually involve dynamic decision-making, which is difficult to express as fixed rules.

Figure 2 Comparison of efficiency and intelligence between RPA and APA

Fortunately, the recent emergence of Large Language Model based Agents (LLM-based Agents) in the field of AI may open new possibilities for automation technology. Can the flexibility of agent technology be brought into the RPA field to further reduce human involvement?

The team's research explores a new automation paradigm for the era of large-model agents, "Agentic Process Automation" (APA). Compared with traditional RPA, in the APA paradigm the agent autonomously constructs the workflow according to human needs. It also identifies the parts of those needs that require dynamic decision-making and orchestrates them into the workflow automatically; when execution reaches such a part, the agent actively takes over to make the corresponding complex decision.

To explore the possibilities of APA, this work implements an automation agent, ProAgent, which receives human instructions and builds workflows by generating code. It also introduces DataAgent and ControlAgent into the workflow to handle complex data processing and logical control. The ProAgent research demonstrates the feasibility of APA in the era of large-model agents and reveals new possibilities for automation technology in the LLM era.

Method introduction

In RPA, a workflow is a graph composed of tool calls: nodes represent atomic tool calls (such as Gmail, Twitter, Google Sheets), while edges represent the logical order of execution (sequence, branch, loop). A workflow usually encodes all prior knowledge about a task or a class of tasks, including the solution path and the exception-handling logic. As a result, a fixed, hand-written workflow is usually very stable, thorough and efficient.
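
The graph view of a workflow can be sketched in a few lines of Python. This is only a minimal illustration, not how any RPA platform stores workflows: the `Workflow` class, the node names and the `urgent` field are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Workflow:
    """Toy workflow graph: nodes are tool calls, edges carry optional conditions."""
    nodes: dict = field(default_factory=dict)   # node name -> tool name
    edges: dict = field(default_factory=dict)   # node name -> [(condition, target)]

    def add_node(self, name: str, tool: str) -> None:
        self.nodes[name] = tool
        self.edges.setdefault(name, [])

    def add_edge(self, src: str, dst: str,
                 condition: Optional[Callable[[dict], bool]] = None) -> None:
        # An edge without a condition is a plain sequential connection.
        self.edges[src].append((condition, dst))

    def successors(self, name: str, data: dict) -> list:
        # Follow every outgoing edge whose condition holds for this data.
        return [dst for cond, dst in self.edges[name]
                if cond is None or cond(data)]


# Hypothetical three-node workflow with one branch.
wf = Workflow()
wf.add_node("read_rows", "Google Sheets")
wf.add_node("send_mail", "Gmail")
wf.add_node("post_update", "Twitter")
wf.add_edge("read_rows", "send_mail", lambda row: row["urgent"])
wf.add_edge("read_rows", "post_update", lambda row: not row["urgent"])

print(wf.successors("read_rows", {"urgent": True}))
```

Every branch condition here is a hand-written rule, which is exactly the part that becomes hard to maintain when the decision is not easily expressed as a rule.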

Figure 3 Example of the Agentic Workflow Description Language

Because LLMs are pre-trained on code data and have acquired strong coding capabilities, ProAgent is built on a code-based Agentic Workflow Description Language. This language uses JSON to organize and manage the data in a workflow and Python syntax to express its control logic: jumps, loops and other control flow are written directly in Python, while each tool call in the workflow is encapsulated as a Python function. For ProAgent, the workflow construction task thus becomes a code generation task. On receiving human instructions, ProAgent writes the corresponding Agentic Workflow Description Language code, thereby automating workflow construction.
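
To make the idea concrete, here is a small sketch in the spirit of such a language: JSON-style dictionaries carry the data, a tool call is wrapped in a Python function, and ordinary Python expresses the loop and the branch. It is not the syntax used in the paper; `FAKE_SHEET`, `read_sheet` and the field names are hypothetical stand-ins, and only the `mainWorkflow` name comes from the article.

```python
import json

# Hypothetical in-memory stand-in for a real tool integration such as Google Sheets.
FAKE_SHEET = [
    {"line": "A", "profit": 120},
    {"line": "B", "profit": -30},
]


def read_sheet(params: dict) -> list:
    """Tool call wrapped as a Python function; input and output are JSON-style data."""
    limit = params.get("limit", len(FAKE_SHEET))
    return FAKE_SHEET[:limit]


def mainWorkflow(trigger_input: dict) -> list:
    """Entry point: control flow is plain Python, data stays JSON-serializable."""
    rows = read_sheet({"limit": trigger_input["limit"]})
    losing = []
    for row in rows:             # loop expressed in Python syntax
        if row["profit"] < 0:    # branch expressed in Python syntax
            losing.append(row["line"])
    return losing


print(json.dumps(mainWorkflow({"limit": 2})))
```

Because the whole workflow is ordinary code, generating it is a task a code-capable model is already suited to.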

Figure 4 Example of the Agentic Workflow Description Language combining DataAgent and ControlAgent

Complex real-world tasks usually involve dynamic decision-making, and plain Python-style control rules and JSON-style data organization fall short when requirements are flexible. This is where agents need to be introduced. The work therefore defines two further agent operations:

1. DataAgent: For a complex data processing requirement, the processing is described in natural language when the workflow is built. At execution time a DataAgent is initialized, which autonomously completes the data processing task based on that natural language description.

2. ControlAgent: For control logic that is difficult to express as rules, the logic is described in natural language when the workflow is built. At runtime a ControlAgent is initialized, which autonomously selects the branch of the workflow to execute next based on that natural language description.
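
The two agent operations can be pictured as thin wrappers around a language model. The sketch below is an assumption about their shape, not ProAgent's actual classes: the constructor arguments, the `run` and `choose` methods, and the ticket example are hypothetical, and `stub_llm` is a deterministic stand-in so the code runs without a model.

```python
from typing import Callable, Sequence

# Hypothetical model interface: prompt text in, answer text out.
LLM = Callable[[str], str]


class DataAgent:
    """Carries a natural-language task and applies it to data at execution time."""

    def __init__(self, task: str, llm: LLM):
        self.task = task
        self.llm = llm

    def run(self, data: dict) -> str:
        return self.llm(f"Task: {self.task}\nInput: {data}")


class ControlAgent:
    """Carries a natural-language rule and picks one of the allowed branches."""

    def __init__(self, instruction: str, branches: Sequence[str], llm: LLM):
        self.instruction = instruction
        self.branches = list(branches)
        self.llm = llm

    def choose(self, data: dict) -> str:
        prompt = f"{self.instruction}\nOptions: {self.branches}\nInput: {data}"
        answer = self.llm(prompt).strip()
        if answer not in self.branches:
            # Refuse to continue on an answer that is not a known branch.
            raise ValueError(f"unexpected branch {answer!r}")
        return answer


def stub_llm(prompt: str) -> str:
    # Deterministic stand-in for a model call (assumption for this sketch).
    return "urgent" if "outage" in prompt else "routine"


router = ControlAgent("Decide whether the ticket is urgent or routine",
                      ["urgent", "routine"], stub_llm)
branch = router.choose({"ticket": "Production outage in region A"})
print(branch)
```

Validating the answer against the list of branches keeps the surrounding workflow deterministic even though the decision itself is delegated to a model.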

ProAgent builds the workflow step by step in the ReAct style, using four workflow construction operations:

  1. Action Define: Decide what tools to add to the workflow.
  2. Action Implement: Convert the input/output parameters of the tool into a JSON structure, and encapsulate the call of the tool into a Python function.
  3. Workflow Implement: Define a mainWorkflow function to organize the logic control and data processing of the entire workflow.
  4. Task Submit: When ProAgent completes building the workflow, this operation identifies the end of the build process.
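
The four operations above can be sketched as methods on a builder object that is driven one step at a time. This is an illustrative assumption rather than ProAgent's implementation: the `WorkflowBuilder` class and its method names are hypothetical, and the scripted `decisions` list stands in for the choices a model would make at each step.

```python
class WorkflowBuilder:
    """Holds the partially built workflow. All names here are illustrative."""

    def __init__(self):
        self.tools = {}          # action name -> tool it wraps
        self.action_code = {}    # action name -> Python source of the wrapper
        self.main_code = None    # Python source of mainWorkflow
        self.submitted = False

    def action_define(self, name, tool):
        self.tools[name] = tool

    def action_implement(self, name, code):
        if name not in self.tools:
            raise KeyError(f"action {name!r} was never defined")
        self.action_code[name] = code

    def workflow_implement(self, code):
        self.main_code = code

    def task_submit(self):
        if self.main_code is None:
            raise RuntimeError("mainWorkflow has not been implemented")
        self.submitted = True


def build(decisions):
    """Apply one construction operation per step until the task is submitted."""
    builder = WorkflowBuilder()
    for operation, arguments in decisions:
        getattr(builder, operation)(**arguments)
        if builder.submitted:
            break
    return builder


# Scripted decisions stand in for what the model would choose at each step.
decisions = [
    ("action_define", {"name": "read_rows", "tool": "Google Sheets"}),
    ("action_implement", {"name": "read_rows",
                          "code": "def read_rows(params): ..."}),
    ("workflow_implement", {"code": "def mainWorkflow(trigger_input): ..."}),
    ("task_submit", {}),
]
builder = build(decisions)
print(builder.submitted, sorted(builder.tools))
```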

Figure 5 Example of the ProAgent workflow construction process

In addition, several techniques are introduced to improve ProAgent's performance:

  1. Testing-on-Constructing: During construction, ProAgent tests the workflow after each modification to ensure its correctness.
  2. Function Calling: All operations of workflow construction are encapsulated into GPT-4 Functions, thereby improving control over the workflow construction process.
  3. Chain-of-Thought: When writing workflow code, ProAgent must provide a comment and a plan for each function, which improves the quality of workflow construction.

The workflow execution process is based on the Python interpreter. Given a workflow, its mainWorkflow function serves as the entry point that starts execution. Execution follows the rules of Python code, that is, statements run in order, line by line. Once the mainWorkflow function returns, the workflow has finished executing successfully.
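
A minimal sketch of interpreter-based execution, assuming the workflow arrives as a string of Python source: the source is loaded into a fresh namespace and `mainWorkflow` is called as the entry point. The `run_workflow` helper and the `double` action are hypothetical, and this is not ProAgent's actual runner.

```python
# Hypothetical workflow source, as a code-generating agent might have produced it.
WORKFLOW_SOURCE = '''
def double(payload):
    return {"value": payload["value"] * 2}

def mainWorkflow(trigger_input):
    return double(trigger_input)
'''


def run_workflow(source: str, trigger_input: dict):
    """Define the workflow's functions, then call mainWorkflow as the entry point."""
    namespace = {}
    exec(source, namespace)                         # executes the definitions in order
    return namespace["mainWorkflow"](trigger_input)


result = run_workflow(WORKFLOW_SOURCE, {"value": 21})
print(result)
```

Running generated code with `exec` gives it the full power of the interpreter, so a real deployment would presumably isolate this step.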

Feasibility Verification

To verify the feasibility of Agentic Process Automation, this research implements ProAgent with OpenAI GPT-4 as the base model and the open-source RPA platform n8n as the carrier. We also designed a task that requires both flexibility and efficiency. It is a typical business scenario: extract the profit data of various business lines from Google Sheets and decide the follow-up action based on whether each line is 2B or 2C. If the business line is 2C, a message is sent to a Slack channel. If it is 2B, an email containing an assessment of the business line and a brief profitability overview is sent to the respective manager.

Figure 6 Task Instruction Display

This task has three notable properties. First, it is repetitive: the same process must be applied to multiple business lines. Second, distinguishing whether a business line is 2C or 2B is very difficult and requires dynamic decision-making by an agent to determine the subsequent workflow. Finally, writing the evaluation email for a business line requires a certain amount of intelligence, so an agent must be involved.

For this task, ProAgent generated a workflow containing four atomic operations, one DataAgent and one ControlAgent. The overall process is roughly as shown in the figure below:

Figure 7 ProAgent workflow construction process display

As can be seen, ProAgent completes the workflow construction process by writing code automatically, without manual intervention. Where it must be determined whether a business line is 2B or 2C, ProAgent introduces a ControlAgent whose prompt is set to "Decide Whether the business line is toC or toB". When the business line is 2B, ProAgent also introduces a DataAgent whose task is set to "Write an email of the business line of profit, together with your suggestion", thereby using the agent's intelligence to write an email tailored to the actual situation of each business line.

Once the workflow has been written and fixed, it automatically branches into different logic depending on the data, so the data is processed efficiently.

Figure 8 ProAgent workflow execution process display

When processing data for a 2C business line, the ControlAgent determines the type of the business line from its description and chooses the Slack tool for communication. When processing data for a 2B business line, the DataAgent composes an email and sends it to the corresponding manager's mailbox.
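
The execution flow of this case study can be approximated with stubs. The sketch below is not the workflow ProAgent generated: the sample rows, the tool functions and the keyword-based `control_agent` are hypothetical stand-ins that replace Google Sheets, Slack, Gmail and GPT-4 so the code runs on its own. Only the two prompts are taken from the article.

```python
# Hypothetical sample data standing in for the Google Sheets content.
ROWS = [
    {"line": "Mobile game", "description": "sold to individual consumers",
     "profit": 50, "manager": "a@example.com"},
    {"line": "Cloud API", "description": "sold to enterprise customers",
     "profit": 80, "manager": "b@example.com"},
]

sent_slack = []   # records what the stubbed Slack tool received
sent_email = []   # records what the stubbed email tool received


def read_sheet() -> list:
    return ROWS


def send_slack(text: str) -> None:
    sent_slack.append(text)


def send_email(to: str, body: str) -> None:
    sent_email.append((to, body))


def control_agent(prompt: str, row: dict) -> str:
    # Stub: a real ControlAgent would ask the model; this uses a keyword check.
    return "toC" if "consumer" in row["description"] else "toB"


def data_agent(prompt: str, row: dict) -> str:
    # Stub: a real DataAgent would have the model write the email body.
    return f"Business line {row['line']} reported a profit of {row['profit']}."


def mainWorkflow() -> None:
    for row in read_sheet():
        kind = control_agent("Decide Whether the business line is toC or toB", row)
        if kind == "toC":
            send_slack(f"{row['line']}: profit {row['profit']}")
        else:
            body = data_agent(
                "Write an email of the business line of profit, "
                "together with your suggestion", row)
            send_email(row["manager"], body)


mainWorkflow()
print(sent_slack)
print(sent_email)
```

The fixed loop and branch keep the repetitive part cheap, while the two agent calls are the only places where a model would be consulted.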

Summary

This study proposes a new automation paradigm suited to the era of large models: Agentic Process Automation. Compared with traditional Robotic Process Automation, Agentic Process Automation automates workflow construction and also automates dynamic decisions during workflow execution. The research further implements ProAgent and experimentally demonstrates the feasibility and potential of large-model agents in automation. We believe that large-model agent technology will help humans achieve a higher level of automation and free themselves from heavy labor.

The research team has carried out a number of projects on large-model agents, including:

  • XAgent: a powerful large-model agent application framework that can decompose complex tasks on its own and execute them efficiently.
  • Project address: https://github.com/OpenBMB/XAgent
  • ChatDev: a multi-agent collaborative development framework in which multiple agents with different roles collaborate to develop software applications automatically.
  • Project address: https://github.com/OpenBMB/ChatDev
  • AgentVerse: a general platform for large-model-driven agents that recruits a variety of agent experts to work together and help users solve complex tasks.
  • Project address: https://github.com/OpenBMB/AgentVerse
