ProAgent: Intelligent agents led by OpenAI liberate manpower, released by Tsinghua University and other universities

王林
Release: 2023-11-14 21:37:17

  • Project address: https://github.com/OpenBMB/ProAgent
  • Paper address: https://github.com/OpenBMB/ProAgent/blob/main/paper/paper.pdf

Throughout the history of human technology, automation has been a main driving force, helping humans free themselves from complex, dangerous, and tedious labor. From waterwheel irrigation in the early agricultural era to steam engines in the industrial era, humans have constantly pursued more advanced automation technologies to liberate themselves from heavy work.

With the arrival of the information age, software, as the basis for information processing, storage, and communication, has become an inseparable part of human production and life, which catalyzed the formation of Robotic Process Automation (RPA) technology. RPA coordinates multiple pieces of software into a fixed workflow through manually written rules, and executes it efficiently by simulating the way a human interacts with that software.

Figure 1 Comparison of Robotic Process Automation (RPA) and Agentic Process Automation (APA)

RPA uses software robots, or "bots", to simulate and perform repetitive, rule-based tasks, freeing up human resources and improving work efficiency. Its range of application is very wide. Many enterprises (including banks, insurance companies, manufacturers, and retailers) use RPA bots to automate routine and tedious tasks such as data entry, data extraction, and data processing. By automating these tasks, RPA can significantly reduce error rates and can run 24/7, improving business reliability and responsiveness.

According to market research, the RPA market is growing rapidly. Gartner predicts that global RPA market revenue will reach US$3.3 billion by 2023, a growth rate of 17.5%. This shows that enterprises have a very high demand for, and recognition of, RPA.

However, RPA can only replace simple, mechanical human work, and some complex processes still rely on manual labor:

  1. Writing an RPA workflow itself requires heavy human labor and is costly.
  2. Complex tasks are very flexible and usually involve dynamic decision-making, which is difficult to capture as fixed rules.

Figure 2 Comparison of efficiency and intelligence between RPA and APA

Fortunately, the recent emergence of large language model based agents (LLM-based Agents) in the field of AI may create new possibilities for automation technology. Is it possible to bring the flexibility of agent technology into the RPA field to further reduce human involvement?

The team's research explores a new automation paradigm for the era of large-model agents, Agentic Process Automation (APA). Compared with traditional RPA, in the APA paradigm the agent can autonomously construct the workflow according to human needs. It can also identify the parts of those needs that require dynamic decision-making, automatically orchestrate them into the workflow, and actively take over execution when the workflow reaches those parts in order to make the corresponding complex decisions.

To explore the possibilities of APA, this research implemented ProAgent, an automation agent that receives human instructions and builds workflows by generating code. It also introduces DataAgent and ControlAgent into the workflow to implement complex data processing and logical control. ProAgent demonstrates the feasibility of APA in the era of large-model agents and reveals new possibilities for automation technology in the era of LLMs.

Method introduction

In RPA, the workflow is a graph structure composed of a series of tool calls: nodes represent atomic tool calls (such as Gmail, Twitter, Google Sheets), while edges represent the logical order of execution (sequence, branch, loop). A workflow usually contains all prior knowledge of a task or a type of task, including problem-solving paths and exception-handling logic. As a result, a fixed, hand-written workflow is usually very stable, thorough, and efficient.
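
The graph view of a workflow can be illustrated with a small sketch. This is not taken from the ProAgent code base; the `Node` and `Workflow` classes, the node names, and the condition labels are all hypothetical and only show nodes as tool calls and edges as execution logic.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One atomic tool call in the workflow graph (hypothetical structure)."""
    name: str
    tool: str  # e.g. "Google Sheets", "Gmail"


@dataclass
class Workflow:
    nodes: dict = field(default_factory=dict)
    # source node name -> list of (condition label, target node name)
    edges: dict = field(default_factory=dict)

    def add_node(self, node: Node) -> None:
        self.nodes[node.name] = node
        self.edges.setdefault(node.name, [])

    def connect(self, source: str, target: str, condition: str = "always") -> None:
        # An edge records which node may run next and under which condition.
        self.edges[source].append((condition, target))

    def successors(self, source: str, condition: str = "always") -> list:
        return [t for c, t in self.edges[source] if c == condition]


wf = Workflow()
wf.add_node(Node("read_rows", "Google Sheets"))
wf.add_node(Node("send_mail", "Gmail"))
wf.add_node(Node("post_tweet", "Twitter"))
# A branch: the same source node leads to different tools.
wf.connect("read_rows", "send_mail", condition="needs_email")
wf.connect("read_rows", "post_tweet", condition="needs_tweet")
```

In a classic RPA tool, a person draws this graph by hand; the rest of the article is about letting an agent produce it.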

Figure 3 Example of agent workflow description language

In ProAgent, because the LLM itself is pre-trained on code data and has learned strong coding capabilities, this research builds on a code-based Agentic Workflow Description Language. This language uses JSON to organize and manage data in the workflow, and uses Python syntax to implement logical control of the workflow. Jumps, loops, and other control flow are expressed directly in Python syntax, while each tool call in the workflow is encapsulated as a Python function. For ProAgent, the workflow construction task therefore becomes a code generation task. On receiving human instructions, ProAgent writes the corresponding Agentic Workflow Description Language code, thereby automating workflow construction.
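
As a rough illustration of this idea (not the exact syntax shown in the paper's figures), the sketch below wraps two tool calls as Python functions that take and return JSON-style data, and expresses the loop and branch in plain Python inside `mainWorkflow`. The tool functions `read_sheet` and `send_alert`, their parameters, and the sample rows are assumptions made up for the example.

```python
import json


def read_sheet(params: dict) -> list:
    """Hypothetical tool call wrapped as a function; returns JSON-style rows."""
    # A real workflow would call the spreadsheet tool here.
    return [
        {"line": "A", "profit": 120},
        {"line": "B", "profit": -30},
    ]


def send_alert(params: dict) -> dict:
    """Hypothetical tool call; echoes what would have been sent."""
    return {"sent": True, "text": params["text"]}


def mainWorkflow(trigger_input: dict) -> list:
    rows = read_sheet({"sheet_id": trigger_input["sheet_id"]})
    results = []
    for row in rows:            # loop written as ordinary Python
        if row["profit"] < 0:   # branch written as ordinary Python
            results.append(send_alert({"text": f"{row['line']} is losing money"}))
    return results


output = mainWorkflow({"sheet_id": "demo"})
print(json.dumps(output))
```

Because both data and control flow are ordinary code, a model that is good at writing Python can generate a workflow the same way it generates any other program.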

Figure 4 Example of agent workflow description language combining DataAgent and ControlAgent

Complex real-world tasks usually involve dynamic decision-making, and simple Python-style logic control rules and JSON-style data organization fall short when facing flexible needs. In such cases, agents need to be introduced. This research therefore further defines two agent operations:

1. DataAgent: For a complex data processing requirement, the processing is described in natural language when the workflow is built. At execution time, a DataAgent is initialized, which autonomously carries out the data processing task based on that natural language description.

2. ControlAgent: For control logic that is difficult to express with rules, the control logic is described in natural language when the workflow is built. At runtime, a ControlAgent is initialized, which autonomously selects the branch of the workflow to execute next based on that natural language description.
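
The two operations can be sketched as thin wrappers around a language model call. The class interfaces below are assumptions for illustration, not ProAgent's actual API, and `fake_llm` is a keyword-based stand-in so the example runs without any model access.

```python
from typing import Callable

LLM = Callable[[str], str]  # hypothetical interface: prompt in, text out


class DataAgent:
    """Handles a data processing step described in natural language."""

    def __init__(self, task: str, llm: LLM):
        self.task = task
        self.llm = llm

    def run(self, data: dict) -> str:
        return self.llm(f"Task: {self.task}\nData: {data}")


class ControlAgent:
    """Picks one of the allowed branches based on a natural language query."""

    def __init__(self, query: str, options: list, llm: LLM):
        self.query = query
        self.options = options
        self.llm = llm

    def run(self, data: dict) -> str:
        answer = self.llm(
            f"Query: {self.query}\nOptions: {self.options}\nData: {data}"
        ).strip()
        if answer not in self.options:
            # Refuse to branch on an answer the workflow does not know about.
            raise ValueError(f"unexpected branch: {answer!r}")
        return answer


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    if "Options:" in prompt:
        return "toB" if "enterprise" in prompt else "toC"
    return "Draft text based on: " + prompt.splitlines()[-1]


control = ControlAgent(
    "Decide whether the business line is toC or toB", ["toB", "toC"], fake_llm
)
branch = control.run({"description": "enterprise software"})
draft = DataAgent("Summarize the profit figures", fake_llm).run({"profit": 50})
```

Restricting the ControlAgent to a fixed list of options keeps the surrounding workflow deterministic even though the decision itself is made by a model.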

ProAgent builds the workflow step by step in a ReAct-style loop, using four workflow construction operations:

  1. Action_Define: Decide what tools to add to the workflow.
  2. Action Implement: Convert the input/output parameters of the tool into a JSON structure, and encapsulate the call of the tool into a Python function.
  3. Workflow Implement: Define a mainWorkflow function to organize the logic control and data processing of the entire workflow.
  4. Task Submit: When ProAgent completes building the workflow, this operation identifies the end of the build process.
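
The four steps above can be pictured as operations on a builder object that the agent calls one at a time, reading the returned observation before choosing the next step. The sketch below is a simplified assumption: the `WorkflowBuilder` class and the scripted step list are hypothetical, and a real agent would choose each step with a model rather than replay a fixed script.

```python
class WorkflowBuilder:
    """Hypothetical container for a workflow under construction."""

    def __init__(self):
        self.actions = {}      # action name -> {"tool": ..., "code": ...}
        self.main_code = None  # source of the mainWorkflow function
        self.submitted = False

    def action_define(self, name: str, tool: str) -> str:
        self.actions[name] = {"tool": tool, "code": None}
        return f"defined {name}"

    def action_implement(self, name: str, code: str) -> str:
        if name not in self.actions:
            return f"error: {name} is not defined"
        self.actions[name]["code"] = code
        return f"implemented {name}"

    def workflow_implement(self, code: str) -> str:
        self.main_code = code
        return "mainWorkflow updated"

    def task_submit(self) -> str:
        missing = [n for n, a in self.actions.items() if a["code"] is None]
        if missing or self.main_code is None:
            return "error: incomplete workflow"
        self.submitted = True
        return "submitted"


# Stand-in for the sequence of steps an agent might choose.
scripted_steps = [
    ("action_define", {"name": "read_sheet", "tool": "Google Sheets"}),
    ("action_implement", {"name": "read_sheet", "code": "def read_sheet(params): ..."}),
    ("workflow_implement", {"code": "def mainWorkflow(trigger_input): ..."}),
    ("task_submit", {}),
]

builder = WorkflowBuilder()
observations = []
for step_name, arguments in scripted_steps:
    # Each call returns an observation the agent could read before its next step.
    observations.append(getattr(builder, step_name)(**arguments))
```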

Figure 5 Example of the ProAgent workflow construction process

In addition, several techniques are introduced to improve ProAgent's performance:

  1. Testing-on-Constructing: During construction, ProAgent tests the workflow each time it modifies it, to ensure the workflow is correct.
  2. Function Calling: All workflow construction operations are encapsulated as GPT-4 Functions, improving control over the workflow construction process.
  3. Chain-of-Thought: When ProAgent writes workflow code, it must provide comments and a writing plan for each function, which improves its workflow construction performance.
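
To make the Function Calling point concrete, the sketch below declares the four construction operations as JSON-Schema-style function definitions and checks a proposed call against them before it is applied. The schemas follow the general shape used by function-calling APIs, but the parameter names are assumptions and no model is actually called; `validate_call` is a hypothetical helper.

```python
import json

# Hypothetical function definitions for the four construction operations.
FUNCTIONS = [
    {
        "name": "action_define",
        "description": "Add a tool to the workflow.",
        "parameters": {
            "type": "object",
            "properties": {"name": {"type": "string"}, "tool": {"type": "string"}},
            "required": ["name", "tool"],
        },
    },
    {
        "name": "action_implement",
        "description": "Provide the Python function that wraps a tool call.",
        "parameters": {
            "type": "object",
            "properties": {"name": {"type": "string"}, "code": {"type": "string"}},
            "required": ["name", "code"],
        },
    },
    {
        "name": "workflow_implement",
        "description": "Provide the source of the mainWorkflow function.",
        "parameters": {
            "type": "object",
            "properties": {"code": {"type": "string"}},
            "required": ["code"],
        },
    },
    {
        "name": "task_submit",
        "description": "Mark the workflow as finished.",
        "parameters": {"type": "object", "properties": {}, "required": []},
    },
]


def validate_call(name: str, arguments_json: str) -> dict:
    """Reject calls to unknown functions or calls with missing arguments."""
    schema = next((f for f in FUNCTIONS if f["name"] == name), None)
    if schema is None:
        raise ValueError(f"unknown function: {name}")
    arguments = json.loads(arguments_json)
    missing = [k for k in schema["parameters"]["required"] if k not in arguments]
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    return arguments


checked = validate_call("action_define", '{"name": "read_sheet", "tool": "Google Sheets"}')
```

Constraining the model to a small set of declared functions is what gives the builder control over the construction process: anything outside the declared set is rejected before it can change the workflow.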

The workflow execution process is based on the Python interpreter. Given a workflow, the corresponding mainWorkflow function serves as the entry point, which starts the entire execution process. Execution follows the rules of Python code, that is, statements run line by line in order. Once the mainWorkflow function returns, the workflow has finished executing successfully.
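
A minimal sketch of this execution model, assuming the workflow arrives as Python source text: the source is loaded into a fresh namespace and `mainWorkflow` is called as the entry point. The `run_workflow` helper and the toy workflow are hypothetical. Note that `exec` runs arbitrary code, so a real system would need sandboxing around generated workflows.

```python
def run_workflow(source: str, trigger_input: dict):
    """Load workflow source and run it from its mainWorkflow entry point."""
    namespace = {}
    exec(source, namespace)  # defines the tool functions and mainWorkflow
    entry = namespace.get("mainWorkflow")
    if not callable(entry):
        raise ValueError("workflow source does not define mainWorkflow")
    # Execution then follows normal Python rules until mainWorkflow returns.
    return entry(trigger_input)


WORKFLOW_SOURCE = '''
def double(params):
    return {"value": params["value"] * 2}

def mainWorkflow(trigger_input):
    return double({"value": trigger_input["value"]})
'''

result = run_workflow(WORKFLOW_SOURCE, {"value": 21})
```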

Feasibility Verification

To verify the feasibility of Agentic Process Automation, this research implements ProAgent using OpenAI GPT-4 as the base model and n8n, an open-source RPA platform, as the carrier. We also designed a task that requires both flexibility and efficiency. It is a typical business scenario: profit data for various business lines must be extracted from Google Sheets, and the follow-up action depends on whether the business line is 2B or 2C. If the business line is 2C, a message is sent to a Slack channel. If it is 2B, an email is sent to the respective manager, containing an assessment of the business line and a brief profitability overview.

Figure 6 Task Instruction Display

This task has several demanding properties. First, it is repetitive: the same process should be applied to multiple product lines. Second, deciding whether a business line is 2C or 2B is very difficult and requires dynamic decision-making by the agent to determine the subsequent workflow. Finally, writing the evaluation email for a business line requires a certain amount of intelligence, so agent intervention is needed.

For this task, ProAgent generated a workflow containing four atomic operations, one DataAgent, and one ControlAgent. The overall process is roughly as shown in the figure below:

Figure 7 ProAgent workflow construction process display

As can be seen, ProAgent completes the workflow construction process automatically by writing code, without manual intervention. When it needs to determine whether a business line is 2B or 2C, ProAgent introduces a ControlAgent to make the judgment, with its prompt set to "Decide Whether the business line is toC or toB". When the business line is 2B, ProAgent also introduces a DataAgent, whose task is set to "Write an email of the business line of profit, together with your suggestion", so that the agent's intelligence is used to write the email according to the actual situation of each business line.

After the workflow is written and solidified, the workflow will automatically branch to different logic according to different data for efficient data processing.
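
The shape of such a workflow can be sketched as follows. This is not the code ProAgent generated: the tool functions are stubs, the sample business lines and addresses are invented, and `control_agent` and `data_agent` are keyword-based stand-ins for the model-backed agents. Only the two prompt strings are taken from the description above.

```python
def read_business_lines(params: dict) -> list:
    """Stub for the Google Sheets read; the rows are invented sample data."""
    return [
        {"line": "Cloud services", "description": "sold to enterprises",
         "profit": 50, "manager": "manager-a@example.com"},
        {"line": "Mobile game", "description": "sold to consumers",
         "profit": 20, "manager": "manager-b@example.com"},
    ]


def send_slack(params: dict) -> dict:
    """Stub for the Slack tool."""
    return {"channel": params["channel"], "text": params["text"]}


def send_email(params: dict) -> dict:
    """Stub for the email tool."""
    return {"to": params["to"], "body": params["body"]}


def control_agent(prompt: str, row: dict) -> str:
    """Keyword stand-in for the ControlAgent's model-based decision."""
    return "toB" if "enterprise" in row["description"] else "toC"


def data_agent(task: str, row: dict) -> str:
    """Stand-in for the DataAgent; a real one would draft the email with a model."""
    return f"{row['line']}: profit {row['profit']}. Suggestion: (written by the agent)"


def mainWorkflow(trigger_input: dict) -> list:
    outputs = []
    for row in read_business_lines({}):  # same process for every business line
        kind = control_agent("Decide Whether the business line is toC or toB", row)
        if kind == "toC":
            message = f"{row['line']} profit: {row['profit']}"
            outputs.append(("slack", send_slack({"channel": "#profits", "text": message})))
        else:
            body = data_agent(
                "Write an email of the business line of profit, together with your suggestion",
                row,
            )
            outputs.append(("email", send_email({"to": row["manager"], "body": body})))
    return outputs


outputs = mainWorkflow({})
```

The loop and the branch are fixed code that runs cheaply for every row, while the two agent calls are the only points where a model is consulted.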

Figure 8 ProAgent workflow execution process display

When processing 2C business line data, the ControlAgent determines the type of the current business line from its description and chooses the Slack tool for communication. When processing 2B business line data, the DataAgent composes an email, which is sent to the corresponding manager's mailbox.

Summary

This study proposes a new automation paradigm suited to the era of large models, Agentic Process Automation. Compared with traditional Robotic Process Automation technology, Agentic Process Automation can automate the construction of workflows and automate dynamic decisions during workflow execution. The research also developed ProAgent and experimentally demonstrated the feasibility and potential of large-model agents in automation. The team believes that in the future, large-model agent technology will help humans achieve a higher level of automation and liberate themselves from heavy labor.

Team related research

Currently, the research team has conducted many studies in the direction of large model agents, including:

  • XAgent: a powerful large-model agent application framework that can break down complex tasks on its own and execute them efficiently.
  • Project address: https://github.com/OpenBMB/XAgent
  • ChatDev: a multi-agent collaborative development framework in which multiple agents with different roles collaborate to develop software applications automatically.
  • Project address: https://github.com/OpenBMB/ChatDev
  • AgentVerse: a general platform for large-model-driven agents that recruits a variety of agent experts to work together and help users solve complex tasks.
  • Project address: https://github.com/OpenBMB/AgentVerse

source:51cto.com