Multi-Agent Workflow for Research and Writing with LlamaIndex

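Large language model (LLM) agents are powerful tools for automating tasks such as search, content generation, and quality review. However, a single agent often cannot do everything effectively, especially when you need to integrate external resources (such as web search) and several specialized steps (for example, drafting versus reviewing). A multi-agent workflow lets you split these tasks among different agents, each with its own tools, constraints, and responsibilities. In this article, we look at how to build a three-agent system (research, writing, and review) in which each agent handles a specific part of creating a concise report on the history of the Internet. We also make sure the system does not get stuck in a search loop, which would waste time and credits.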

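Learning Objectives

Learn to build a three-agent system for research, writing, and review tasks.

Implement safeguards that prevent infinite search loops in an automated workflow.

Explore the integration of external tools such as DuckDuckGo for efficient data retrieval.

Develop an LLM-driven workflow that produces structured, high-quality content.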

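This article was published as a part of the Data Science Blogathon.

Table of contents

Language Model (LLM): OpenAI GPT-4
Essential Tools for the Workflow
Agent Workflow: Coordinating Task Execution
Avoiding Infinite Search Loops
Frequently Asked Questions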

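Essential Tools for the Workflow

Tools are functions that an agent can call to perform actions beyond its own language modeling. Before looking at typical tools, the setup below installs the dependencies and creates the LLM: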

###############################################################################
# 1. INSTALLATION
###############################################################################
# Make sure you have the following installed:
#   pip install llama-index langchain duckduckgo-search

###############################################################################
# 2. IMPORTS
###############################################################################
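# In a notebook, run the install as a cell magic instead:
#   %pip install llama-index langchain duckduckgo-search

from llama_index.llms.openai import OpenAI

# For DuckDuckGo search via LangChain
# (newer LangChain releases expose this wrapper from langchain_community.utilities)
from langchain.utilities import DuckDuckGoSearchAPIWrapper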

# llama-index workflow classes
from llama_index.core.workflow import Context
from llama_index.core.agent.workflow import (
    FunctionAgent,
    AgentWorkflow,
    AgentInput,
    AgentOutput,
    ToolCall,
    ToolCallResult,
    AgentStream
)

import asyncio

###############################################################################
# 3. CREATE LLM
###############################################################################
# Replace the "OPENAI_API_KEY" placeholder with your actual OpenAI API key
llm = OpenAI(model="gpt-4", api_key="OPENAI_API_KEY")


Typical tools include:

Web search
Reading and writing files
Math calculators
APIs for external services
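To make "a function the agent can call" more concrete, here is a library-free sketch that builds a plain description of a tool from its name, docstring, and type hints. Agent frameworks typically derive a similar description so the model knows what a tool does and which arguments it takes. `describe_tool` and `word_count` are hypothetical names introduced for illustration only; they are not part of LlamaIndex or LangChain.

```python
import inspect


def word_count(text: str) -> int:
    """Count the whitespace-separated words in a piece of text."""
    return len(text.split())


def describe_tool(fn) -> dict:
    """Summarize a callable the way a tool registry might present it to a model.

    Hypothetical helper: real frameworks use their own schema formats.
    """
    signature = inspect.signature(fn)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {
            # Use the annotation's class name when available (e.g. "str").
            name: getattr(param.annotation, "__name__", str(param.annotation))
            for name, param in signature.parameters.items()
        },
    }


spec = describe_tool(word_count)
print(spec)
```

The point of the sketch is only that a clear function name, docstring, and typed signature are what the model sees of a tool, so they are worth writing carefully.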
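In our example, the key tool is DuckDuckGoSearch, which uses LangChain's DuckDuckGoSearchAPIWrapper under the hood. We also have helper tools to record notes, write the report, and review it.

Defining AI Agents for Task Execution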

Each agent is an instance of FunctionAgent. The tool functions that the agents call are defined first:

###############################################################################
# 4. DEFINE DUCKDUCKGO SEARCH TOOL WITH SAFEGUARDS
###############################################################################
# We wrap LangChain's DuckDuckGoSearchAPIWrapper with our own logic
# to prevent repeated or excessive searches.

duckduckgo = DuckDuckGoSearchAPIWrapper()

MAX_SEARCH_CALLS = 2
search_call_count = 0
past_queries = set()

async def safe_duckduckgo_search(query: str) -> str:
    """
    A DuckDuckGo-based search function that:
      1) Prevents more than MAX_SEARCH_CALLS total searches.
      2) Skips duplicate queries.
    """
    global search_call_count, past_queries

    # Check for duplicate queries
    if query in past_queries:
        return f"Already searched for '{query}'. Avoiding duplicate search."

    # Check if we've reached the max search calls
    if search_call_count >= MAX_SEARCH_CALLS:
        return "Search limit reached, no more searches allowed."

    # Otherwise, perform the search
    search_call_count += 1
    past_queries.add(query)

    # DuckDuckGoSearchAPIWrapper.run(...) is synchronous, but we have an async signature
    result = duckduckgo.run(query)
    return str(result)
    
###############################################################################
# 5. OTHER TOOL FUNCTIONS: record_notes, write_report, review_report
###############################################################################
async def record_notes(ctx: Context, notes: str, notes_title: str) -> str:
    """Store research notes under a given title in the shared context."""
    current_state = await ctx.get("state")
    if "research_notes" not in current_state:
        current_state["research_notes"] = {}
    current_state["research_notes"][notes_title] = notes
    await ctx.set("state", current_state)
    return "Notes recorded."

async def write_report(ctx: Context, report_content: str) -> str:
    """Write a report in markdown, storing it in the shared context."""
    current_state = await ctx.get("state")
    current_state["report_content"] = report_content
    await ctx.set("state", current_state)
    return "Report written."

async def review_report(ctx: Context, review: str) -> str:
    """Review the report and store feedback in the shared context."""
    current_state = await ctx.get("state")
    current_state["review"] = review
    await ctx.set("state", current_state)
    return "Report reviewed."


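Key fields of a FunctionAgent include:

name and description
system_prompt: instructs the agent about its role and constraints
llm: the language model to use
tools: which functions the agent can call
can_handoff_to: which agent(s) this agent can hand control to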
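The fields listed above can be pictured as plain data. The following is a minimal, library-free sketch, not the FunctionAgent class itself: `AgentSpec` and `check_handoffs` are hypothetical names, and the descriptions and prompts are illustrative assumptions. It shows one useful sanity check, namely that every `can_handoff_to` target refers to an agent that actually exists.

```python
from dataclasses import dataclass, field


@dataclass
class AgentSpec:
    """Plain-data stand-in mirroring the key fields of an agent (hypothetical)."""
    name: str
    description: str
    system_prompt: str
    tools: list = field(default_factory=list)           # names of callable tools
    can_handoff_to: list = field(default_factory=list)  # names of other agents


def check_handoffs(agents):
    """Return (source, target) pairs whose handoff target is not a defined agent."""
    known = {agent.name for agent in agents}
    return [
        (agent.name, target)
        for agent in agents
        for target in agent.can_handoff_to
        if target not in known
    ]


# Illustrative configuration; prompts and descriptions are assumptions.
agents = [
    AgentSpec("ResearchAgent", "Searches the web and records notes.",
              "You research the topic and record notes.",
              tools=["safe_duckduckgo_search", "record_notes"],
              can_handoff_to=["WriteAgent"]),
    AgentSpec("WriteAgent", "Drafts the report.",
              "You write a Markdown report from the notes.",
              tools=["write_report"],
              can_handoff_to=["ReviewAgent"]),
    AgentSpec("ReviewAgent", "Reviews the report.",
              "You review the report and give feedback.",
              tools=["review_report"],
              can_handoff_to=["WriteAgent"]),
]

# An empty list means every handoff target is a defined agent.
print(check_handoffs(agents))
```

Checking the handoff graph up front is a cheap way to catch a misspelled agent name before any LLM call is made.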

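ResearchAgent

Searches the web (up to a specified limit of queries)
Saves relevant findings as "notes"

WriteAgent

Drafts a short Markdown report from the research notes, then hands off to the ReviewAgent

ReviewAgent

Evaluates the report and either approves it or passes control back to the WriteAgent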


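Agent Workflow: Coordinating Task Execution

AgentWorkflow coordinates how messages and state move between agents. When a user initiates a request (for example, "Write me a concise report on the history of the Internet..."), the workflow proceeds as follows:

The ResearchAgent receives the user prompt and decides whether to perform a web search or record some notes.
The WriteAgent uses the notes to create structured or styled output (such as a Markdown document).
The ReviewAgent checks the final output and either sends it back for revision or approves it.

Once the content is approved and no further changes are requested, the workflow ends.

In this step, we define the agent workflow, which includes the research, writing, and review agents. The root_agent is set to research_agent, meaning the process starts with gathering research. The initial state contains placeholders for the research notes, the report content, and the review status.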
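The control flow described above can be simulated without any LLM. The sketch below is a simplified stand-in for the orchestration idea, not the AgentWorkflow API: each "agent" is an ordinary function that mutates the shared state and returns the name of the next agent (or `None` to finish). The placeholder values in `INITIAL_STATE`, the step functions, and `run_workflow` are all hypothetical names and assumptions chosen for illustration.

```python
import copy

# Assumed placeholder values for the three state fields described in the text.
INITIAL_STATE = {
    "research_notes": {},
    "report_content": "Not written yet.",
    "review": "Review required.",
}


def research_step(state):
    """Stand-in for the research agent: record a note, then hand off."""
    state["research_notes"]["Internet history"] = "placeholder notes"
    return "WriteAgent"


def write_step(state):
    """Stand-in for the writing agent: turn the notes into a Markdown report."""
    notes = "; ".join(state["research_notes"].values())
    state["report_content"] = f"# Report\n\n{notes}"
    return "ReviewAgent"


def review_step(state):
    """Stand-in for the review agent: approve, which ends the workflow."""
    state["review"] = "Approved."
    return None


STEPS = {
    "ResearchAgent": research_step,
    "WriteAgent": write_step,
    "ReviewAgent": review_step,
}


def run_workflow(root_agent, initial_state, max_handoffs=10):
    """Start at root_agent and follow handoffs until an agent returns None."""
    state = copy.deepcopy(initial_state)  # keep the caller's state untouched
    current, visited = root_agent, []
    while current is not None and len(visited) < max_handoffs:
        visited.append(current)
        current = STEPS[current](state)
    return state, visited


final_state, order = run_workflow("ResearchAgent", INITIAL_STATE)
print(order)
print(final_state["review"])
```

The `max_handoffs` cap is an extra safeguard added in this sketch so that a write/review cycle that never converges still terminates.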

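Running the Workflow

The workflow is executed with a user request that specifies the topic and the key points to cover in the report. The request in this example asks for a concise report on the history of the Internet, including its origins, the development of the World Wide Web, and its modern evolution. The workflow handles this request by coordinating the agents.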


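Streaming Events for Debugging or Observation

To monitor the workflow's execution, we stream events and print details about agent activity. This lets us track which agent is currently working, view intermediate outputs, and inspect the tool calls the agents make. Debug information such as tool usage and responses is displayed for better visibility.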
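As a rough illustration of this kind of event logging, the sketch below formats a stream of events by type and prints a banner whenever the active agent changes. The event classes here (`ToolCallEvent`, `ToolResultEvent`, `OutputEvent`) and their fields are hypothetical stand-ins defined for this example; they are not the LlamaIndex event classes imported earlier, whose attributes may differ.

```python
from dataclasses import dataclass


@dataclass
class ToolCallEvent:      # hypothetical: an agent is about to call a tool
    agent: str
    tool_name: str
    arguments: dict


@dataclass
class ToolResultEvent:    # hypothetical: a tool call has returned
    agent: str
    tool_name: str
    output: str


@dataclass
class OutputEvent:        # hypothetical: an agent produced text
    agent: str
    text: str


def format_event(event, current_agent):
    """Return (lines, agent): the lines to print and the updated current agent."""
    lines = []
    if event.agent != current_agent:
        lines.append(f"=== Agent: {event.agent} ===")
    if isinstance(event, ToolCallEvent):
        lines.append(f"calling {event.tool_name} with {event.arguments}")
    elif isinstance(event, ToolResultEvent):
        lines.append(f"{event.tool_name} returned: {event.output}")
    elif isinstance(event, OutputEvent) and event.text:
        lines.append(f"output: {event.text}")
    return lines, event.agent


events = [
    ToolCallEvent("ResearchAgent", "safe_duckduckgo_search",
                  {"query": "history of the internet"}),
    ToolResultEvent("ResearchAgent", "safe_duckduckgo_search", "stub results"),
    OutputEvent("WriteAgent", "Draft ready."),
]

current = None
log = []
for event in events:
    lines, current = format_event(event, current)
    log.extend(lines)

print("\n".join(log))
```

Keeping the formatting in a pure function makes the logging logic easy to test separately from the asynchronous stream that delivers the events.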


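Retrieving and Printing the Final Report

Once the workflow completes, we extract the final state, which contains the generated report. The report content is printed, followed by any review feedback from the ReviewAgent. This confirms that the output is complete and shows whether it needs further refinement.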
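A small sketch of this last step, assuming the final state is a plain dictionary with the `report_content` and `review` keys written by the tool functions earlier. `render_final_output` is a hypothetical helper, and the fallback strings are assumptions added so that a missing key does not raise an error.

```python
def render_final_output(final_state: dict) -> str:
    """Format the report and review feedback held in the final workflow state."""
    report = final_state.get("report_content", "(no report was written)")
    review = final_state.get("review", "(no review was recorded)")
    return f"Final report:\n{report}\n\nReview feedback:\n{review}"


# Example state with placeholder content, standing in for a real run.
example_state = {
    "report_content": "# History of the Internet\n\nplaceholder body",
    "review": "Looks good.",
}

print(render_final_output(example_state))
```

Using `dict.get` with a default is a deliberate choice here: if the workflow stopped before the WriteAgent or ReviewAgent ran, the output still says so explicitly.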


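Avoiding Infinite Search Loops

When using a web search tool, the LLM may get "confused" and call the search function repeatedly. This can lead to unnecessary cost or wasted time. To prevent that, we use two mechanisms:

Hard limit: we set MAX_SEARCH_CALLS = 2, so the search tool can perform at most two searches.
Duplicate detection: we store past queries in a set (past_queries) to avoid repeating the exact same search.

If either condition is met (the maximum number of searches is reached, or the query is a duplicate), our function returns a canned message instead of performing a new search.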
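The two mechanisms can also be packaged without global variables. The sketch below is a variant of the guard, not the article's own function: `SearchGuard` and `fake_search` are hypothetical names, the search backend is injected so the logic can be exercised offline, and the case and whitespace normalization of queries is an extra assumption that goes beyond the exact-match check described above.

```python
class SearchGuard:
    """Wrap a search function with a call limit and duplicate detection."""

    def __init__(self, search_fn, max_calls=2):
        self._search_fn = search_fn
        self._max_calls = max_calls
        self._calls = 0
        self._seen = set()

    def search(self, query: str) -> str:
        # Assumption: treat queries differing only in case/spacing as duplicates.
        key = " ".join(query.lower().split())

        # Duplicate check comes first, mirroring the order in the original tool.
        if key in self._seen:
            return f"Already searched for '{query}'. Avoiding duplicate search."

        # Hard limit on the total number of real searches.
        if self._calls >= self._max_calls:
            return "Search limit reached, no more searches allowed."

        self._calls += 1
        self._seen.add(key)
        return str(self._search_fn(query))


def fake_search(query: str) -> str:
    """Offline stand-in for a real search backend."""
    return f"results for {query}"


guard = SearchGuard(fake_search, max_calls=2)
queries = [
    "history of the internet",
    "History of the  Internet",   # duplicate after normalization
    "world wide web",
    "arpanet",                    # exceeds the limit of two real searches
]
results = [guard.search(q) for q in queries]
for line in results:
    print(line)
```

Because the guard returns a message rather than raising an exception, the agent receives the refusal as an ordinary tool result and can move on to recording notes.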

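What to Expect?

ResearchAgent

Will likely perform up to two distinct DuckDuckGo searches (for example, "history of the internet" and "World Wide Web Tim Berners-Lee"), then call record_notes to store a summary.

WriteAgent

Reads "research_notes" from the shared context.
Drafts a short Markdown report.
Hands off to the ReviewAgent.

ReviewAgent

Evaluates the content.
If changes are needed, it passes control back to the WriteAgent. Otherwise, it approves the report.

Workflow ends

The final output is stored in final_state["report_content"].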

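Conclusion

By splitting the workflow into distinct agents for search, writing, and review, you can create a powerful, modular system that:

Gathers relevant information (in a controlled way that prevents excessive searches)
Produces structured, high-quality output
Self-checks for accuracy and completeness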
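Key Takeaways
- Multi-agent workflows improve efficiency by assigning research, writing, and review to specialized agents.
- External tools such as DuckDuckGo strengthen the research capabilities of LLM agents.
- Constraints such as search limits prevent unnecessary resource consumption.
- A coordinated agent workflow ensures structured, high-quality content generation.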
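Frequently Asked Questions

Q1. Why use multiple agents instead of a single, general-purpose agent?
Splitting responsibilities across agents (research, writing, review) keeps each step clearly defined and manageable. It also reduces confusion in the model's decision-making and promotes more accurate, structured output.

Q2. How do I limit the number of web searches?
In the code, we use a global counter (search_call_count) and a constant (MAX_SEARCH_CALLS = 2). Whenever the research agent calls safe_duckduckgo_search, the function checks whether the counter has reached the limit. If it has, the function returns a message instead of performing another search.

Q3. What if an agent repeats the same query multiple times?
We maintain a Python set called past_queries to detect repeated queries. If a query is already in the set, the tool skips the actual search and returns a short message, which stops the duplicate query from running.

Q4. Can I change the prompts to adapt this workflow to other topics or styles?
Absolutely. You can edit each agent's system_prompt to tailor its instructions to the domain or writing style you want. For example, you could instruct the WriteAgent to produce a bulleted list, a narrative essay, or a technical summary.

Q5. Do I need GPT-4, or can I use a different model?
You can swap OpenAI(model="gpt-4") for another model supported by LlamaIndex (for example GPT-3.5, or even a local model). The architecture stays the same, although different models may produce output of different quality.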
The media shown in this article is not owned by Analytics Vidhya and is used at the author's discretion.