Building a ReAct AI Agent with Node.js (Wikipedia Search)

Introduction

We will build an AI agent that can search Wikipedia and answer questions based on the information it collects.
This ReAct (Reasoning and Acting) agent uses the Google Generative AI API to process queries and generate answers.

Our agent can:

Search Wikipedia for relevant information.
Extract specific sections from Wikipedia pages.
Reason over the collected information and formulate answers.

[2] What is a ReAct agent?

A ReAct agent is a type of agent that follows a reason-act cycle. It reasons about the current task using the information available and the actions it can take, then decides which action to run next or whether the task is complete.
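
The cycle described above can be sketched as a small loop. This is only an illustration, not the class built later in this tutorial: `reactLoop`, `think`, `act`, and `maxSteps` are hypothetical names, and the two callbacks stand in for the LLM call and the tool call.

```javascript
// Minimal ReAct loop sketch. `think` and `act` are hypothetical callbacks
// standing in for the LLM call and the tool call.
async function reactLoop(task, think, act, maxSteps = 5) {
  const history = [`Task: ${task}`];
  for (let step = 0; step < maxSteps; step++) {
    // THOUGHT: decide the next step from everything gathered so far.
    const thought = await think(history);
    history.push(`Thought: ${thought.text}`);
    if (thought.done) {
      // ANSWER: the agent believes it has enough information.
      return { answer: thought.text, history };
    }
    // ACTION: run the chosen tool and record its result.
    const result = await act(thought.text);
    history.push(`ActionResult: ${result}`);
  }
  // Step budget exhausted without reaching an answer.
  return { answer: null, history };
}
```

The `maxSteps` cap is a design choice of this sketch: it keeps a model that never decides to finish from looping indefinitely.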

[3] Planning the agent

3.1 Required tools

Node.js
Axios library for HTTP requests
Google Generative AI API (gemini-1.5-flash)
Wikipedia API

3.2 Agent structure

Our ReAct agent has three main states:

THOUGHT (reasoning)
ACTION (execution)
ANSWER (response)

3.3 Thought state (THOUGHT)

The thought state is where the ReactAgent reasons over the information collected so far and decides what the next step should be.

async thought() {
    // ...
}

3.4 Action state (ACTION)

In the action state, the agent runs one of the available functions based on the preceding thought.
Note that there are two parts: the action itself (execution) and the decision (which action to run).

async action() {
    // calls decideAction()
    // executes the action and returns an ActionResult
}

async decideAction() {
    // Calls the LLM, based on the Thought (reasoning), to format the function call.
    // Look for the function-calling (tool) mode in the [Google API documentation](https://ai.google.dev/gemini-api/docs/function-calling)
}

[4] Implementing the agent

Let's build the ReAct agent step by step, highlighting each state.

4.1 Initial setup

First, set up the project and install the dependencies:

mkdir projeto-agente-react
cd projeto-agente-react
npm init -y
npm install axios dotenv @google/generative-ai

Create a .env file in the project root:

GOOGLE_AI_API_KEY=your_api_key_here

A free API key is available from Google AI Studio.
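
Since everything that follows depends on the key in the .env file, it can help to fail early when it is missing. The helper below is a minimal sketch and not part of the original tutorial: `requireEnv` is a hypothetical name, and the second parameter exists only so the function can be exercised without touching the real environment.

```javascript
// Hypothetical helper: fail fast when a required environment variable is missing.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error(`Missing environment variable: ${name}`);
  }
  return value.trim();
}

// Possible usage, after require("dotenv").config() has loaded the .env file:
//   const apiKey = requireEnv("GOOGLE_AI_API_KEY");
```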

4.2 Function declarations

This JavaScript file contains the functions that Node.js uses to call the Wikipedia API.
We describe these functions to the model later, in the function descriptions.

Create Tools.js with the following content:

const axios = require("axios");

class Tools {
  static async wikipedia(q) {
    try {
      const response = await axios.get("https://pt.wikipedia.org/w/api.php", {
        params: {
          action: "query",
          list: "search",
          srsearch: q,
          srwhat: "text",
          format: "json",
          srlimit: 4,
        },
      });

      const results = await Promise.all(
        response.data.query.search.map(async (searchResult) => {
          const sectionResponse = await axios.get(
            "https://pt.wikipedia.org/w/api.php",
            {
              params: {
                action: "parse",
                pageid: searchResult.pageid,
                prop: "sections",
                format: "json",
              },
            },
          );

          const sections = Object.values(
            sectionResponse.data.parse.sections,
          ).map((section) => `${section.index}, ${section.line}`);

          return {
            pageTitle: searchResult.title,
            snippet: searchResult.snippet,
            pageId: searchResult.pageid,
            sections: sections,
          };
        }),
      );

      return results
        .map(
          (result) =>
            `Snippet: ${result.snippet}\nPageId: ${result.pageId}\nSections: ${JSON.stringify(result.sections)}`,
        )
        .join("\n\n");
    } catch (error) {
      console.error("Error fetching from Wikipedia:", error);
      return "Error fetching data from Wikipedia";
    }
  }

  static async wikipedia_with_pageId(pageId, sectionId) {
    if (sectionId) {
      const response = await axios.get("https://pt.wikipedia.org/w/api.php", {
        params: {
          action: "parse",
          format: "json",
          pageid: parseInt(pageId),
          prop: "wikitext",
          section: parseInt(sectionId),
          disabletoc: 1,
        },
      });
      return Object.values(response.data.parse?.wikitext ?? {})[0]?.substring(
        0,
        25000,
      );
    } else {
      const response = await axios.get("https://pt.wikipedia.org/w/api.php", {
        params: {
          action: "query",
          pageids: parseInt(pageId),
          prop: "extracts",
          exintro: true,
          explaintext: true,
          format: "json",
        },
      });
      // Guard every level so a missing `query` or `pages` yields undefined instead of throwing.
      return Object.values(response.data?.query?.pages ?? {})[0]?.extract;
    }
  }
}

module.exports = Tools;
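
One detail worth knowing about the search step: the `snippet` field returned by the MediaWiki search API contains HTML markup (matches are wrapped in `<span class="searchmatch">`). A minimal sketch of a cleanup step is shown below; `cleanSnippet` is a hypothetical helper, not part of Tools.js, and it only handles the two entities listed rather than decoding HTML in general.

```javascript
// Hypothetical helper: turn a search snippet into plain text before
// it is added to the prompt context.
function cleanSnippet(snippet) {
  return snippet
    .replace(/<[^>]*>/g, "") // drop tags such as <span class="searchmatch">
    .replace(/&quot;/g, '"') // decode the quote entity
    .replace(/&amp;/g, "&") // decode ampersands last to avoid double decoding
    .replace(/\s+/g, " ") // collapse runs of whitespace
    .trim();
}
```

If you adopt something like this, it could be applied to `searchResult.snippet` where the result objects are built.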

4.3 Creating the ReactAgent.js file

Create ReactAgent.js with the following content:

require("dotenv").config();
const { GoogleGenerativeAI } = require("@google/generative-ai");
const Tools = require("./Tools");

const genAI = new GoogleGenerativeAI(process.env.GOOGLE_AI_API_KEY);

class ReactAgent {
  constructor(query, functions) {
    this.query = query;
    this.functions = new Set(functions);
    this.state = "THOUGHT";
    this._history = [];
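    this.model = genAI.getGenerativeModel({
      model: "gemini-1.5-flash",
      // Sampling options go inside generationConfig; a top-level
      // `temperature` key is not one of the model parameters.
      generationConfig: { temperature: 1.8 },
    });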
  }

  async run() {
    this.pushHistory(`**Tarefa: ${this.query} **`);
    try {
      return await this.step();
    } catch (e) {
      console.error("Erro durante a execução:", e);
      return "Desculpe, não consegui processar sua solicitação.";
    }
  }

  async step() {
    const colors = {
      reset: "\x1b[0m",
      yellow: "\x1b[33m",
      red: "\x1b[31m",
      cyan: "\x1b[36m",
    };
    console.log("====================================");
    console.log(
      `Next Movement: ${
        this.state === "THOUGHT"
          ? colors.yellow
          : this.state === "ACTION"
            ? colors.red
            : this.state === "ANSWER"
              ? colors.cyan
              : colors.reset
      }${this.state}${colors.reset}`,
    );
    console.log(`Last Movement: ${this.history[this.history.length - 1]}`);
    console.log("====================================");
    switch (this.state) {
      case "THOUGHT":
        return await this.thought();
      case "ACTION":
        return await this.action();
      case "ANSWER":
        return await this.answer();
    }
  }

  async thought() {
    const funcoesDisponiveis = JSON.stringify(Array.from(this.functions));
    const contextoHistorico = this.history.join("\n");
    const prompt = `Sua Tarefa é ${this.query}
O Contexto possui todas as reflexões que você fez até agora e os ResultadoAção que coletou.
AçõesDisponíveis são funções que você pode chamar sempre que precisar de mais dados.

Contexto: "${contextoHistorico}" <<

AçõesDisponíveis: "${funcoesDisponiveis}" <<

Tarefa: "${this.query}" <<

Reflita sobre Sua Tarefa usando o Contexto, ResultadoAção e AçõesDisponíveis para encontrar seu próximo_passo.
Imprima seu próximo_passo com um Pensamento ou Finalize Cumprindo Sua Tarefa caso tenha as informações disponíveis`;

    const thought = await this.promptModel(prompt);
    this.pushHistory(`\n **${thought.trim()}**`);

    if (
      thought.toLowerCase().includes("cumprida") ||
      thought.toLowerCase().includes("cumpra") ||
      thought.toLowerCase().includes("cumprindo") ||
      thought.toLowerCase().includes("finalizar") ||
      thought.toLowerCase().includes("finalizando") ||
      thought.toLowerCase().includes("finalize") ||
      thought.toLowerCase().includes("concluída")
    ) {
      this.state = "ANSWER";
    } else {
      this.state = "ACTION";
    }
    return this.step();
  }

  async action() {
    const action = await this.decideAction();
    this.pushHistory(`** Ação: ${action} **`);
    const result = await this.executeFunctionCall(action);
    this.pushHistory(`** ResultadoAção: ${result} **`);
    this.state = "THOUGHT";
    return this.step();
  }

  async decideAction() {
    const availableFunctions = JSON.stringify(Array.from(this.functions));
    const historyContext = this.history;
    const prompt = `Reflita sobre o Pensamento, Consulta e Ações Disponíveis

    ${historyContext[historyContext.length - 2]}

    Pensamento <<< ${historyContext[historyContext.length - 1]}

    Consulta: "${this.query}"

    Ações Disponíveis: ${availableFunctions}

    Retorne apenas a função,parâmetros separados por vírgula. Exemplo: "wikipedia,ronaldinho gaucho,1450"`;

    const decision = await this.promptModel(prompt);
    return decision.replace(/`/g, "").trim();
  }

  async answer() {
    const historyContext = this.history.join("\n");
    const prompt = `Com base no seguinte contexto, forneça uma resposta completa e detalhada para a tarefa: ${this.query}.

    Contexto:
    ${historyContext}

    Tarefa: "${this.query}"`;

    const finalAnswer = await this.promptModel(prompt);
    return finalAnswer;
  }

  async promptModel(prompt) {
    const result = await this.model.generateContent(prompt);
    const response = await result.response;
    return response.text();
  }

  async executeFunctionCall(functionCall) {
    const [functionName, ...args] = functionCall.split(",");
    const func = Tools[functionName.trim()];
    if (func) {
      return await func.call(null, ...args);
    }
    throw new Error(`Função ${functionName} não encontrada`);
  }

  pushHistory(value) {
    this._history.push(value);
  }

  get history() {
    return this._history;
  }
}

module.exports = ReactAgent;
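
The agent relies on the model returning a string such as "wikipedia,ronaldinho gaucho", which `executeFunctionCall` splits on commas. The sketch below shows one possible way to make that step stricter by trimming each part and checking the name against an explicit list. `parseDecision` and `allowedNames` are hypothetical and not part of the class above; like the original, it still cannot handle a query that itself contains a comma.

```javascript
// Hypothetical helper: parse "name,arg1,arg2" and reject unknown function names.
function parseDecision(decision, allowedNames) {
  // Remove backticks and double quotes the model may wrap around its answer.
  const cleaned = decision.replace(/[`"]/g, "").trim();
  const [rawName, ...rawArgs] = cleaned.split(",");
  const name = rawName.trim();
  if (!allowedNames.includes(name)) {
    throw new Error(`Unknown function: ${name}`);
  }
  // Trim every argument and drop empty ones left by trailing commas.
  const args = rawArgs.map((arg) => arg.trim()).filter((arg) => arg !== "");
  return { name, args };
}
```

Checking against a list of allowed names also avoids looking up arbitrary property names on the `Tools` class.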

4.4 Running the agent and describing the available tools (index.js)

Create index.js with the following content:

const ReactAgent = require("./ReactAgent.js");

async function main() {
  const query = "Que clubes ronaldinho gaúcho jogou para?";
  // const query = "Quais os bairros de Joinville?";
  // const query = "Qual a capital da frança?";

  const functions = [
    [
      "wikipedia",
      "params: query",
      "Busca semântica na Wikipedia API por pageId e sectionIds >> \n ex: Pontos turísticos de são paulo \n São Paulo é uma cidade com muitos pontos turísticos, pageId, sections : []",
    ],
    [
      "wikipedia_with_pageId",
      "params: pageId, sectionId",
      "Busca na Wikipedia API usando pageId e sectionIndex como parametros. \n ex: 1500,1234 \n Informações sobre a seção blablalbal",
    ],
  ];

  const agent = new ReactAgent(query, functions);
  const result = await agent.run();
  console.log("Resultado do Agente:", result);
}

main().catch(console.error);

Function description

When you add a new tool or function, make sure you describe it well.
In our example this is already done: the descriptions are passed to the ReactAgent class when a new instance is created.

const functions = [
    [
        "google", // function name
        "params: query", // parameter names
        "Pesquisa semântica na API da Wikipedia por snippets, pageIds e sectionIds >> \n ex: Quando o Brasil foi colonizado? \n O Brasil foi colonizado em 1500, pageId, sections : []", // short explanation and example (this is forwarded to the LLM)
    ]
];
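
The agent passes these descriptions to the model as a JSON string. As an alternative, they could be rendered as one readable line per function, assuming each entry keeps the three-element shape shown above (name, parameters, description). `describeFunctions` is a hypothetical helper, not something the tutorial's code uses.

```javascript
// Hypothetical helper: render [name, params, description] entries as prompt text.
function describeFunctions(functions) {
  return functions
    .map(([name, params, description]) => `- ${name} (${params}): ${description}`)
    .join("\n");
}
```

Whether the model responds better to this layout than to raw JSON is something to test for your own prompts.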

[5] How the Wikipedia part works

The interaction with Wikipedia happens in two main steps:

Initial search (wikipedia function):
- Sends a request to the Wikipedia search API.
- Returns up to 4 results relevant to the query.
- Fetches the list of sections for each result page.
Detailed search (wikipedia_with_pageId function):
- Uses the page ID and section ID to fetch specific content.
- Returns the wikitext of the requested section (truncated to 25,000 characters), or the page's introductory extract when no section ID is given.

This process lets the agent first get an overview of the topics related to the query and then drill down into specific sections when needed.
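
The two steps map onto two request URLs. The sketch below only builds those URLs, using the same endpoint and parameters as Tools.js, so the requests can be inspected or tried in a browser. `searchUrl` and `sectionUrl` are hypothetical helper names, and no network call is made.

```javascript
const API_URL = "https://pt.wikipedia.org/w/api.php";

// Step 1: full-text search that returns up to `limit` matching pages.
function searchUrl(query, limit = 4) {
  const params = new URLSearchParams({
    action: "query",
    list: "search",
    srsearch: query,
    srwhat: "text",
    format: "json",
    srlimit: String(limit),
  });
  return `${API_URL}?${params}`;
}

// Step 2: fetch the wikitext of one section of a known page.
function sectionUrl(pageId, sectionIndex) {
  const params = new URLSearchParams({
    action: "parse",
    format: "json",
    pageid: String(pageId),
    prop: "wikitext",
    section: String(sectionIndex),
  });
  return `${API_URL}?${params}`;
}
```

`URLSearchParams` handles the encoding of accented characters and spaces in the query, which matters for Portuguese search terms.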