Imagine you are building a customer-support AI that needs to answer questions about your product. Sometimes it needs to pull information from your documentation; at other times it needs to search the web for the latest updates. Agentic RAG systems come in handy in this kind of complex AI application. Think of them as smart research assistants that not only know your internal documents but can also decide when to search the web. In this guide, we will walk through the process of building an agentic QA RAG system using the Haystack framework.
This article was published as a part of the Data Science Blogathon.
A traditional RAG system follows a linear process. When a query is received, the system first identifies the key elements of the request. It then searches the knowledge base, scanning for relevant information that can help craft an accurate response. Once the relevant information or data is retrieved, the system processes it to generate a meaningful, contextually relevant response.
You can easily understand this process through the diagram below.
Now, an agentic RAG system enhances this process by evaluating the query requirements and deciding whether the local knowledge base is enough or a web search is needed.
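That decision step can be sketched as a tiny, dependency-free routing function (the function name and the "no_answer" sentinel here are my own illustrative choices, not Haystack API):

```python
def route(reply: str, query: str) -> dict:
    """Decide what to do with a draft answer grounded in the knowledge base.

    If the model signalled that the knowledge base cannot answer (the
    'no_answer' sentinel), fall back to a web search; otherwise return
    the grounded answer as-is.
    """
    if "no_answer" in reply:
        return {"action": "web_search", "query": query}
    return {"action": "answer", "text": reply}
```

Later in this guide, a Haystack router component plays exactly this role inside the query pipeline.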
What can you build with Haystack?
RAG on your data, made easy with powerful retrieval and generation techniques.
Multimodal question-answering systems on mixed-type (image, text, audio, and table) knowledge bases.
Components are the core building blocks of Haystack. They perform tasks such as document storage, document retrieval, text generation, and embedding. Haystack ships with many components you can use directly after installation, and it also provides an API for building your own components by writing a Python class.
A collection of integrations from partner companies and the community.
Install the libraries and set up Ollama
$ pip install haystack-ai ollama-haystack

# On your system, download Ollama and install the LLMs
ollama pull llama3.2:3b
ollama pull nomic-embed-text

# And then start the Ollama server
ollama serve
Create documents and the document store
from haystack import Document, Pipeline
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.generators.ollama import OllamaGenerator
document_store = InMemoryDocumentStore()
documents = [
    Document(
        content="Naruto Uzumaki is a ninja from the Hidden Leaf Village and aspires to become Hokage."
    ),
    Document(
        content="Luffy is the captain of the Straw Hat Pirates and dreams of finding the One Piece."
    ),
    Document(
        content="Goku, a Saiyan warrior, has defended Earth from numerous powerful enemies like Frieza and Cell."
    ),
    Document(
        content="Light Yagami finds a mysterious Death Note, which allows him to eliminate people by writing their names."
    ),
    Document(
        content="Levi Ackerman is humanity’s strongest soldier, fighting against the Titans to protect mankind."
    ),
]
Define the pipeline
You can visualize the pipeline.
pipe = Pipeline()
pipe.add_component("retriever", InMemoryBM25Retriever(document_store=document_store))
pipe.add_component("prompt_builder", PromptBuilder(template=template))
pipe.add_component(
    "llm", OllamaGenerator(model="llama3.2:1b", url="http://localhost:11434")
)
pipe.connect("retriever", "prompt_builder.documents")
pipe.connect("prompt_builder", "llm")
image_param = {
    "format": "img",
    "type": "png",
    "theme": "forest",
    "bgColor": "f2f3f4",
}
pipe.show(params=image_param)

The pipeline provides:
Modular workflow management
The connection graph defines how the components interact with each other. From the pipeline above, you can visualize the connection graph.
This graph structure:
Defines the data flow between components
Manages input/output relationships
Creates flexible processing paths
Response:
template = """
Given only the following information, answer the question.
Ignore your own knowledge.
Context:
{% for document in documents %}
{{ document.content }}
{% endfor %}
Question: {{ query }}?
"""
This RAG example is simple for newcomers but conceptually valuable. Now that we have covered most of the Haystack framework's concepts, we can dive into our main project. If anything new comes up, I will explain it along the way.
A local llama3.2:3b or llama3.2:1b model
We will set up a conda env with Python 3.12.
Install the necessary packages
You can use a plain Python file or a Jupyter notebook for the project; it doesn't matter. I will use a plain Python file.
Create a main.py file and import the necessary libraries.
The solution is a vector database, such as Pinecone, Weaviate, Postgres Vector DB, or ChromaDB. I am using ChromaDB because it is free, open source, easy to use, and robust.
query = "How Goku eliminate people?"
response = pipe.run({"prompt_builder": {"query": query}, "retriever": {"query": query}})
print(response["llm"]["replies"])
PDF file paths
It will create a list of files from the data folder, which consists of our PDF files.
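A minimal sketch of that listing, assuming the PDFs live in a local data/ folder (the folder name is my assumption):

```python
from pathlib import Path

data_dir = Path("data")
# Collect every PDF in the data folder; sorted() keeps the run order stable.
file_paths = sorted(data_dir.glob("*.pdf"))
```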
$ conda create --name agenticlm python=3.12
$ conda activate agenticlm
We will use Haystack's built-in document preprocessors, such as the cleaner, splitter, and file converter, and then use the writer to write the data into the store.
Splitter: it splits the document in various ways, such as by words, sentences, paragraphs, or pages.
$ pip install haystack-ai ollama-haystack pypdf
$ pip install chroma-haystack duckduckgo-api-haystack
File converter: it uses PyPDF to convert PDFs into documents.
Writer: it stores documents in the document store; for duplicate documents, it overwrites the previous one.
Embedder: nomic-embed-text
Before running the indexing pipeline, open your terminal and type the commands below to pull the nomic-embed-text and llama3.2:3b models from the Ollama model store and start the Ollama server.
Now for the embedding component: we use OllamaTextEmbedder.
Create the indexing pipeline
Just like our earlier toy RAG example, we will start by initiating the Pipeline class.
Connect the components to the pipeline graph

Here, order matters, because how you connect the components tells the pipeline how data will flow through it. It is just like plumbing: it doesn't matter in which order you buy the pipes and fittings, but how you put them together decides whether you get water.

Draw the indexing pipeline
Implement the router
When the system gets a no_answer reply from the embedding-store context, it goes to the web search tool to collect relevant data from the internet. For web search, we could use the DuckDuckGo API or Tavily; here I used DuckDuckGo.
Create the prompt templates

First, we will create the prompt for the QA prompt builder. Then we will create the web-search prompt, which helps the system fall back to web search and try to answer the query.
We will use Haystack's prompt joiner to join the branches of the prompts together.
Implement the query pipeline

It is similar to the indexing pipeline.

Add the components to the query pipeline
Here, for LLM generation, we use the OllamaGenerator component with llama3.2:3b or 1b, or whichever LLM you like that supports tool calling, to generate answers.
The retriever sends its data to the prompt_builder's documents input.
When the router outputs no_answer, the query goes to the web search branch. I know this is a huge graph, but it shows you what is happening under the belly of the beast. Now it is time to enjoy the fruits of our hard work.
Now run your main script to index the NCERT physics book.
At the bottom of the file, we write our queries: one about resistivity, whose answer is in the book, and another question that is not in the book.
Output
Our agentic RAG system demonstrates the flexibility and robustness of the Haystack framework and the power of combining components and pipelines. This RAG can be made production-ready by deploying it to a web service platform and using better, paid LLMs such as OpenAI and Anthropic. You can build a UI using Streamlit or a React-based web SPA for a better user experience.
Agentic RAG systems deliver smarter, more flexible responses.
Haystack's pipeline architecture enables complex, modular workflows.

Q4. Can I use other LLM APIs? A. Yes, it is easy: just install the necessary integration package for the respective LLM API, such as Gemini, Anthropic, or Groq, and use it with your API key.
Now we will add the components to the pipeline one by one.
The converter converts the PDF and sends it to the cleaner for cleaning. The cleaner then sends the cleaned documents to the splitter for chunking. Those chunks are passed to the embedder for vectorization, and finally the embedder hands the embeddings to the writer for storage.
We will use Haystack's PromptBuilder component to build the prompt from the template.
The web search sends its data to the web-search prompt builder.
Conclusion
Key takeaways
The connection graph provides flexible and maintainable component interactions.
The above is the detailed content of how to build an agentic QA RAG system using the Haystack framework.