開始使用OpenAI結構化輸出-人工智慧-PHP中文網

Getting Started With OpenAI Structured Outputs

在2024年8月，Openai宣布了其API的強大新功能 - 結構化輸出。顧名思義，使用此功能，您可以確保LLM僅以指定的格式生成響應。此功能將使需要精確數據格式的應用程序變得更加容易。

在本教程中，您將學習如何從OpenAI結構化輸出開始，了解其新的語法並探索其關鍵應用程序。

在AI應用程序中結構化輸出的重要性

確定性響應，換句話說，以一致格式的響應對於許多任務，例如數據輸入，信息檢索，問答，多步工作流等等至關重要。您可能已經體驗了LLMS如何以截然不同的格式生成輸出，即使提示是相同的。

例如，考慮由GPT-4O驅動的此簡單的分類函數：

# List of hotel reviews
reviews = [
   "The room was clean and the staff was friendly.",
   "The location was terrible and the service was slow.",
   "The food was amazing but the room was too small.",
]
# Classify sentiment for each review and print the results
for review in reviews:
   sentiment = classify_sentiment(review)
   print(f"Review: {review}\nSentiment: {sentiment}\n")

登入後複製

>輸出：

Review: The room was clean and the staff was friendly.
Sentiment: Positive
Review: The location was terrible and the service was slow.
Sentiment: Negative
Review: The food was amazing but the room was too small.
Sentiment: The sentiment of the review is neutral.

登入後複製

即使前兩個響應是相同的單字格式，最後一個是整個句子。如果其他一些下游應用程序取決於上述代碼的輸出，則它將崩潰，因為它會期望單詞響應。

>我們可以通過一些及時的工程來解決此問題，但這是一個耗時的迭代過程。即使有了完美的提示，我們也不能100％確定響應將在以後的請求中符合我們的格式。當然，除非我們使用結構化的輸出：

>輸出：

def classify_sentiment_with_structured_outputs(review):
   """Sentiment classifier with Structured Outputs"""
   ...
# Classify sentiment for each review with Structured Outputs
for review in reviews:
   sentiment = classify_sentiment_with_structured_outputs(review)
   print(f"Review: {review}\nSentiment: {sentiment}\n")

登入後複製

使用新函數，classify_sentiment_with_structured_outputs，響應都以相同的格式。

Review: The room was clean and the staff was friendly.
Sentiment: {"sentiment":"positive"}
Review: The location was terrible and the service was slow.
Sentiment: {"sentiment":"negative"}
Review: The food was amazing but the room was too small.
Sentiment: {"sentiment":"neutral"}

登入後複製

>以剛性格式強迫語言模型的能力非常重要，可以為您節省無數小時的及時工程或依賴其他開源工具。 >

>從OpenAI結構化輸出開始

在本節中，我們將使用情感分析儀函數的示例分解結構化輸出。

設置您的環境

>先決條件

在開始之前，請確保您有以下內容：>

python 3.7或以後安裝在您的系統上。

>

> OpenAI API鍵。您可以通過在OpenAI網站上註冊來獲得此功能。

2。設置API密鑰：您可以將API密鑰設置為環境變量或直接在代碼中。要將其設置為環境變量，請運行：

>

3。驗證安裝：創建一個簡單的python腳本以驗證安裝：>

# List of hotel reviews
reviews = [
   "The room was clean and the staff was friendly.",
   "The location was terrible and the service was slow.",
   "The food was amazing but the room was too small.",
]
# Classify sentiment for each review and print the results
for review in reviews:
   sentiment = classify_sentiment(review)
   print(f"Review: {review}\nSentiment: {sentiment}\n")

登入後複製

>運行腳本以確保正確設置所有內容。您應該在終端中看到模型的響應。

> 除了OpenAi軟件包外，您還需要Pydantic庫來定義和驗證JSON模式的結構化輸出。使用PIP安裝它：

Review: The room was clean and the staff was friendly.
Sentiment: Positive
Review: The location was terrible and the service was slow.
Sentiment: Negative
Review: The food was amazing but the room was too small.
Sentiment: The sentiment of the review is neutral.

登入後複製

>通過這些步驟，您的環境現在可以使用OpenAI的結構化輸出功能。 >

使用pydantic

定義輸出模式

>要使用結構化輸出，您需要使用Pydantic模型來定義預期的輸出結構。 Pydantic是Python的數據驗證和設置管理庫，它允許您使用Python型註釋來定義數據模型。然後可以使用這些模型來強制執行OpenAI模型生成的輸出的結構。

這是一個示例pydantic模型，用於指定我們的評論情感分類器的格式：

在此示例中

>：

def classify_sentiment_with_structured_outputs(review):
   """Sentiment classifier with Structured Outputs"""
   ...
# Classify sentiment for each review with Structured Outputs
for review in reviews:
   sentiment = classify_sentiment_with_structured_outputs(review)
   print(f"Review: {review}\nSentiment: {sentiment}\n")

登入後複製

sentimentResponse是一個pydantic模型，它定義了輸出的預期結構。

模型具有單一字段情緒，只能採用三個字面價值之一：“正面”，“負”或“中性”。

當我們作為OpenAI API請求的一部分傳遞此模型時，輸出將只是我們提供的單詞之一。

在OpenAI請求中強制執行我們的Pydantic模式，我們要做的就是將其傳遞給聊天完成API的響應_format參數。粗略地，這是它的樣子：

如果您注意到，而不是使用client.chat.completions.create，我們使用的是client.beta.chat.completions.parse方法。 .parse（）是專門為結構化輸出編寫的聊天完成API中的一種新方法。

現在，讓我們通過重寫帶有結構化輸出的評論情感分類器來將所有內容整合在一起。首先，我們進行必要的導入，定義pydantic模型，系統提示和提示模板：

>

然後，我們編寫了一個使用.parse（）助手方法的新功能：>

函數中的重要行是response_format = entimentResponse，這實際上是啟用結構化輸出的方法。

Review: The room was clean and the staff was friendly.
Sentiment: {"sentiment":"positive"}
Review: The location was terrible and the service was slow.
Sentiment: {"sentiment":"negative"}
Review: The food was amazing but the room was too small.
Sentiment: {"sentiment":"neutral"}

登入後複製

讓我們對其中一項評論進行測試：

在這裡，結果是一個消息對象：>

除了檢索響應的.content屬性外，它具有.parsed屬性，該屬性將解析的信息返回為類：

$ pip install -U openai

登入後複製

如您所見，我們有一個sentermentresponse類的實例。這意味著我們可以使用.sentiment屬性以字符串而不是字典訪問情感：>

# List of hotel reviews
reviews = [
   "The room was clean and the staff was friendly.",
   "The location was terrible and the service was slow.",
   "The food was amazing but the room was too small.",
]
# Classify sentiment for each review and print the results
for review in reviews:
   sentiment = classify_sentiment(review)
   print(f"Review: {review}\nSentiment: {sentiment}\n")

登入後複製

嵌套pydantic模型，用於定義復雜模式

在某些情況下，您可能需要定義涉及嵌套數據的更複雜的輸出結構。 Pydantic允許您相互嵌套模型，使您能夠創建可以處理各種用例的複雜模式。在處理層次數據時，或者需要為複雜輸出執行特定結構時，這特別有用。

>讓我們考慮一個示例，我們需要在其中提取詳細的用戶信息，包括其姓名，聯繫方式和地址列表。每個地址都應包括街道，城市，州和郵政編碼的字段。這需要一個以上的pydantic模型來構建正確的模式。

>步驟1：定義Pydantic模型

首先，我們為地址和用戶信息定義了pydantic模型：>

在此示例中

>：

Review: The room was clean and the staff was friendly.
Sentiment: Positive
Review: The location was terrible and the service was slow.
Sentiment: Negative
Review: The food was amazing but the room was too small.
Sentiment: The sentiment of the review is neutral.

登入後複製

地址是一個定義地址結構的pydantic模型。 >

>步驟2：在API調用中使用嵌套的pydantic模型

接下來，我們使用這些嵌套的pydantic模型來在OpenAI API調用中強制執行輸出結構：

示例文本完全不可讀，並且在關鍵信息之間缺少空間。讓我們看看該模型是否成功。我們將使用JSON庫來使響應很好：

def classify_sentiment_with_structured_outputs(review):
   """Sentiment classifier with Structured Outputs"""
   ...
# Classify sentiment for each review with Structured Outputs
for review in reviews:
   sentiment = classify_sentiment_with_structured_outputs(review)
   print(f"Review: {review}\nSentiment: {sentiment}\n")

登入後複製

如您所見，該模型根據我們提供的架構正確捕獲了單個用戶的信息以及他們的兩個單獨的地址。

簡而言之，通過嵌套pydantic模型，您可以定義處理層次數據並為複雜輸出執行特定結構的複雜模式。

函數用結構化輸出調用

Review: The room was clean and the staff was friendly.
Sentiment: {"sentiment":"positive"}
Review: The location was terrible and the service was slow.
Sentiment: {"sentiment":"negative"}
Review: The food was amazing but the room was too small.
Sentiment: {"sentiment":"neutral"}

登入後複製

>新語言模型的廣泛特徵之一是函數調用（也稱為工具調用）。此功能使您可以將語言模型連接到用戶定義的功能，從而有效地（模型）訪問外部世界。

一些常見的示例是：

>檢索實時數據（例如，天氣，股價，運動得分）

執行計算或數據分析

查詢數據庫或API

生成圖像或其他媒體

在語言之間翻譯文本
控制智能家居設備或物聯網系統
>執行自定義業務邏輯或工作流程

>重要的是，使用結構化輸出，使用OpenAI模型使用函數調用變得更加容易。過去，您將傳遞給OpenAI模型的功能將需要編寫複雜的JSON模式，並用類型提示概述每個功能參數。這是一個示例：

# List of hotel reviews
reviews = [
   "The room was clean and the staff was friendly.",
   "The location was terrible and the service was slow.",
   "The food was amazing but the room was too small.",
]
# Classify sentiment for each review and print the results
for review in reviews:
   sentiment = classify_sentiment(review)
   print(f"Review: {review}\nSentiment: {sentiment}\n")

登入後複製

>即使get_current_weather函數具有兩個參數，其JSON模式也變得巨大且容易出錯。

>通過再次使用Pydantic模型在結構化輸出中解決：>

首先，您編寫功能本身及其邏輯。然後，您可以使用指定預期輸入參數的pydantic模型再次定義它。

Review: The room was clean and the staff was friendly.
Sentiment: Positive
Review: The location was terrible and the service was slow.
Sentiment: Negative
Review: The food was amazing but the room was too small.
Sentiment: The sentiment of the review is neutral.

登入後複製

然後，要將pydantic模型轉換為兼容的JSON模式，您可以致電Pydantic_function_tool：

這是如何將此工具作為請求的一部分使用：的一部分

def classify_sentiment_with_structured_outputs(review):
   """Sentiment classifier with Structured Outputs"""
   ...
# Classify sentiment for each review with Structured Outputs
for review in reviews:
   sentiment = classify_sentiment_with_structured_outputs(review)
   print(f"Review: {review}\nSentiment: {sentiment}\n")

登入後複製

>我們以兼容的JSON格式將Pydantic模型傳遞給聊天完成API的工具參數。然後，根據我們的查詢，該模型決定是否調用該工具。

>由於上面的查詢是“東京的天氣是什麼？”，我們在返回消息對象的tool_calls中看到了一個電話。

Review: The room was clean and the staff was friendly.
Sentiment: {"sentiment":"positive"}
Review: The location was terrible and the service was slow.
Sentiment: {"sentiment":"negative"}
Review: The food was amazing but the room was too small.
Sentiment: {"sentiment":"neutral"}

登入後複製

記住，該模型未調用get_weather函數，而是根據我們提供的Pydantic模式生成參數：

>由我們通過提供的參數調用該函數：>

如果您希望該模型生成該功能的參數並同時調用它，則您正在尋找AI代理。