人工智能革命席卷应用开发领域,开启了人机交互的新纪元。企业利用 AI 提升用户体验的同时,基于大型语言模型 (LLM) 的解决方案也带来了维护内容完整性、准确性和道德标准的挑战。
随着应用扩展到受控环境之外,对负责任的 AI 审核的需求变得越来越明显。在这些环境中,确保对用户提供合理和准确的回应并非易事,但却至关重要。
例如,在客户服务互动中,错误信息或不当内容可能导致客户不满,甚至损害企业的声誉。但作为开发者,如何确保基于 AI 的应用能够向用户提供合理和准确的回应?这就是 AI 审核发挥作用的地方!
本文将深入探讨一种使用 GPT 模型审核基于 GPT 的应用的技术。
AI 质量审核还包括在使用大型语言模型 (LLM) 时确保生成无偏见和适当的响应。OpenAI 已经推出了一个专为这类审核需求设计的 API。如果您热衷于检测模型产生的有偏见或不当的回应,或解决用户的不当行为,那么您会在题为《ChatGPT 审核 API:输入/输出控制》的文章中找到有价值的见解。
然而,本文采用了一种不同的 AI 审核方法。我们关注的是保证模型响应的质量,即准确性和满足用户需求。据我所知,目前还没有专门为此目的设计的官方端点。
尽管如此,鉴于我们在各种应用中广泛使用 GPT 模型,为什么不将它们用作同一模型实例的质量检查器呢?
我们可以利用 GPT 模型来评估模型本身针对用户请求生成的输出。这种测试方法有助于防止含糊不清和错误的回应,并增强模型有效满足用户请求的能力。
本文介绍了如何在应用范围内使用 GPT 模型来审核基于 GPT 的应用的质量和正确性。
例如,如果您使用 GPT 模型为您的企业聊天机器人提供动力,那么您肯定非常希望确保您的聊天机器人不会提供任何超出您的产品目录项目或特性的信息。
在接下来的章节中,我们将通过使用 openai Python 包和著名的 Jupyter Notebook 对 OpenAI API 进行简单的调用,使最后一个例子生动起来。
主要目标是生成一个简单的基于 LLM 的应用,并使用基于 LLM 的质量检查器来审核其输出。在我们的示例中,我们需要创建我们的示例客户服务代理、质量检查代理(从现在起称为 QA 代理),更重要的是,定义两者之间的交互。
下图很好地展现了上述工作流程:
自制图片。审核工作流程图:1. 用户向基于 LLM 的应用(本例中为客户服务聊天机器人)发送请求。2. 聊天机器人生成答案,但首先将其发送给 QA 代理。3. QA 代理在检查答案是否合适后,将答案发送回用户。
让我们一步一步来!
让我们从为商店的客户服务构建一个对话代理开始。
如果您已经拥有一个可运行的 LLM 驱动的应用程序或要实现您首选的示例,请随时跳过第一部分!如果您仍然想知道您的企业是否可以从基于 LLM 的应用程序中受益,那么您应该关注一个有趣的播客讨论!
让我们假设我们正在为我们的商店构建一个客户服务代理。我们有兴趣在该客户代理背后使用 ChatGPT 等模型,以利用其自然语言能力来理解用户查询并以自然的方式回复他们。
为了定义我们的客户服务聊天机器人,我们需要两个关键要素:
最后,与任何其他基于 LLM 的应用程序一样,我们需要一种方法可以从我们的脚本中调用 OpenAI API。在本文中,我将使用以下实现,它仅依赖于 openai 包:
<code>import openai import os # 从环境中获取 OpenAI 密钥 openai_api_key = os.environ["OPENAI_API_KEY"] # 使用过去交互记忆的简单 OpenAI API 调用 def gpt_call(prompt, message_history, model="gpt-3.5-turbo"): message_history.append({'role': 'user', 'content': prompt}) response = openai.ChatCompletion.create( model=model, messages=message_history ) response_text = response.choices[0].message["content"] message_history.append({'role': 'assistant', 'content': response_text}) return response_text</code>
其背后的思想是为每个模型实例初始化一个单独的消息历史记录(包含系统消息),并使用即将进行的与模型的交互来不断更新它。
如果您正在寻找一种更优化的处理交互方式,我强烈建议您使用 langchain 框架,就像我们在《构建上下文感知聊天机器人:利用 LangChain 框架实现 ChatGPT》中所做的那样。
如果您不熟悉 OpenAI API,请考虑查看关于《OpenAI API 和 ChatGPT 入门》的网络研讨会。
现在我们已经确定了所需的构建块,让我们将它们组合在一起:
<code># 定义我们的示例产品目录 product_information = """ { "name": "UltraView QLED TV", "category": "Televisions and Home Theater Systems", "brand": "UltraView", "model_number": "UV-QLED65", "warranty": "3 years", "rating": 4.9, "features": [ "65-inch QLED display", "8K resolution", "Quantum HDR", "Dolby Vision", "Smart TV" ], "description": "Experience lifelike colors and incredible clarity with this high-end QLED TV.", "price": 2499.99 } { "name": "ViewTech Android TV", "category": "Televisions and Home Theater Systems", "brand": "ViewTech", "model_number": "VT-ATV55", "warranty": "2 years", "rating": 4.7, "features": [ "55-inch 4K display", "Android TV OS", "Voice remote", "Chromecast built-in" ], "description": "Access your favorite apps and content on this smart Android TV.", "price": 799.99 } { "name": "SlimView OLED TV", "category": "Televisions and Home Theater Systems", "brand": "SlimView", "model_number": "SL-OLED75", "warranty": "2 years", "rating": 4.8, "features": [ "75-inch OLED display", "4K resolution", "HDR10+", "Dolby Atmos", "Smart TV" ], "description": "Immerse yourself in a theater-like experience with this ultra-thin OLED TV.", "price": 3499.99 } { "name": "TechGen X Pro", "category": "Smartphones and Accessories", "brand": "TechGen", "model_number": "TG-XP20", "warranty": "1 year", "rating": 4.5, "features": [ "6.4-inch AMOLED display", "128GB storage", "48MP triple camera", "5G", "Fast charging" ], "description": "A feature-packed smartphone designed for power users and mobile enthusiasts.", "price": 899.99 } { "name": "GigaPhone 12X", "category": "Smartphones and Accessories", "brand": "GigaPhone", "model_number": "GP-12X", "warranty": "2 years", "rating": 4.6, "features": [ "6.7-inch IPS display", "256GB storage", "108MP quad camera", "5G", "Wireless charging" ], "description": "Unleash the power of 5G and high-resolution photography with the GigaPhone 12X.", "price": 1199.99 } { "name": "Zephyr Z1", "category": "Smartphones and Accessories", "brand": "Zephyr", "model_number": "ZP-Z1", "warranty": "1 year", "rating": 4.4, "features": [ "6.2-inch LCD display", "64GB storage", "16MP dual camera", "4G LTE", "Long battery life" ], "description": "A budget-friendly smartphone with reliable performance for everyday use.", "price": 349.99 } { "name": "PixelMaster Pro DSLR", "category": "Cameras and Camcorders", "brand": "PixelMaster", "model_number": "PM-DSLR500", "warranty": "2 years", "rating": 4.8, "features": [ "30.4MP full-frame sensor", "4K video", "Dual Pixel AF", "3.2-inch touchscreen" ], "description": "Unleash your creativity with this professional-grade DSLR camera.", "price": 1999.99 } { "name": "ActionX Waterproof Camera", "category": "Cameras and Camcorders", "brand": "ActionX", "model_number": "AX-WPC100", "warranty": "1 year", "rating": 4.6, "features": [ "20MP sensor", "4K video", "Waterproof up to 50m", "Wi-Fi connectivity" ], "description": "Capture your adventures with this rugged and versatile action camera.", "price": 299.99 } { "name": "SonicBlast Wireless Headphones", "category": "Audio and Headphones", "brand": "SonicBlast", "model_number": "SB-WH200", "warranty": "1 year", "rating": 4.7, "features": [ "Active noise cancellation", "50mm drivers", "30-hour battery life", "Comfortable earpads" ], "description": "Immerse yourself in superior sound quality with these wireless headphones.", "price": 149.99 } """ # 为我们的用例定义一个合适的系统消息 customer_agent_sysmessage = f""" 您是一位客户服务代理,负责回答客户关于产品目录中产品的疑问。 产品目录将用三个反引号分隔,即 ```。 以友好和人性化的语气回复,并提供产品目录中可用的详细信息。 产品目录: ```{product_information}``` """ # 初始化模型的记忆 customer_agent_history = [{'role': 'system', 'content': customer_agent_sysmessage}]</code>
我们可以看到,我们已经定义了一个示例目录 (product_information)(JSONL 格式),以及一个系统消息 (customer_agent_sysmessage),其中包含三个要求:
最后,我们还初始化了客户代理的消息历史记录 (customer_agent_history)。
值得注意的是,我们在编写系统消息和附加信息(例如,三个反引号)时使用了特征风格。这是提示工程的最佳实践之一!如果您对更多最佳实践感兴趣,那么《ChatGPT 提示工程入门指南》网络研讨会适合您!
在这一点上,我们可以开始使用我们的示例客户聊天机器人,如下所示:
<code>import openai import os # 从环境中获取 OpenAI 密钥 openai_api_key = os.environ["OPENAI_API_KEY"] # 使用过去交互记忆的简单 OpenAI API 调用 def gpt_call(prompt, message_history, model="gpt-3.5-turbo"): message_history.append({'role': 'user', 'content': prompt}) response = openai.ChatCompletion.create( model=model, messages=message_history ) response_text = response.choices[0].message["content"] message_history.append({'role': 'assistant', 'content': response_text}) return response_text</code>
看起来像一个自然的答案,对吧?让我们进行后续互动:
<code># 定义我们的示例产品目录 product_information = """ { "name": "UltraView QLED TV", "category": "Televisions and Home Theater Systems", "brand": "UltraView", "model_number": "UV-QLED65", "warranty": "3 years", "rating": 4.9, "features": [ "65-inch QLED display", "8K resolution", "Quantum HDR", "Dolby Vision", "Smart TV" ], "description": "Experience lifelike colors and incredible clarity with this high-end QLED TV.", "price": 2499.99 } { "name": "ViewTech Android TV", "category": "Televisions and Home Theater Systems", "brand": "ViewTech", "model_number": "VT-ATV55", "warranty": "2 years", "rating": 4.7, "features": [ "55-inch 4K display", "Android TV OS", "Voice remote", "Chromecast built-in" ], "description": "Access your favorite apps and content on this smart Android TV.", "price": 799.99 } { "name": "SlimView OLED TV", "category": "Televisions and Home Theater Systems", "brand": "SlimView", "model_number": "SL-OLED75", "warranty": "2 years", "rating": 4.8, "features": [ "75-inch OLED display", "4K resolution", "HDR10+", "Dolby Atmos", "Smart TV" ], "description": "Immerse yourself in a theater-like experience with this ultra-thin OLED TV.", "price": 3499.99 } { "name": "TechGen X Pro", "category": "Smartphones and Accessories", "brand": "TechGen", "model_number": "TG-XP20", "warranty": "1 year", "rating": 4.5, "features": [ "6.4-inch AMOLED display", "128GB storage", "48MP triple camera", "5G", "Fast charging" ], "description": "A feature-packed smartphone designed for power users and mobile enthusiasts.", "price": 899.99 } { "name": "GigaPhone 12X", "category": "Smartphones and Accessories", "brand": "GigaPhone", "model_number": "GP-12X", "warranty": "2 years", "rating": 4.6, "features": [ "6.7-inch IPS display", "256GB storage", "108MP quad camera", "5G", "Wireless charging" ], "description": "Unleash the power of 5G and high-resolution photography with the GigaPhone 12X.", "price": 1199.99 } { "name": "Zephyr Z1", "category": "Smartphones and Accessories", "brand": "Zephyr", "model_number": "ZP-Z1", "warranty": "1 year", "rating": 4.4, "features": [ "6.2-inch LCD display", "64GB storage", "16MP dual camera", "4G LTE", "Long battery life" ], "description": "A budget-friendly smartphone with reliable performance for everyday use.", "price": 349.99 } { "name": "PixelMaster Pro DSLR", "category": "Cameras and Camcorders", "brand": "PixelMaster", "model_number": "PM-DSLR500", "warranty": "2 years", "rating": 4.8, "features": [ "30.4MP full-frame sensor", "4K video", "Dual Pixel AF", "3.2-inch touchscreen" ], "description": "Unleash your creativity with this professional-grade DSLR camera.", "price": 1999.99 } { "name": "ActionX Waterproof Camera", "category": "Cameras and Camcorders", "brand": "ActionX", "model_number": "AX-WPC100", "warranty": "1 year", "rating": 4.6, "features": [ "20MP sensor", "4K video", "Waterproof up to 50m", "Wi-Fi connectivity" ], "description": "Capture your adventures with this rugged and versatile action camera.", "price": 299.99 } { "name": "SonicBlast Wireless Headphones", "category": "Audio and Headphones", "brand": "SonicBlast", "model_number": "SB-WH200", "warranty": "1 year", "rating": 4.7, "features": [ "Active noise cancellation", "50mm drivers", "30-hour battery life", "Comfortable earpads" ], "description": "Immerse yourself in superior sound quality with these wireless headphones.", "price": 149.99 } """ # 为我们的用例定义一个合适的系统消息 customer_agent_sysmessage = f""" 您是一位客户服务代理,负责回答客户关于产品目录中产品的疑问。 产品目录将用三个反引号分隔,即 ```。 以友好和人性化的语气回复,并提供产品目录中可用的详细信息。 产品目录: ```{product_information}``` """ # 初始化模型的记忆 customer_agent_history = [{'role': 'system', 'content': customer_agent_sysmessage}]</code>
很公平! 代理能够识别出我们的产品目录中没有 iPhone,并相应地回复用户。它甚至鼓励用户联系商店!
让我们再尝试一个请求:
<code>user_prompt = "你们商店里最好的电视是什么?" customer_agent_response = gpt_call(user_prompt, customer_agent_history) print(customer_agent_response) # 输出:在我们商店里,UltraView QLED 电视被认为是最好的电视。它的评分为 4.9,并提供 65 英寸 QLED 显示屏和 8K 分辨率。它还具有 Quantum HDR、Dolby Vision 和智能电视功能。凭借其逼真的色彩和令人难以置信的清晰度,这款高端 QLED 电视可提供身临其境的观看体验。UltraView QLED 电视的价格为 2499.99 美元,并提供 3 年保修。</code>
正确地, 它似乎在某些情况下代理可能会跳过目录中的信息。在这些情况下,QA 代理可以帮助过滤不需要的响应。
正如我们已经讨论过的,QA 代理的目的是根据用户查询和产品目录来检查客户服务代理的质量。因此,定义一个设置此确切高级行为的系统消息非常重要:
<code>user_prompt = "我想买最新的 iPhone。你能帮我吗?" customer_agent_response = gpt_call(user_prompt, customer_agent_history) print(customer_agent_response) # 输出:当然!我很乐意帮助您找到最新的 iPhone。但是,由于它似乎缺失于产品目录中,我目前无法提供有关最新 iPhone 型号的具体详细信息。我建议您查看我们的网站或直接联系我们的商店,以获取有关最新 iPhone 型号的最新信息。我们知识渊博的工作人员将能够帮助您选择最符合您的需求和偏好的 iPhone。</code>
对于客户代理而言,用户提示是不可预测的,因为它取决于用户的需求和写作风格。对于 QA 代理而言,我们负责将用户请求、客户代理响应和产品目录传递给模型。因此,我们的提示将始终具有相同的结构,但用户查询 (user_prompt) 和模型的响应 (customer_agent_response) 却有所不同:
<code>user_prompt = "你能帮我买一台三星电视吗?" customer_agent_response = gpt_call(user_prompt, customer_agent_history) print(customer_agent_response) # 输出:当然!我很乐意协助您购买三星电视。您能否提供您的一些具体要求或偏好?这样,我可以推荐最适合您需求的三星电视型号。</code>
一旦定义了系统消息和 QA 提示,我们就可以使用最新的客户服务响应来测试 QA 代理,如下所示:
<code>qa_sysmessage = f""" 您是一位质量助理,负责评估客户服务代理是否正确地回答了客户的问题。 您还必须验证客户服务代理是否仅提供我们商店产品目录中的信息,并温和地拒绝目录之外的任何其他产品。 客户消息、客户服务代理的回复和产品目录将用三个反引号分隔,即 ```。 请说明您的答案原因。 """</code>
为了评估 QA 代理的响应,让我们分解它将分析的交互:
质量代理能够发现客户代理的不当回复!
现在我们已经让两个代理独立工作,是时候定义它们之间的交互了。
我们可以用一个简单的图表来描述两个代理及其要求:
自制图片。每个代理的三个构建块图:系统消息(蓝色)、模型输入(绿色)和模型输出(黄色)。
接下来是什么? 现在,我们需要实现两个模型之间的交互!
以下是一个过滤不准确响应的建议:
首先,我们让客户代理根据用户查询生成响应。然后,如果 QA 代理认为客户代理的响应对于用户查询和产品目录来说足够好,我们只需将答案发送回用户。
相反,如果 QA 代理确定答案不符合用户的请求或包含关于目录的不真实信息,我们可以要求客户代理在将其发送给用户之前改进答案。
鉴于这个想法,我们可以改进我们原始图表的最后一部分,如下所示:
自制图片。扩展的审核工作流程图。我们可以使用 QA 代理的判断向基于 LLM 的应用程序提供反馈。
为了将 QA 代理用作过滤器,我们需要确保它在每次迭代中输出一致的响应。
实现这一点的一种方法是稍微更改 QA 代理系统消息,并要求它仅在客户代理响应足够好时输出 True,否则输出 False:
<code>import openai import os # 从环境中获取 OpenAI 密钥 openai_api_key = os.environ["OPENAI_API_KEY"] # 使用过去交互记忆的简单 OpenAI API 调用 def gpt_call(prompt, message_history, model="gpt-3.5-turbo"): message_history.append({'role': 'user', 'content': prompt}) response = openai.ChatCompletion.create( model=model, messages=message_history ) response_text = response.choices[0].message["content"] message_history.append({'role': 'assistant', 'content': response_text}) return response_text</code>
因此,当再次评估最新的客户代理响应时,我们将只获得布尔输出:
<code># 定义我们的示例产品目录 product_information = """ { "name": "UltraView QLED TV", "category": "Televisions and Home Theater Systems", "brand": "UltraView", "model_number": "UV-QLED65", "warranty": "3 years", "rating": 4.9, "features": [ "65-inch QLED display", "8K resolution", "Quantum HDR", "Dolby Vision", "Smart TV" ], "description": "Experience lifelike colors and incredible clarity with this high-end QLED TV.", "price": 2499.99 } { "name": "ViewTech Android TV", "category": "Televisions and Home Theater Systems", "brand": "ViewTech", "model_number": "VT-ATV55", "warranty": "2 years", "rating": 4.7, "features": [ "55-inch 4K display", "Android TV OS", "Voice remote", "Chromecast built-in" ], "description": "Access your favorite apps and content on this smart Android TV.", "price": 799.99 } { "name": "SlimView OLED TV", "category": "Televisions and Home Theater Systems", "brand": "SlimView", "model_number": "SL-OLED75", "warranty": "2 years", "rating": 4.8, "features": [ "75-inch OLED display", "4K resolution", "HDR10+", "Dolby Atmos", "Smart TV" ], "description": "Immerse yourself in a theater-like experience with this ultra-thin OLED TV.", "price": 3499.99 } { "name": "TechGen X Pro", "category": "Smartphones and Accessories", "brand": "TechGen", "model_number": "TG-XP20", "warranty": "1 year", "rating": 4.5, "features": [ "6.4-inch AMOLED display", "128GB storage", "48MP triple camera", "5G", "Fast charging" ], "description": "A feature-packed smartphone designed for power users and mobile enthusiasts.", "price": 899.99 } { "name": "GigaPhone 12X", "category": "Smartphones and Accessories", "brand": "GigaPhone", "model_number": "GP-12X", "warranty": "2 years", "rating": 4.6, "features": [ "6.7-inch IPS display", "256GB storage", "108MP quad camera", "5G", "Wireless charging" ], "description": "Unleash the power of 5G and high-resolution photography with the GigaPhone 12X.", "price": 1199.99 } { "name": "Zephyr Z1", "category": "Smartphones and Accessories", "brand": "Zephyr", "model_number": "ZP-Z1", "warranty": "1 year", "rating": 4.4, "features": [ "6.2-inch LCD display", "64GB storage", "16MP dual camera", "4G LTE", "Long battery life" ], "description": "A budget-friendly smartphone with reliable performance for everyday use.", "price": 349.99 } { "name": "PixelMaster Pro DSLR", "category": "Cameras and Camcorders", "brand": "PixelMaster", "model_number": "PM-DSLR500", "warranty": "2 years", "rating": 4.8, "features": [ "30.4MP full-frame sensor", "4K video", "Dual Pixel AF", "3.2-inch touchscreen" ], "description": "Unleash your creativity with this professional-grade DSLR camera.", "price": 1999.99 } { "name": "ActionX Waterproof Camera", "category": "Cameras and Camcorders", "brand": "ActionX", "model_number": "AX-WPC100", "warranty": "1 year", "rating": 4.6, "features": [ "20MP sensor", "4K video", "Waterproof up to 50m", "Wi-Fi connectivity" ], "description": "Capture your adventures with this rugged and versatile action camera.", "price": 299.99 } { "name": "SonicBlast Wireless Headphones", "category": "Audio and Headphones", "brand": "SonicBlast", "model_number": "SB-WH200", "warranty": "1 year", "rating": 4.7, "features": [ "Active noise cancellation", "50mm drivers", "30-hour battery life", "Comfortable earpads" ], "description": "Immerse yourself in superior sound quality with these wireless headphones.", "price": 149.99 } """ # 为我们的用例定义一个合适的系统消息 customer_agent_sysmessage = f""" 您是一位客户服务代理,负责回答客户关于产品目录中产品的疑问。 产品目录将用三个反引号分隔,即 ```。 以友好和人性化的语气回复,并提供产品目录中可用的详细信息。 产品目录: ```{product_information}``` """ # 初始化模型的记忆 customer_agent_history = [{'role': 'system', 'content': customer_agent_sysmessage}]</code>
我们可以进一步使用此布尔值来将响应发送给用户(如果 QA 代理评估为 True)或让模型获得第二次机会来生成新响应(如果 QA 代理评估为 False)。
让我们把所有东西放在一起!
鉴于我们已经初始化了两个内存(分别带有它们的系统消息和附加信息),每个客户请求都可以按如下方式处理:
<code>user_prompt = "你们商店里最好的电视是什么?" customer_agent_response = gpt_call(user_prompt, customer_agent_history) print(customer_agent_response) # 输出:在我们商店里,UltraView QLED 电视被认为是最好的电视。它的评分为 4.9,并提供 65 英寸 QLED 显示屏和 8K 分辨率。它还具有 Quantum HDR、Dolby Vision 和智能电视功能。凭借其逼真的色彩和令人难以置信的清晰度,这款高端 QLED 电视可提供身临其境的观看体验。UltraView QLED 电视的价格为 2499.99 美元,并提供 3 年保修。</code>
如上所述,我们已经根据其正确性过滤了响应。我留给您决定如何处理不合适的响应的任务。我们提出了向客户代理发送反馈并要求其重试的想法,但又如何要求 QA 代理改用更好的响应呢?有很多可能性!
在本文中,我们探讨了使用 GPT 模型作为其他同类模型实例的审核器的潜力。我们已经证明,导致我们在应用程序中使用 LLM 模型的相同强大功能可以帮助我们的应用程序提高用户交互的准确性和完整性。
与误解相反,实施审核级别并不一定意味着增加应用程序的复杂性,而且,正如我们所展示的那样,有时它可以通过几行精心设计的代码来实现,从而显著升级应用程序的功能。
在当今的 AI 驱动世界中,负责任的 LLM 审核势在必行。这不仅仅是一种选择,而是一种道德义务。通过集成 AI 审核,我们确保我们的应用程序不仅强大,而且可靠且符合道德规范。让我们以负责任的态度推进开发,以便我们能够在维护准确性的同时继续从 AI 中获益。
感谢您的阅读!如果您喜欢 AI 审核这个主题,我鼓励您继续阅读《促进负责任的 AI:ChatGPT 中的内容审核》作为后续资料!
以上是通过GPT模型调节CHATGPT响应的综合指南的详细内容。更多信息请关注PHP中文网其他相关文章!