ChatGPT has shaken the industry, and people in every field are discussing large language models (LLMs) and artificial general intelligence. After more than fifty years of development, AI is now at a critical point where the industry is reorganizing into horizontal layers. This shift stems from a paradigm change in NLP, from "pre-train, fine-tune" to "pre-train, prompt, predict". Under the new paradigm, downstream tasks are adapted to the pre-trained model, so a single large model can serve many tasks. This lays the foundation for a horizontal division of labor in the AI industry. Large language models have become infrastructure, and Prompt Engineering companies have emerged one after another, focusing on connecting users with models. The division of labor has begun to take shape: underlying infrastructure (cloud service providers), large models, Prompt Engineering platforms, and end-user applications. As the industry changes, developers can make full use of LLMs and Prompt Engineering to build innovative applications.
Suppose we need to develop an application based on an LLM today. What is the biggest engineering problem we face?
Take LangChain as an example. Put simply, LangChain is a wrapper around the underlying capabilities of LLMs: a kind of Prompt Engineering tooling, or what might be called Prompt-Ops.
The following example compares ChatGPT with a demo built using LangChain. The input is "Who is Jay Chou's wife? What is her current age multiplied by 0.23?". ChatGPT (GPT-3.5) on its own gives a wrong answer because it has no search capability. The demo on the right, which combines LangChain with OpenAI's GPT-3.5 API, outputs the correct result: it searches for the required information step by step, and the framework handles the intermediate steps automatically. I did nothing other than enter the question.
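The step-by-step behaviour described above can be illustrated with a minimal sketch of an agent loop. This is not LangChain's implementation: `scripted_llm` and `fake_search` are hypothetical stand-ins for a real model and a real search tool, and the age they return is a placeholder rather than real data. The sketch only shows how a framework can alternate between model output, tool calls, and observations until a final answer appears.

```python
import re
from typing import Callable, Dict


def fake_search(query: str) -> str:
    """Hypothetical search tool; returns a canned placeholder, not real data."""
    return "PLACEHOLDER_PERSON is 40 years old."


def calculator(expression: str) -> str:
    """Evaluate a simple 'a * b' multiplication and round to two decimals."""
    left, right = expression.split("*")
    return str(round(float(left) * float(right), 2))


TOOLS: Dict[str, Callable[[str], str]] = {
    "Search": fake_search,
    "Calculator": calculator,
}


def scripted_llm(transcript: str) -> str:
    """Hypothetical stand-in for the LLM: picks the next step from the transcript."""
    seen = transcript.count("Observation:")
    if seen == 0:
        return "Action: Search\nAction Input: age of PLACEHOLDER_PERSON"
    if seen == 1:
        age = re.search(r"is (\d+) years old", transcript).group(1)
        return f"Action: Calculator\nAction Input: {age} * 0.23"
    last_observation = transcript.rsplit("Observation: ", 1)[1].strip()
    return f"Final Answer: {last_observation}"


def run_agent(question: str, llm: Callable[[str], str],
              tools: Dict[str, Callable[[str], str]], max_steps: int = 5) -> str:
    """Alternate between model replies and tool calls until a final answer."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript += reply + "\n"
        if reply.startswith("Final Answer:"):
            return reply[len("Final Answer:"):].strip()
        match = re.search(r"Action: (.+)\nAction Input: (.+)", reply)
        if match is None:
            raise ValueError(f"could not parse model reply: {reply!r}")
        tool_name, tool_input = match.group(1).strip(), match.group(2).strip()
        observation = tools[tool_name](tool_input)
        transcript += f"Observation: {observation}\n"
    raise RuntimeError("agent did not reach a final answer")


answer = run_agent("How old is PLACEHOLDER_PERSON, multiplied by 0.23?",
                   scripted_llm, TOOLS)
```

In a real agent the model's reply is free text, so the parsing step is where much of the fragility lives; the loop itself stays this simple.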
This is a striking example. During the process, the agent discovered that a function was not defined, reported the error, and then corrected itself.
## 2.2.3 Query NBA data using GPT-3, Statmuse, and LangChain
Fuzzy API composition: querying NBA stats with GPT-3, Statmuse, and LangChain.
Use LangChain combined with a sports data search site to ask complex data questions and get accurate responses. For example: "What are the Boston Celtics' average defensive points per game this 2022-2023 NBA season? How does the percentage change compare to their average last season?"
## 2.2.4 Connect to the Python REPL and open the browser to play music
A rather sci-fi scene: I used LangChain to connect to the Python REPL tool and entered "play me a song". It imported the webbrowser package, called code to open the browser, and played the song "Never Gonna Give You Up" for me.

    # Import paths follow early-2023 LangChain releases and may differ in newer versions.
    from langchain.agents import Tool, initialize_agent
    from langchain.llms import OpenAI
    from langchain.utilities import BashProcess, PythonREPL

    # Assumed setup: the original snippet uses `llm` without showing its definition.
    llm = OpenAI(temperature=0)

    def pythonTool():
        bash = BashProcess()
        # Tool that lets the agent execute Python code.
        python_repl_util = Tool(
            "Python REPL",
            PythonREPL().run,
            """A Python shell. Use this to execute python commands.
            Input should be a valid python command.
            If you expect output it should be printed out.""",
        )
        # Tool that lets the agent execute shell commands.
        command_tool = Tool(
            name="bash",
            description="""A Bash shell. Use this to execute Bash commands.
            Input should be a valid Bash command.
            If you expect output it should be printed out.""",
            func=bash.run,
        )
        # math_tool = _get_llm_math(llm)
        # search_tool = _get_serpapi()
        tools = [python_repl_util, command_tool]
        agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)
        agent.run("Play me a song")
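Connecting to private data is very important for third-party companies building LLM applications. Here are a few examples.
Legal documents and policy clauses are generally complex and tedious. This demo combines San Francisco government information with LangChain and GPT, so that questions about the details receive accurate answers.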
    > Entering new AgentExecutor chain...
    I need to find out the size limit for a storage shed without a permit and then search for sheds that are smaller than that size.
    Action: SF Building Codes QA System
    Action Input: "What is the size limit for a storage shed without a permit in San Francisco?"
    Observation: The size limit for a storage shed without a permit in San Francisco is 100 square feet (9.29 m2).
    Thought: Now that I know the size limit, I can search for sheds that are smaller than 100 square feet.
    Action: Google
    Action Input: "Storage sheds smaller than 100 square feet"
    Observation: Results 1 - 24 of 279 ...
    Thought: I need to filter the Google search results to only show sheds that are smaller than 100 square feet and suitable for backyard storage.
    Action: Google
    Action Input: "Backyard storage sheds smaller than 100 square feet"
    Thought: I have found several options for backyard storage sheds that are smaller than 100 square feet and do not require a permit.
    Final Answer: The size limit for a storage shed without a permit in San Francisco is 100 square feet. There are many options for backyard storage sheds that are smaller than 100 square feet and do not require a permit, including small sheds under 36 square feet and medium sheds between 37 and 100 square feet.
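Letting LLM applications interact with private data is very important. I have seen countless people ask questions that ChatGPT cannot answer: whether it knows a particular person, details of their own company's business, and all kinds of things that may not be in the pre-training dataset. These cases can already be addressed with LangChain and LlamaIndex. Imagine combining private data with an LLM: it changes how the data is accessed. Getting the information you need through question answering feels natural, and it is a more efficient way to interact with data than today's search or tag-based classification.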
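Vector databases now look like a key component in building LLM apps. First, an LLM's pre-training and fine-tuning cannot include the private data we care about, so connecting the LLM to private data becomes a key requirement. Second, the LLM's "interface", natural language, is usually not as precise as a key-value mapping. At this stage we want the LLM to understand our knowledge base rather than simply search it for identical strings: we want to ask about details of the knowledge base and get an answer that reflects some understanding (along with its sources). Vector matching is therefore a very suitable and important search approach. Another key point is that each LLM call is billed by token (that is, by the amount of text), and the current API has a context limit of 4096 tokens, so with a large dataset we cannot pass all the data to the LLM at once. This is why the flowchart in the first figure has the structure it does. Locally, we convert our private data into vectors in advance and store them in Qdrant. When a user asks a question, the question is converted into a vector and used to search Qdrant (similarity matching) for the top K results. These results (note that at this point they are natural-language text again) are then passed to the LLM to be summarized into an answer.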
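The context limit mentioned above means the retrieved results have to be packed into a fixed budget before they are sent to the model. A minimal sketch of that packing step follows. It assumes a crude whitespace-based token estimate (real APIs count tokens with their own tokenizer), and `pack_context` and `reserve_for_answer` are hypothetical names used only for illustration.

```python
from typing import List


def approx_tokens(text: str) -> int:
    """Rough token estimate by whitespace splitting (an assumption, not a real tokenizer)."""
    return len(text.split())


def pack_context(passages: List[str], question: str,
                 budget: int = 4096, reserve_for_answer: int = 512) -> List[str]:
    """Keep the highest-ranked passages that fit within the token budget.

    `passages` is expected to be ordered from most to least relevant.
    """
    remaining = budget - reserve_for_answer - approx_tokens(question)
    chosen: List[str] = []
    for passage in passages:
        cost = approx_tokens(passage)
        if cost > remaining:
            break  # stop at the first passage that no longer fits
        chosen.append(passage)
        remaining -= cost
    return chosen


selected = pack_context(["a b c", "d e", "f g h i"], "q1 q2",
                        budget=10, reserve_for_answer=2)
```

Reserving part of the budget for the answer matters because the limit covers the prompt and the completion together.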
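The flowchart from the LangChain community blog is used here as an example.
Private data is split into chunks smaller than the LLM's context window; each chunk is embedded as a vector and stored in the vector database.
The question is embedded as a vector and used for a similarity search in the vector database; the top k most relevant results are then concatenated into a prompt and sent to the LLM to obtain the answer.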
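The two stages above (indexing, then querying) can be sketched end to end in plain Python. This is a toy under stated assumptions: `embed` is a bag-of-words counter standing in for a real embedding model such as OpenAI's ada, `InMemoryVectorStore` is a hypothetical stand-in for a vector database such as Qdrant, and the sample sentences are made up for illustration.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def split_into_chunks(text: str, max_words: int) -> List[str]:
    """Split text into chunks of at most `max_words` words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy embedding: lower-cased word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database."""

    def __init__(self) -> None:
        self.items: List[Tuple[str, Counter]] = []

    def add(self, chunk: str) -> None:
        self.items.append((chunk, embed(chunk)))

    def search(self, query: str, k: int) -> List[str]:
        query_vector = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(query_vector, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, context: List[str]) -> str:
    """Concatenate the retrieved chunks and the question into one prompt."""
    joined = "\n".join(f"- {chunk}" for chunk in context)
    return f"Answer using only the context below.\n\nContext:\n{joined}\n\nQuestion: {question}\nAnswer:"


# Stage 1: index the (made-up) private data.
store = InMemoryVectorStore()
for chunk in ["Sheds under 100 square feet need no permit.",
              "The cafeteria opens at nine.",
              "Parking passes renew yearly."]:
    store.add(chunk)

# Stage 2: retrieve the top k chunks and build the prompt that would go to the LLM.
question = "Does a small shed need a permit?"
prompt = build_prompt(question, store.search(question, k=1))
```

A real embedding model captures meaning rather than exact word overlap, which is what lets this approach go beyond string matching; the surrounding plumbing stays much the same.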
Let's talk about the recent news of OpenAI private deployment. When facing large volumes of private data, one can use an embedding model (OpenAI's ada) to compute the vector of the input question, a vector database such as Qdrant to manage the private data vectors and run vector searches, and LangChain as the intermediate link. This solves the problem, but the cost of token consumption cannot be ignored. Private deployment with fine-tuning may solve most of the problems mentioned above. It may turn out that large, wealthy companies use dedicated model instances and fine-tuning, while independent developers and small companies use frameworks such as LangChain. In the future, as OpenAI's LLM service capabilities expand, prompts may no longer be needed, and LangChain's functionality may even be absorbed into the service. Developing and accessing LLM applications may then require only an API call.
2.4 LLM application technology stack in 2023
The latest technology stack in 2023 for quickly building an AI demo:
Some people object to Prompt-Ops tools such as LangChain: stream.thesephist.com. The main problem is that tools and frameworks of this kind use natural language as the connection between code and the LLM, and using a non-deterministic language as control flow is a bit crazy. Moreover, evaluating the quality of model output is itself very troublesome right now, and there is no good solution. Many teams maintain a huge spreadsheet and rely on humans to evaluate it. (There are also attempts to use an LLM to evaluate an LLM, but these are still at an early stage.) A lot of work may therefore remain before such applications go into production and face real users rather than serve as Twitter demos.
Let's look in detail at the challenges faced during testing. Suppose your product has a set of prompts that work well during development. Once it is handed over for testing, problems may surface only after hundreds or thousands of test inputs. Since quality cannot be guaranteed, launching to consumer (C-end) users is very challenging. And if you do not use a fine-tuning service or a dedicated model instance, then whenever OpenAI updates the model, every prompt in your production environment may need to be retested. Prompts also need version management, just like code. Whether or not a prompt has changed, each release needs regression testing before it goes live. Without a good automated evaluation solution, a large number of cases must be tested manually, which consumes a lot of manpower.
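One way to picture versioned prompts with automated regression checks is the small harness below. It is only a sketch under stated assumptions: `stub_model` is a hypothetical stand-in for a real LLM call, the prompt templates and test cases are invented, and a substring check is a deliberately crude stand-in for real output evaluation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Case:
    user_input: str    # what the user types
    must_contain: str  # substring the output is expected to include


# Prompt templates kept under version keys, the way code is kept under version control.
PROMPTS: Dict[str, str] = {
    "v1": "Answer briefly: {user_input}",
    "v2": "You are a support assistant. Answer briefly: {user_input}",
}


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call, so the harness runs offline."""
    if "refund" in prompt.lower():
        return "Refunds take 5 days."
    return "I am not sure."


def run_regression(template: str, cases: List[Case],
                   model: Callable[[str], str]) -> Dict[str, object]:
    """Run every case through the model and collect the inputs that fail."""
    failed: List[str] = []
    for case in cases:
        output = model(template.format(user_input=case.user_input))
        if case.must_contain.lower() not in output.lower():
            failed.append(case.user_input)
    pass_rate = (len(cases) - len(failed)) / len(cases) if cases else 1.0
    return {"total": len(cases), "failed": failed, "pass_rate": pass_rate}


CASES = [
    Case("How long do refunds take?", "5 days"),
    Case("What is the shipping cost?", "shipping"),
]

# Re-run the same cases against every prompt version before a release.
reports = {version: run_regression(template, CASES, stub_model)
           for version, template in PROMPTS.items()}
```

The same loop can be re-run whenever the underlying model changes, which is the situation the paragraph above warns about; the hard part that remains is deciding what counts as a passing output.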
There are many good engineering solutions for developing LLM applications that combine private data. It is easy to run a demo with good results, but such an application still needs to be treated with great caution. After all, we are not just building a project to show off on social media or in front of management. What the user gets is an input dialog box, and natural language is so broad that even after testing tens of thousands of inputs, unexpected results may still occur. Even products such as the new Bing and ChatGPT have been subject to prompt injection. How to guard against this uncertainty in engineering and how to cover it in testing are problems that mature products still need to solve, and much work remains.
That said, I don't think this type of Prompt-Ops tool or framework should be dismissed entirely. After all, at this stage it can be used to build many good demos that validate ideas.
Let's talk about the possible forms of LLM applications now that the ChatGPT API is open.
An LLM application is really a new form of human-computer interaction that lets users communicate with existing systems in natural language. Many applications can even be reduced to just a chat window.
At present, because general-purpose large models are expensive to train and deploy, the conditions for an industry-level division of labor are basically mature. The world does not need many large models, and building applications on top of LLMs will be the natural choice for small and medium-sized enterprises and individual developers. New programming and engineering paradigms require engineers to learn and understand them promptly. The current open-source technology stack already meets the needs of most products, so you can try a quick demo to validate your ideas.
Reference:
Tutorial: ChatGPT Over Your Data
Question Answering with LangChain and Qdrant without boilerplate
Atom Capital: In-depth discussion of the industrial changes brought about by ChatGPT