Add "deep breath" to the prompt word, and the math score of the AI large model will increase by another 8.4 points!
The Google DeepMind team's latest discovery: combine this new "spell" ("Take a deep breath") with the already familiar "Let's think step by step", and a large model's score on the GSM8K dataset climbs from 71.8 to 80.2.
And this most effective prompt was found by the AI itself.
Some joke that when you take a deep breath, the cooling fan spins faster.
Others quip that newly hired, highly paid prompt engineers should also calm down, because their jobs may not last long.
The related paper, "Large Language Models as Optimizers", has once again caused a sensation.
Specifically, prompts designed by the large model improved performance on the Big-Bench Hard dataset by as much as 50%.
Others focus on a different finding: "the best prompts differ from model to model".
The paper tests large models not only on the prompt-design task but also on classic optimization tasks such as linear regression and the traveling salesman problem.
Different models have different optimal prompts
To design such prompts, the team developed a new method: OPRO, short for Optimization by PROmpting. Instead of formally defining an optimization problem and solving it with a program, OPRO describes the problem in natural language and asks a large model to generate new solutions.
A one-image summary: recursive calls to the large model.
At each optimization step, the previously generated solutions and their scores are used as input; the large model generates new solutions, which are then scored and appended to the meta-prompt for the next step.
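In code, the loop looks roughly like the following Python sketch. Note the assumptions: llm() is a hypothetical stand-in for a call to whatever model serves as the optimizer, and score() for whatever evaluates a candidate on the task; neither name comes from the paper.

```python
# A minimal sketch of the OPRO loop, under the assumptions above.
def opro(llm, score, n_steps=8, candidates_per_step=4):
    trajectory = []  # (solution, score) pairs generated so far
    for _ in range(n_steps):
        # The paper sorts past solutions by score in ascending order,
        # so the best ones sit closest to the final instruction.
        history = "\n".join(
            f"text: {sol}\nscore: {sc}"
            for sol, sc in sorted(trajectory, key=lambda p: p[1])
        )
        meta_prompt = (
            "Here are previous solutions with their scores:\n"
            f"{history}\n"
            "Write a new solution, different from all of the above, "
            "that achieves a higher score."
        )
        for _ in range(candidates_per_step):
            candidate = llm(meta_prompt)                      # generate
            trajectory.append((candidate, score(candidate)))  # evaluate
    return max(trajectory, key=lambda p: p[1])  # best pair found
```

On the first step the history is simply empty, which is how the loop bootstraps.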
The paper mainly uses Google's PaLM 2 and the text-bison version of Bard as the models being evaluated, and tests four models as optimizers, including GPT-3.5 and GPT-4.
The results show that different models both write prompts in different styles and respond best to different styles of prompt.
For the GPT series, the best AI-designed prompt was "Let's work this out in a step by step way to be sure we have the right answer." That prompt came from the APE method, published at ICLR 2023, and on GPT-3 (text-davinci-002) it beat the human-designed "Let's think step by step".
On Google's PaLM 2 and Bard, however, the APE version performed worse than the human version in this benchmark.
Among the new prompts designed by the OPRO method, "Take a deep breath" and "break this problem down" work best on PaLM.
The text-bison version of the Bard large model, by contrast, prefers longer, more detailed prompts.
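For context on where those scores come from: in prompt optimization, the score attached to each candidate instruction is typically the scorer model's accuracy on a small set of training problems. Here is a hedged sketch; scorer_llm() is a hypothetical stand-in, and the two problems are GSM8K examples:

```python
# Two GSM8K training problems with their numeric answers.
PROBLEMS = [
    ("Natalia sold clips to 48 of her friends in April, and then she "
     "sold half as many clips in May. How many clips did Natalia sell "
     "altogether in April and May?", "72"),
    ("A robe takes 2 bolts of blue fiber and half that much white "
     "fiber. How many bolts in total does it take?", "3"),
]

def make_scorer(scorer_llm):
    """Build a score(instruction) function for the loop sketched above."""
    def score(instruction):
        correct = 0
        for question, answer in PROBLEMS:
            reply = scorer_llm(f"{instruction}\nQ: {question}\nA:")
            if answer in reply:  # crude answer matching, fine for a sketch
                correct += 1
        return correct / len(PROBLEMS)
    return score
```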
In addition, the paper demonstrates the large model's potential as a mathematical optimizer.
Linear regression serves as an example of a continuous optimization problem.
The traveling salesman problem serves as an example of a discrete optimization problem.
Just by prompting, large models can find good solutions, sometimes even matching or surpassing hand-designed heuristic algorithms.
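As a concrete illustration of the linear regression case, here is a sketch in the spirit of the paper's setup: the model sees past (w, b, loss) triples and is asked for a better pair. The synthetic data and the prompt wording are assumptions, not taken from the paper.

```python
import numpy as np

# Synthetic data for the sketch; the true line is y = 3x + 5 plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=50)
y = 3.0 * x + 5.0 + rng.normal(size=50)

def loss(w, b):
    """Mean squared error of the line y = w*x + b on the data."""
    return float(np.mean((y - (w * x + b)) ** 2))

def regression_meta_prompt(history):
    """Turn past (w, b) proposals into a natural-language meta-prompt."""
    lines = [f"w={w:.2f}, b={b:.2f}, loss={loss(w, b):.2f}"
             for w, b in history]
    return ("Below are (w, b) pairs with their mean squared errors:\n"
            + "\n".join(lines)
            + "\nPropose a new (w, b) pair with a lower loss.")
```

Each model reply would be parsed for the new pair, scored with loss(), and appended to the history, exactly as in the prompt-optimization loop above.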
However, the team also notes that large models cannot yet replace traditional gradient-based optimization algorithms: when the problem scale grows, as in traveling salesman instances with many nodes, the OPRO method does not perform well.
The team also put forward ideas for future improvement. They believe current large models cannot make effective use of error cases: simply providing error cases does not let the model grasp why the errors occurred.
A promising direction is to incorporate richer feedback on error cases and to summarize the key differences between high-quality and low-quality prompts in the optimization trajectory. Such information could help the optimizer model improve on past prompts more efficiently, and might further reduce the number of samples needed for prompt optimization.
The paper comes from the merged Google DeepMind, but the authors are mainly from the original Google Brain team, including Quoc Le and Denny Zhou (Dengyong Zhou).
Among them are Chengrun Yang, a Fudan University alumnus with a Ph.D. from Cornell University, and Xinyun Chen, a Shanghai Jiao Tong University graduate with a Ph.D. from UC Berkeley.
The paper also includes many of the best prompts obtained in the experiments, covering practical scenarios such as movie recommendations and spoof movie names; if you need them, refer to the paper.
Paper address: https://arxiv.org/abs/2309.03409