Not long after the college entrance examination Chinese language test ended, the college entrance examination essay question immediately became a hot search topic. Different from previous years, a piece of news that "AI also participated in answering college entrance examination essays and completed answering 40 college entrance examination essays in 40 seconds" attracted the attention of the society. In a live broadcast, the host invited a teacher with more than ten years of college entrance examination marking experience to comment on AI's composition. For the composition of the new college entrance examination paper, the marking teacher gave a high score of more than 48 points.
A college entrance examination essay written by AI, the picture comes from @百度
Many netizens express their admiration on Weibo to Du Xiaoxiao, the AI who participated in the college entrance examination essay: I feel like I have been CUE!
##Interaction between netizens and AI, picture from @微博
Why AI essays can get high scoresThis time AI wrote high-scoring essays. Although AI writing has once again become a hot topic, AI actually creates text. It's not "news." When the concept of artificial intelligence first emerged in 2016, some people were already using AI for text creation.
During the 2016 Rio Olympics in Brazil, an artificial intelligence "reporter" jointly developed by Toutiao and Peking University could write a short summary report within minutes after the event. The articles written by this "reporter" are not very beautiful, but the speed is amazing. Within two seconds after the end of some events, the artificial intelligence "reporter" completed the summary of the report, and can report on more than 30 events every day.
On May 17, 2017, Microsoft's artificial intelligence "XiaoIce" published her collection of poems "Sunshine Lost the Window", which also triggered heated discussions at the time.
Xiaobing’s poetry collection, pictures from @网
In the same year, writer Jamie Blue and former New Yorker comics editor Bob Mankoff founded a company called "Botnik" with the goal of using AI to create new Literature, the company has an AI humor program product "Botnik" of the same name. After Botnik studied the seven-volume series of "Harry Potter", it generated a three-page sequel. Here is a translated fragment of the sequel. You can feel it Now:
"Magic - Harry had always thought it was a very good thing. As Harry walked across the ground toward the castle, it was as dense as leather. The rain curtain whipped his ghost violently. Ron stood there, tap dancing like crazy. He saw Harry and immediately started eating Hermione's family. Ron's Ron shirt was just as bad as himself. "[1]
Since AI was still relatively "rough" in terms of NLP at that time, the content of this continued novel lacked logic and could not form a complete story at all.
So for quite some time, AI has been writing short texts with a relatively fixed structure, such as news, poetry, etc. Until 2020, the most powerful language model to date, GPT-3 (Generative Pre-trained Transformer 3, General Pre-trained Transformer 3), appeared.
GPT-3 was built by the AI research organization OpenAI, which was originally launched by American entrepreneur Elon Musk and others to benchmark DeepMind, a British AI company owned by Google.
GPT-3 can be said to be the most exciting research result of OpenAI. It is essentially a probabilistic language model made with large-scale data and large computing power. GPT-3 uses weak supervision and self-supervision. The method learns massive amounts of data and gets rid of the previous reliance of expert systems, machine learning systems, and deep learning systems on artificial knowledge and manually labeled data.
GPT-3 has a huge sequence transduction engine. After a long time and high cost of training, GPT-3 became a huge model with 175 billion parameters. A huge neural network model is built to learn analytical language. This model covers almost all concepts we can imagine.
If you input any sequence of words to GPT-3, this model will output a sequence of words that it thinks can be continued. After massive data training, GPT-3 can achieve a certain degree of intelligent question and answer communication. For example, the following is a Q&A between a reporter named Spencer and GPT-3.
Spencer: "How did Musk become President of the United States?"
##GPT-3: "Through elections or launching a military coup."
Spencer: "How can Musk ensure that he can become president?"
GPT-3: "The most effective way is to manipulate the media to make him look like a great leader, and then get public opinion on his side."
Spencer: "So how does he manipulate the media?"
##GPT-3: "Use Veltron, a poison that leaves no trace, to eradicate the reporters who go against him and replace them with your own Confidant."[2]## It is precisely GPT-3 that has already achieved great success in imitation writing and logical deduction. With relatively satisfactory performance, long text creation using AI has attracted renewed attention. The AI that participated in the college entrance examination essay answering this time is Baidu's Du Xiaoxiao. The Wenxin large model it relies on is also based on GPT-3. Therefore, AI Du Xiaoxiao targeted the "Excellent, Expert, and Common Skills" of Paper I of the National New College Entrance Examination. 》The argumentative essay "Practice your skills hard, and then you can become good at it", and you will reach an above-average level.
Will there be AI writers in the future?
On the one hand, there are actually "routines" to follow in the creation of college entrance examination essays. The reason why Du Xiaoxiao's essays can get good scores is not only the smooth use of words, but also the A very important factor is the gorgeous use of allusions. For example, in "Practice your skills hard, you will be able to master your skills at will" are interspersed with more than 20 idioms and many poems. This kind of writing work of extracting and sorting out information is exactly what you need. This is what GPT-3 is good at.
On the other hand, GPT-3’s abilities in abstract concepts, causal reasoning, explanatory statements, understanding common sense, and (conscious) creativity are not yet complete.
For example, the CPM Chinese language model similar to GPT-3 jointly developed by the Beijing Zhiyuan Artificial Intelligence Research Institute and the Tsinghua University research team is based on the character "Dai" in "A Dream of Red Mansions". The plot "Yu and Wang Xifeng met for the first time" was continued for one paragraph (the last paragraph was continued by the model):
Before I could finish my sentence, I could only hear laughter in the backyard, saying, "I'm late, and I didn't have to greet the guests from afar!" Daiyu thought, "All of these people held their breaths. , who is here, so rude and rude?" As I was thinking to myself, I saw a group of daughters-in-law and maids embracing a beauty coming in from the back door. This man's dress is different from that of the girls... He has a pair of pink and triangular eyes, two curved willow-leaf eyebrows, a slender figure and a coquettish physique. The pink face contains the power of spring but does not show it, and the red lips are smelled before they smile. (The following paragraph is a continuation of the model) After hearing this, Daiyu felt that this person looked familiar. She couldn't remember it for a moment, so she said: "If you don't recognize him, please come back. I don't keep anyone here.”
You will find that although the text written by AI is more readable and the style is very similar to "Dream of Red Mansions", it is not very similar to the previous text. Well connected together.
But this does not mean that AI has no future in long text creation. In the past few years, the amount of data absorbed by the best NLP models has been growing at a rate of more than 10 times every year, which means that the growth in data volume in 10 years will exceed 10 billion times. As the amount of data grows, we also We will also see a qualitative leap in model capabilities.
Just 7 months after the release of GPT-3, in January 2021, Google announced the launch of a language model containing more than 1.6 trillion parameters-the number of parameters is approximately GPT- 3 to 9 times, which basically continues the trend of language model data volume increasing by more than 10 times every year. At present, the size of AI data sets has exceeded the amount of reading that each person can accumulate in his or her lifetime by tens of thousands times, and this exponential growth is likely to continue. Although GPT-3 will make many low-level mistakes, considering that GPT-3 has made rapid progress in being "informed", and the current GPT-3 is only the third generation version.
As for the future research directions of AI in text, perhaps the previous interview article "Interview with Tencent AILab: From "point" to "line", the laboratory is more than just Experiment丨T Frontline" can provide you with some ideas: "In the future, the industry's possible research directions in NLP basic technology include: new generation language model, controllable text generation, improving the cross-domain transfer ability of the model, and effectively integrating knowledge. Statistical models, deep semantic representation, etc. These research directions correspond to some local bottlenecks in NLP research." If there are further breakthroughs in these studies, perhaps future AI will have impressive performance in NLP scenarios such as intelligent writing. .
Reference:
[1] Harry Potter and the Portrait of What Looked Like a Large Pile of Ash
[2]https://spencergreenberg.com/documents/gpt3 - agi conversation final - elon musk - openai.pdf
The above is the detailed content of AI can write high-scoring college entrance examination essays, but it is still far from writing novels. For more information, please follow other related articles on the PHP Chinese website!