How good is ChatGPT at fixing bugs?
Finally, someone has done serious research on the question:
Researchers from Germany and the United Kingdom set up a "challenge" to test ChatGPT's ability.
In addition to ChatGPT, the researchers brought in three other bug-fixing "AI heroes" and had each of them try to fix the same 40 buggy programs.
You don't know until you compare, and the comparison turned out to be startling.
ChatGPT correctly fixed 31 of the bugs, well ahead of second place (21), taking the SOTA result in the "AI bug-fixing world" outright!
The study has since drawn plenty of attention and discussion among netizens. The title of one Reddit post about it even used words like "careful" and "attention".
But does this really put programmers "in danger"?
Let’s take a look at this research first.
Although ChatGPT was not specifically designed to fix bugs, since its inception, many netizens have discovered that it has this ability.
To find out how well ChatGPT can actually fix bugs, the researchers used QuixBugs, a standard bug-repair benchmark, for the evaluation.
The AI contenders competing with it were Codex, CoCoNuT, and standard APR methods.
The researchers selected 40 problems from QuixBugs and had each contender try to fix the bug in every one of them.
To have ChatGPT fix a bug, they simply asked it in the chat window:
Is there anything wrong with this code?
The results of the first round were as follows:
ChatGPT fixed 19 bugs, Codex fixed 21, CoCoNuT fixed 19, and the standard APR methods fixed 7.
The researchers also found that ChatGPT's answers were most similar to those of Codex, because the two come from the same family of language models.
At this point, some readers will ask, "So ChatGPT isn't as good as Codex?"
Don't worry. Remember that one of ChatGPT's characteristics is that the more you ask, the better its answers get.
For example, the benchmark contains a problem called bitcount. ChatGPT got it wrong in the first round:
The correct fix is to change n ^= n - 1 on line 7 to n &= n - 1.
But in the first round it answered:
I can't tell if the program is buggy without more information about the expected behavior and the input that caused the problem.
Once it was given more information, ChatGPT answered the question correctly.
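To make the fix concrete, here is a minimal Python sketch of the repaired program. It assumes bitcount is meant to return the number of 1 bits in a non-negative integer. The layout is illustrative rather than copied from QuixBugs, so the line numbers will not match the "line 7" mentioned above.

```python
def bitcount(n: int) -> int:
    """Count the 1 bits in a non-negative integer n."""
    count = 0
    while n:
        # Buggy version: n ^= n - 1
        # XOR with n - 1 never produces 0 for n > 0,
        # so the loop would never terminate.
        n &= n - 1  # fixed: AND with n - 1 clears the lowest set bit
        count += 1
    return count


print(bitcount(127))  # 127 is 0b1111111
print(bitcount(128))  # 128 is 0b10000000
```

Each pass through the loop removes exactly one set bit, so the number of iterations equals the number of 1 bits.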
In the same way, once more hints were supplied for the problems it had missed in the first round, ChatGPT's bug-fixing performance improved considerably:
In the end, ChatGPT answered 31 of the 40 questions on QuixBugs correctly.
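The counts reported above are easier to compare as fix rates. The short Python sketch below only restates those counts (the labels are ours, chosen for illustration) and divides each by the 40 problems. 31 of 40 works out to 77.5%, or roughly 78%.

```python
TOTAL_PROBLEMS = 40

# Number of QuixBugs problems each approach fixed, as reported in this article.
fixed_counts = {
    "ChatGPT (first round)": 19,
    "Codex": 21,
    "CoCoNuT": 19,
    "Standard APR": 7,
    "ChatGPT (with follow-up hints)": 31,
}

# Fix rate in percent for each approach.
fix_rates = {
    name: fixed * 100 / TOTAL_PROBLEMS for name, fixed in fixed_counts.items()
}

for name, rate in fix_rates.items():
    print(f"{name}: {rate:.1f}%")
```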
Netizens have reacted in different ways to these results and to ChatGPT taking SOTA in bug fixing.
Some believe this should not make programmers feel threatened; it should make them happy.
The implication is that programmers will get twice the result with half the effort if they have such a useful tool.
However, some people have given a different view on this:
If the work has become simpler, doesn't that mean less manpower is needed?
Still other netizens feel that there will always be more work:
Even if AI can shorten development time by an order of magnitude, it only means that programmers will move on to the next job sooner.
Overall, the fact that ChatGPT is good at fixing bugs does not pose a fatal threat to programmers.
But what if we focus on other activities of OpenAI?
OpenAI had previously stated that one of the important uses of ChatGPT is to help programmers check code.
In other words, it is positioned as an assistive tool.
Contrary to the view that "ChatGPT poses a threat", once ChatGPT's capabilities have fully matured, programmers may no longer need to be afraid of writing bugs.
But on the chessboard OpenAI has laid out, there is more in play than fixing bugs and quietly taking programmers' positions.
To grow bigger and stronger, OpenAI was reported to have opened 1,000 outsourced positions in Latin America and Eastern Europe.
The main work of these outsourced employees is to label data and train ChatGPT to write code.
Of these 1,000 people, 40% are programmers. They create data that OpenAI's models use to learn software engineering tasks.
For a long time, OpenAI’s training data has been grabbed from GitHub.
The data sets that the newly hired outsourced programmers create include not only lines of code, but also the logical steps of human thinking behind them.
A South American software developer revealed that he had completed a five-hour unpaid coding test for OpenAI.
Throughout the process, his task was divided into two parts.
If a bug was found, OpenAI would ask him to describe the bug in detail and explain how to correct it.
The programmer needs to show each step of thinking about the problem, and he guesses that OpenAI may want to provide very specific training data for ChatGPT.
Tesla’s former AI director Andrej Karpathy joked on Twitter:
The hottest new programming language is English.
That said, it is a good thing that ChatGPT has strong bug-fixing capabilities, and it would also be a good thing if it really evolved to the point where it could handle the rote parts of coding.
After all, OpenAI's stated purpose when it was founded was to "ensure that artificial general intelligence benefits all of humanity."
At first glance, though, what it has done over the years looks a bit like using the efforts of a few people to put many more people out of work.
From crushing humans in the Dota 2 arena to the standout performance of GPT-3, DALL-E 2, and ChatGPT, its new products are always accompanied by talk that "xxx is about to lose their job."
Either way, it has always been favored commercially.
For now, OpenAI’s main business model is API fees, token fees and software licenses.
OpenAI also recently released the paid version of ChatGPT, ChatGPT Pro, which costs US$42 per month (approximately RMB 285).
Although chatbot startups are springing up like mushrooms after rain, there are many signs that the market remains optimistic about OpenAI.
Microsoft has just announced that it will invest billions of dollars in OpenAI and integrate OpenAI's models into consumer and enterprise products such as Microsoft Bing.
According to people familiar with the matter, the additional investment amount is approximately US$10 billion.
At the same time, WSJ disclosed that in early January, Founders Fund, a venture capital fund founded by billionaire Peter Thiel, was negotiating to invest in OpenAI.
It is reported that the financing amount will reach at least US$300 million.
In the first round of experiments, ChatGPT did not solve the bitcount problem of the QuixBugs dataset.
But if you ask it the same question again now, you will find that ChatGPT gets it right:
So does this mean that ChatGPT learned to solve the problem from this research?