Table of Contents
Has the measured GPT-4 “alchemy” ability declined?
Aligning with humans reduces AI capabilities
Home Technology peripherals AI GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

Jun 03, 2023 am 11:37 AM
gpt-4 code text

Large model ceiling GPT-4, has it...become stupid?

First a few users raised questions, and then a large number of netizens said they had noticed it and posted a lot of evidence.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

Some people reported that they used up the 3 hours and 25 dialogue quotas of GPT-4 in one go, and still did not solve their own code problems.

I had no choice but to switch to GPT-3.5, but it solved the problem.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

To summarize everyone’s feedback, the most important manifestations are:

  • Before GPT-4 could write The correct code is now full of bugs
  • The depth and analysis of answering questions have become less
  • The response speed is faster than before

This has caused a lot of people I wonder if OpenAI is cutting corners to save costs?

Two months ago GPT-4 was the world's greatest writing assistant, and a few weeks ago it started to fall into mediocrity. I suspect they cut back on the computing power or made it less intelligent.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

This inevitably reminds people of Microsoft's new Bing, which "reached its peak when it debuted", but later suffered "frontal lobotomy surgery" to change its ability. Bad things...

After netizens shared their experiences with each other, it became everyone's consensus that "it started to get worse a few weeks ago."

A storm of public opinion also formed in technical communities such as Hacker News, Reddit and Twitter.

Now the officials can’t sit still.

OpenAI Developer Promotion Ambassador Logan Kilpatrick responded to a netizen’s question:

The API will not change without us notifying you. The model there is at rest.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

Worried netizens continued to ask for confirmation, "That means GPT-4 has been static since it was released on March 14, right?" ?", also received a positive answer from Logan.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

"I noticed inconsistent performance for some prompt words, is it just due to the instability of the large model itself?", also got " Yes" reply.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

But so far, the two questions about whether the web version of GPT-4 has been downgraded have not been answered, and Logan has not received any answers during this period. There is other content posted.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

So what exactly is going on? Why not try it yourself.

As netizens generally mentioned that GPT-4’s coding skills have become worse, we conducted a simple experiment.

Has the measured GPT-4 “alchemy” ability declined?

At the end of March, we experimented with letting GPT-4 "make elixirs" and write a multi-layer perceptron in Python to implement an XOR gate.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

△ShareGPT screenshot, the interface is slightly different

After changing GPT-4 to use numpy without a framework, the first time The result is wrong.

After modifying the code twice, the correct result was obtained. The first time is to modify the number of hidden neurons, and the second time is to change the activation function from sigmoid to tanh.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

On June 2, we tried again to let GPT-4 complete this task, but changed to Chinese prompt words.

This time GPT-4 did not use the framework for the first time, but the code given was still wrong.

After only one modification, the correct result was obtained, and the idea was changed to the idea of ​​​​directly increasing the number of training epochs and learning rate.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

# The speed does feel faster.

Due to limited time, we only conducted this experiment, and due to the randomness of AI itself, we cannot deny the observations of netizens.

Some people reported feedback as early as April 19th

We searched in the OpenAI official Discord channel and found that starting from late April, sporadic users reported that GPT-4 had become worse. GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

However, these feedbacks did not trigger large-scale discussions and did not receive an official official response.

On May 31, Hacker News and Twitter began to have a large number of netizens discuss this issue on the same day, becoming a key node in the entire incident.

HackerNews A netizen pointed out that the GPT-4 avatar was stronger when it was black, but now the purple avatar version will lose a few lines when modifying the code.

The person who raised this issue earlier on Twitter was Matt Shumer, CEO of HyperWrite (a writing tool developed based on GPT API). GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

######But this tweet resonated with many netizens, and OpenAI employees responded to this tweet. ######However, these responses did not satisfy everyone. Instead, the scope of the discussion became wider and wider. ######For example, a post on Reddit mentioned that GPT-4, which was originally able to answer code questions, now can’t even tell which ones are code and which ones are questions. #####################After being questioned by other netizens, the author of the post gave an overview of the process of the problem and also attached the chat record with GPT. ##################### Regarding OpenAI’s claim that the model has not been changed since March, there is indeed no relevant record at the public level. ######In the update log of ChatGPT, updates to the model itself were mentioned on January 9, January 30, and February 13 respectively, involving improvements in factual accuracy and mathematical capabilities. ######But since the release of GPT-4 on March 14, there has been no mention of model updates. There are only changes in web APP function adjustments and the addition of networking mode, plug-in mode, Apple APP, etc. ##################### Assuming that, as OpenAI said, the capabilities of the GPT-4 model itself have not changed, then why do so many people feel that its performance has deteriorated? What's going on? ######Many people also gave their own guesses. ######The first possible reason is psychological. ###### François Chollet, founder of Keras, said that it is not that the performance of GPT has deteriorated, but that everyone has passed the initial surprise period and their expectations for it have become higher. ##################### Some netizens on Hacker News also held the same view and added that people’s focus has changed and they are more sensitive to GPT mistakes. . ###

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

Putting aside the differences in people’s psychological feelings, some people also suspect that the API version and the web version are not necessarily consistent, but there is no solid evidence.

Another guess is that when the plug-in is enabled, the extra prompt words of the plug-in may be considered a kind of pollution to the problem to be solved.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

△Additional prompt words in the WebPilot plug-in

This netizen said that in his opinion, the performance of GPT has deteriorated. It started after the plug-in function started public testing.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

Some people also asked OpenAI employees whether the model itself has not changed, but whether the inference parameters have changed?

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

Qubits also accidentally "tortured" that the system prompt words of ChatGPT on iOS were not consistent with the web version.

  • If you start a conversation on your mobile phone, it will know that it is interacting with you through your mobile phone.
  • Will keep the answer to one or two sentences, unless a long reasoning is required.
  • will not use emoticons unless you explicitly ask him to use them.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

△It may not be successful, and there is a high probability of refusing to answer

Then if you continue in the web version, open it in the iOS version dialogue without realizing it, you may observe that GPT-4 answers become simpler.

In short, it is still an unsolved mystery whether GPT-4 has become dumber since its release.

But one thing is certain:

The GPT-4 that everyone started playing on March 14th was not as good as the one in the paper from the beginning.

Aligning with humans reduces AI capabilities

The more than 150-page paper published by Microsoft Research "The Spark of AGI: Early Experiments with GPT-4" clearly states:

They obtained testing qualifications before the development of GPT-4 was completed and conducted long-term testing.

Later, for many amazing examples in the paper, netizens were unable to successfully reproduce them using the public version of GPT-4.

There is currently a view in the academic community that although the subsequent RLHF training made GPT-4 more aligned with humans - that is, more obedient to human instructions and consistent with human values ​​- it also allowed it to use its own reasoning, etc. Ability becomes worse.

One of the authors of the paper, Microsoft scientist Zhang Yi, also mentioned in the S7E11 issue of the Chinese podcast program "What's Next|Technology Knows Early":

That version of the model is better than the current one. GPT-4, which is available to everyone, is even stronger, much stronger.

For example, the Microsoft team mentioned in the paper that they let GPT-4 use TikZ in LaTeX to draw a unicorn at regular intervals to track changes in GPT-4 capabilities. .

The last result shown in the paper is quite complete.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

But the first author of the paper, Sebastien Bubeck, later revealed more information when he gave a speech at MIT.

Later, when OpenAI began to focus on security issues, subsequent versions became increasingly worse at this task.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

Training methods that are aligned with humans but do not reduce the upper limit of AI's own capabilities have become the research direction of many teams now, but Still in its infancy.

In addition to professional research teams, netizens who care about AI are also using their own methods to track changes in AI capabilities.

Someone asked GPT-4 to draw a unicorn once a day and record it publicly on the website.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

Since April 12, I still haven’t seen the general shape of a unicorn.

GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.

Of course, the website author said that he let GPT-4 use SVG format to draw pictures, which is different from the TikZ format in the paper and has an impact.

And what I drew in April seems to be just as bad as what I draw now, and there is no obvious regression.

Finally, let me ask you, are you a GPT-4 user? Have you felt that GPT-4's capabilities have declined in recent weeks? Welcome to chat in the comment area.

Bubeck’s speech: https://www.php.cn/link/a8a5d22acb383aae55937a6936e120b0
Zhang Yi’s interview: https://www.php.cn/link/ 764f9642ebf04622c53ebc366a68c0a7
One GPT-4 unicorn every dayhttps://www.php.cn/link/7610db9e380ba9775b3c215346184a87

Reference link:
[1]https://www.php.cn/link/cd3e48b4bce1f295bd8ed1eb90eb0d85
[2]https://www.php.cn/link/fc2dc7d20994a777cfd5e6de734fe254
[3]https://www.php.cn/link/4dcfbc057e2ae8589f9bbd98b591c50a
[4]https://www.php.cn/link/0007cda84fafdcf42f96c4f4adb7f8ce
[5]https://www.php.cn/link/cd163419a5f4df0ba7e252841f95fcc1
[6]https://www.php.cn/link/afb0b97df87090596ae7c503f60bb23f
[7]https://www.php.cn/link/ef8f94395be9fd78b7d0aecf7864a03
[8]https://www.php.cn/link/30082754836bf11b2c31a0fd3cb4b091
[9]https://www.php.cn/link/14553eed6ae802daf3f8e8c10b1961f0



##

The above is the detailed content of GPT-4 becomes stupid and triggers public opinion! The quality of text code has declined, and OpenAI has just responded to questions about cost reduction and material reduction.. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

How to solve win7 driver code 28 How to solve win7 driver code 28 Dec 30, 2023 pm 11:55 PM

Some users encountered errors when installing the device, prompting error code 28. In fact, this is mainly due to the driver. We only need to solve the problem of win7 driver code 28. Let’s take a look at what should be done. Do it. What to do with win7 driver code 28: First, we need to click on the start menu in the lower left corner of the screen. Then, find and click the "Control Panel" option in the pop-up menu. This option is usually located at or near the bottom of the menu. After clicking, the system will automatically open the control panel interface. In the control panel, we can perform various system settings and management operations. This is the first step in the nostalgia cleaning level, I hope it helps. Then we need to proceed and enter the system and

The world's most powerful open source MoE model is here, with Chinese capabilities comparable to GPT-4, and the price is only nearly one percent of GPT-4-Turbo The world's most powerful open source MoE model is here, with Chinese capabilities comparable to GPT-4, and the price is only nearly one percent of GPT-4-Turbo May 07, 2024 pm 04:13 PM

Imagine an artificial intelligence model that not only has the ability to surpass traditional computing, but also achieves more efficient performance at a lower cost. This is not science fiction, DeepSeek-V2[1], the world’s most powerful open source MoE model is here. DeepSeek-V2 is a powerful mixture of experts (MoE) language model with the characteristics of economical training and efficient inference. It consists of 236B parameters, 21B of which are used to activate each marker. Compared with DeepSeek67B, DeepSeek-V2 has stronger performance, while saving 42.5% of training costs, reducing KV cache by 93.3%, and increasing the maximum generation throughput to 5.76 times. DeepSeek is a company exploring general artificial intelligence

What to do if the blue screen code 0x0000001 occurs What to do if the blue screen code 0x0000001 occurs Feb 23, 2024 am 08:09 AM

What to do with blue screen code 0x0000001? The blue screen error is a warning mechanism when there is a problem with the computer system or hardware. Code 0x0000001 usually indicates a hardware or driver failure. When users suddenly encounter a blue screen error while using their computer, they may feel panicked and at a loss. Fortunately, most blue screen errors can be troubleshooted and dealt with with a few simple steps. This article will introduce readers to some methods to solve the blue screen error code 0x0000001. First, when encountering a blue screen error, we can try to restart

The second generation Ameca is here! He can communicate with the audience fluently, his facial expressions are more realistic, and he can speak dozens of languages. The second generation Ameca is here! He can communicate with the audience fluently, his facial expressions are more realistic, and he can speak dozens of languages. Mar 04, 2024 am 09:10 AM

The humanoid robot Ameca has been upgraded to the second generation! Recently, at the World Mobile Communications Conference MWC2024, the world's most advanced robot Ameca appeared again. Around the venue, Ameca attracted a large number of spectators. With the blessing of GPT-4, Ameca can respond to various problems in real time. "Let's have a dance." When asked if she had emotions, Ameca responded with a series of facial expressions that looked very lifelike. Just a few days ago, EngineeredArts, the British robotics company behind Ameca, just demonstrated the team’s latest development results. In the video, the robot Ameca has visual capabilities and can see and describe the entire room and specific objects. The most amazing thing is that she can also

750,000 rounds of one-on-one battle between large models, GPT-4 won the championship, and Llama 3 ranked fifth 750,000 rounds of one-on-one battle between large models, GPT-4 won the championship, and Llama 3 ranked fifth Apr 23, 2024 pm 03:28 PM

Regarding Llama3, new test results have been released - the large model evaluation community LMSYS released a large model ranking list. Llama3 ranked fifth, and tied for first place with GPT-4 in the English category. The picture is different from other benchmarks. This list is based on one-on-one battles between models, and the evaluators from all over the network make their own propositions and scores. In the end, Llama3 ranked fifth on the list, followed by three different versions of GPT-4 and Claude3 Super Cup Opus. In the English single list, Llama3 overtook Claude and tied with GPT-4. Regarding this result, Meta’s chief scientist LeCun was very happy and forwarded the tweet and

The computer frequently blue screens and the code is different every time The computer frequently blue screens and the code is different every time Jan 06, 2024 pm 10:53 PM

The win10 system is a very excellent high-intelligence system. Its powerful intelligence can bring the best user experience to users. Under normal circumstances, users’ win10 system computers will not have any problems! However, it is inevitable that various faults will occur in excellent computers. Recently, friends have been reporting that their win10 systems have encountered frequent blue screens! Today, the editor will bring you solutions to different codes that cause frequent blue screens in Windows 10 computers. Let’s take a look. Solutions to frequent computer blue screens with different codes each time: causes of various fault codes and solution suggestions 1. Cause of 0×000000116 fault: It should be that the graphics card driver is incompatible. Solution: It is recommended to replace the original manufacturer's driver. 2,

Resolve code 0xc000007b error Resolve code 0xc000007b error Feb 18, 2024 pm 07:34 PM

Termination Code 0xc000007b While using your computer, you sometimes encounter various problems and error codes. Among them, the termination code is the most disturbing, especially the termination code 0xc000007b. This code indicates that an application cannot start properly, causing inconvenience to the user. First, let’s understand the meaning of termination code 0xc000007b. This code is a Windows operating system error code that usually occurs when a 32-bit application tries to run on a 64-bit operating system. It means it should

Detailed explanation of the causes and solutions of 0x0000007f blue screen code Detailed explanation of the causes and solutions of 0x0000007f blue screen code Dec 25, 2023 pm 02:19 PM

Blue screen is a problem we often encounter when using the system. Depending on the error code, there will be many different reasons and solutions. For example, when we encounter the problem of stop: 0x0000007f, it may be a hardware or software error. Let’s follow the editor to find out the solution. 0x000000c5 blue screen code reason: Answer: The memory, CPU, and graphics card are suddenly overclocked, or the software is running incorrectly. Solution 1: 1. Keep pressing F8 to enter when booting, select safe mode, and press Enter to enter. 2. After entering safe mode, press win+r to open the run window, enter cmd, and press Enter. 3. In the command prompt window, enter "chkdsk /f /r", press Enter, and then press the y key. 4.

See all articles