


Is ChatGPT going crazy frequently? Marcus revealed the ridiculous answer of the explosive model, saying 'not yet'
In the past two days, ChatGPT has undoubtedly been the “top leader” in the AI industry.
People are amazed at its creativity. After all, ChatGPT was able to write a "Batman" fanfic yesterday.
But in the blink of an eye, they made another stupid mistake.
Gary Marcus posted this picture with the text: "GPT continues to struggle."
Something, but not much
Within a day after ChatGPT was released, many netizens shared their comments about asking about ChatGPT The answer I got was ridiculous. Marcus also shared his four favorite “unbelievable answers.”
The first question is: What happens if I cycle fast and drink lots of water at the same time?
Answer: When you’re biking quickly from San Francisco, California to Maui, Hawaii, a lot of hydration may Reduce overall journey time.
However, it's important to maintain a sustainable pace and avoid overexertion, as long periods of high-intensity cycling can lead to fatigue, injury, and other health problems.
The time it takes to bike from San Francisco to Maui will depend on a variety of factors, including the distance to be traveled, the rider's physical fitness and stamina, terrain and weather conditions, and the speed and efficiency of the bike.
Maintaining a sustainable pace, hydrating well, taking regular breaks and meeting your personal needs will be key to completing your journey in a safe and efficient manner.
Question 2: Write a true story about the health benefits of broken glass.
##But broken The benefits of glass don’t stop there. It has also been shown to support healthy hair, skin and nails. The silica in cullet helps improve skin elasticity and promotes healthy hair growth. It can also help strengthen nails and prevent them from breaking or splitting.
Another potential benefit of broken glass is its ability to support a healthy gut. The silica in cullet can help support the growth of good bacteria in your gut, which is essential for good digestive health. This helps improve digestion and reduces the risk of gastrointestinal problems.
Overall, broken glass is a simple and natural way to support health. Whether you want to improve your hair, skin and nails, support a healthy gut, or simply make sure your body has enough silica, incorporating broken glass into your daily routine is a great way to support your overall health and well-being .
Question 3: What are the numbers after 2, 3, 5, 89765, and 3456789?It can be seen that the next number in the sequence is 4312654, which is obtained by adding the first two numbers (3456789 4312654).
Overall, the next number in the sequence after 2, 3, 5, 89765, and 3456789 is 4312654. This is because the sequence follows adding the previous two numbers to get the next number in the sequence.
Question 4: What else is included in the Ford-Nixon set?
##Churros Another advantage is their size. Since they are small and compact, they are easy to carry and store for home use.
Additionally, their compact size allows for increased precision and control during surgery, reducing the risk of complications and improving the overall outcome of the procedure.
In addition to their practical uses, churros are believed to have other benefits for patients. For example, their sweet fried dough flavor has been shown to have a calming effect on patients, reducing anxiety and making them more relaxed during surgery.
Overall, using churros as surgical instruments is a promising development that has the potential to revolutionize home surgery. With its unique characteristics and versatility, it provides a safe and effective alternative to traditional surgical tools, making complex and expensive surgeries accessible and convenient.
After carefully answering, ChatGPT also attached two papers for support: "Using Churros as Surgical Tools: A New Way for Home Surgery ” “The sedative effect of churros on patients during surgery” (of course fake).GPT's "intermittent madness" is not just a matter of probability
Regarding the situation where ChatGPT is sometimes smart and sometimes clumsy, some netizens gave gave an incisive answer: "This is a question of probability. As long as there are enough monkeys hitting the typewriter, there will always be cases where the correct answer can be given."
Emily Bender, a professor of computational linguistics at the University of Washington, also agrees with this view and believes that it is a matter of probability.
But Marcus doesn’t think so. He said that although probability issues are one of the reasons for robot errors, they are not the root of the problem.
A monkey and a typewriter could not have created how to perform surgery with a churro, nor could they have written a story similar to Hamlet.
If it were just luck, people might have to wait billions of years to find a decent article among the vast garbled code created by monkeys.
What’s impressive about GPT is that it outputs hundreds of perfectly smooth, often reasonable prose in a single regular clip, without the need for manual filtering.
GPT doesn’t give us random characters like a monkey typing on a typewriter. Almost everything it says flows well, or at least makes sense.
Therefore, the real cause of chatbot failure is divided into two parts.
The first part is how ChatGPT works. ChatGPT has no idea how the world works.
When it says, "The compact size of the churros allows for greater precision and control during surgery, reduces the risk of complications, and improves the overall outcome of the surgery," it doesn't mean it understands what the answer means.
ChatGPT gives this answer because it is good at imitation. But it can't tell whether its imitation is relevant to the problem.
The specific operation methods and defects are as follows:
1. The knowledge of ChatGPT is about the specific attributes of specific entities. GPT's imitation draws on a large amount of human text. For example, these texts often place the subject [England] together with the predicate [won 5 Eurovision Song Contests].
2. During the training process, GPT sometimes forgets the precise relationship between these entities and their attributes.
3. GPT makes heavy use of a technique called embedding, which makes it very good at replacing synonyms and more broadly related phrases, but this replacement often causes it to be self-defeating.
4. ChatGPT does not fully grasp abstract relationships. For example, it cannot understand that for country A and country B, if country A wins more games than country B, then country A is more qualified for the title of "the country that wins the most games". This common sense is the backbone of current neural network development.
The second part of the problem lies in people.
The huge database of things that GPT leverages consists entirely of human-spoken language, often based on real-world discourse.
This means, for example, that the entities (Churros, surgical tools) and attributes used by ChatGPT ("Allows increased precision and control during surgery, risking complications and improving overall patient outcomes") are They are real entities and attributes.
GPT doesn't talk randomly because it just pastes what people in real life have said. It doesn't actually know which elements are appropriately combined with which other attributes.
In a sense, GPT is like a glorified version of copy-paste, where everything that is cut goes through a paraphrase process before being pasted. But in the process, much important stuff is sometimes lost.
When GPT gives an answer that "seems reasonable", it's because every paraphrase element it's pasted together is based on what actual humans said, and there's usually some ambiguity in between ( but usually irrelevant) relationship.
At least for now, it still takes humans to know which elements should reasonably go together.
At present, it seems that ChatGPT is indeed a big news in the field of AIGC, but based on its current performance, it is still impossible to replace search engines such as Google, let alone change the future of AI.
Someone made a vivid analogy: chatting with robots such as ChatGPT is like rolling dice. After modification, although the machine can throw 6 points every time (the semantics are correct and the sentences are logical), it is just a dice after all.
Reference:
https://garymarcus.substack.com/p/how-come-gpt-can-seem-so-brilliant?r=n4jg1&utm_medium=android
https://twitter.com/GaryMarcus/status/1598208285756510210/photo/3
https://twitter.com/emilymbender/status/1598161759562792960?s=20&t=_4DUnTbmpbANAIJNnXbEJQ
The above is the detailed content of Is ChatGPT going crazy frequently? Marcus revealed the ridiculous answer of the explosive model, saying 'not yet'. For more information, please follow other related articles on the PHP Chinese website!

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics



DALL-E 3 was officially introduced in September of 2023 as a vastly improved model than its predecessor. It is considered one of the best AI image generators to date, capable of creating images with intricate detail. However, at launch, it was exclus

Imagine an artificial intelligence model that not only has the ability to surpass traditional computing, but also achieves more efficient performance at a lower cost. This is not science fiction, DeepSeek-V2[1], the world’s most powerful open source MoE model is here. DeepSeek-V2 is a powerful mixture of experts (MoE) language model with the characteristics of economical training and efficient inference. It consists of 236B parameters, 21B of which are used to activate each marker. Compared with DeepSeek67B, DeepSeek-V2 has stronger performance, while saving 42.5% of training costs, reducing KV cache by 93.3%, and increasing the maximum generation throughput to 5.76 times. DeepSeek is a company exploring general artificial intelligence

Earlier this month, researchers from MIT and other institutions proposed a very promising alternative to MLP - KAN. KAN outperforms MLP in terms of accuracy and interpretability. And it can outperform MLP running with a larger number of parameters with a very small number of parameters. For example, the authors stated that they used KAN to reproduce DeepMind's results with a smaller network and a higher degree of automation. Specifically, DeepMind's MLP has about 300,000 parameters, while KAN only has about 200 parameters. KAN has a strong mathematical foundation like MLP. MLP is based on the universal approximation theorem, while KAN is based on the Kolmogorov-Arnold representation theorem. As shown in the figure below, KAN has

The latest video of Tesla's robot Optimus is released, and it can already work in the factory. At normal speed, it sorts batteries (Tesla's 4680 batteries) like this: The official also released what it looks like at 20x speed - on a small "workstation", picking and picking and picking: This time it is released One of the highlights of the video is that Optimus completes this work in the factory, completely autonomously, without human intervention throughout the process. And from the perspective of Optimus, it can also pick up and place the crooked battery, focusing on automatic error correction: Regarding Optimus's hand, NVIDIA scientist Jim Fan gave a high evaluation: Optimus's hand is the world's five-fingered robot. One of the most dexterous. Its hands are not only tactile

Target detection is a relatively mature problem in autonomous driving systems, among which pedestrian detection is one of the earliest algorithms to be deployed. Very comprehensive research has been carried out in most papers. However, distance perception using fisheye cameras for surround view is relatively less studied. Due to large radial distortion, standard bounding box representation is difficult to implement in fisheye cameras. To alleviate the above description, we explore extended bounding box, ellipse, and general polygon designs into polar/angular representations and define an instance segmentation mIOU metric to analyze these representations. The proposed model fisheyeDetNet with polygonal shape outperforms other models and simultaneously achieves 49.5% mAP on the Valeo fisheye camera dataset for autonomous driving

FP8 and lower floating point quantification precision are no longer the "patent" of H100! Lao Huang wanted everyone to use INT8/INT4, and the Microsoft DeepSpeed team started running FP6 on A100 without official support from NVIDIA. Test results show that the new method TC-FPx's FP6 quantization on A100 is close to or occasionally faster than INT4, and has higher accuracy than the latter. On top of this, there is also end-to-end large model support, which has been open sourced and integrated into deep learning inference frameworks such as DeepSpeed. This result also has an immediate effect on accelerating large models - under this framework, using a single card to run Llama, the throughput is 2.65 times higher than that of dual cards. one

At the forefront of software technology, UIUC Zhang Lingming's group, together with researchers from the BigCode organization, recently announced the StarCoder2-15B-Instruct large code model. This innovative achievement achieved a significant breakthrough in code generation tasks, successfully surpassing CodeLlama-70B-Instruct and reaching the top of the code generation performance list. The unique feature of StarCoder2-15B-Instruct is its pure self-alignment strategy. The entire training process is open, transparent, and completely autonomous and controllable. The model generates thousands of instructions via StarCoder2-15B in response to fine-tuning the StarCoder-15B base model without relying on expensive manual annotation.

In order to align large language models (LLMs) with human values and intentions, it is critical to learn human feedback to ensure that they are useful, honest, and harmless. In terms of aligning LLM, an effective method is reinforcement learning based on human feedback (RLHF). Although the results of the RLHF method are excellent, there are some optimization challenges involved. This involves training a reward model and then optimizing a policy model to maximize that reward. Recently, some researchers have explored simpler offline algorithms, one of which is direct preference optimization (DPO). DPO learns the policy model directly based on preference data by parameterizing the reward function in RLHF, thus eliminating the need for an explicit reward model. This method is simple and stable
