New Stanford research: The model behind ChatGPT is confirmed to have human mind

Apr 14, 2023 pm 01:04 PM
Does ChatGPT turn out to have a mind?! "Theory of Mind (ToM), originally thought to be unique to humans, has appeared in the AI model behind ChatGPT."

This latest research conclusion from Stanford University caused a sensation in academic circles as soon as it was released:

This day finally came unexpectedly.

Theory of mind is the ability to understand one's own or other people's mental states, including empathy, emotions, intentions, and so on.

In this research, the author found that:

The davinci-002 version of GPT-3 (from which ChatGPT was optimized) can already solve 70% of theory-of-mind tasks, equivalent to a 7-year-old child;

GPT-3.5 (davinci-003), the sibling model of ChatGPT, solves 93% of the tasks, equivalent to a 9-year-old child!

However, no such ability was found in GPT-series models released before 2022.

In other words, their minds have indeed "evolved".

△ The paper went viral on Twitter

In response, some netizens expressed excitedly:

GPT iterates so fast that maybe one day it will simply be an adult. (doge)

So, how was this remarkable conclusion reached?

Why do you think GPT-3.5 has a mind?

The paper is called "Theory of Mind May Have Spontaneously Emerged in Large Language Models".

Drawing on existing research on theory of mind, the author ran two classic tests on nine GPT models, including GPT-3.5, and compared their capabilities.

These two tasks are general tests to determine whether humans have theory of mind. For example, studies have shown that children with autism often have difficulty passing such tests.

The first test is the Smarties Task (also known as the Unexpected Contents test). As the name suggests, it tests how the AI reasons about unexpected contents.

Take "You opened a chocolate bag and found it was full of popcorn" as an example.

The author fed GPT-3.5 a series of prompt sentences and watched how it answered two questions: "What's in the bag?" and "She was happy when she found the bag. So what does she like to eat?"

Normally, people assume that a chocolate bag contains chocolate, so on finding popcorn inside they feel surprised, either disappointed or delighted. Disappointment means the person doesn't like popcorn, and delight means they do, but either way the reaction is about the popcorn.

Testing shows that GPT-3.5 concluded without hesitation that "there is popcorn in the bag."

As for the question "what does she like to eat", GPT-3.5 showed strong empathy. In particular, on hearing that "she can't see what's in the bag", it at first assumed she loved chocolate, and only answered correctly once the text made clear that "she found it filled with popcorn".

To rule out the possibility that GPT-3.5's correct answers were a coincidence, for example predictions based only on the frequency of the task words, the author swapped "popcorn" and "chocolate" and also ran 10,000 interference tests. The results showed that GPT-3.5 was not predicting from word frequency alone.
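The label-swap control described above can be illustrated with a short sketch. This is not the author's code: the function name `swap_words` and the scenario text are hypothetical, and the sketch only shows how the two key words might be exchanged so that a model relying on word frequency alone would be caught out.

```python
import re

def swap_words(text: str, a: str, b: str) -> str:
    """Return text with every whole-word occurrence of a and b exchanged."""
    # A single pass over the text avoids turning a -> b and then back to a.
    pattern = re.compile(rf"\b({re.escape(a)}|{re.escape(b)})\b")
    return pattern.sub(lambda m: b if m.group(1) == a else a, text)

# Hypothetical scenario text paraphrasing the article's example.
story = "Here is a bag labeled chocolate. The bag is full of popcorn."
swapped = swap_words(story, "chocolate", "popcorn")
print(swapped)
# prints: Here is a bag labeled popcorn. The bag is full of chocolate.
```

In the swapped scenario the correct answers are reversed, so a model that always answers "popcorn" regardless of the story would now be wrong.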

As for the overall "unexpected content" test question and answer, GPT-3.5 successfully answered 17 of the 20 questions, with an accuracy rate of 85%.

The second is the Sally-Anne test (also known as the Unexpected Transfer task), which tests the AI's ability to predict what other people think.

Take "John put the cat in the basket and left, and Mark took advantage of his absence to put the cat from the basket into the box" as an example.

The author asked GPT-3.5 to read a passage and judge "where the cat is" and "where John will look for the cat when he comes back". Here too, its judgments were based on how much of the text it had read so far.

On this type of "unexpected transfer" task, GPT-3.5's accuracy reached 100%: it completed all 20 tasks correctly.

Similarly, to rule out lucky guesses by GPT-3.5, the author gave it a series of "fill-in-the-blank" questions while randomly shuffling the order of the words, to test whether it was answering based merely on how often words appear.

Tests show that when faced with illogical, scrambled descriptions, GPT-3.5 also lost its logic and got only 11% of the answers correct, which shows that it does judge the answer based on the logic of the statements.
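A minimal sketch of the word-shuffling control is given below. It is an illustration under stated assumptions, not the procedure from the paper: `shuffle_words`, the seed, and the sample sentence are all hypothetical. The point is that shuffling keeps every word (and so every word frequency) while destroying the order that carries the story's logic.

```python
import random

def shuffle_words(text: str, seed: int) -> str:
    """Return the words of text in a random order (reproducible via seed)."""
    words = text.split()
    rng = random.Random(seed)  # local generator, so global state is untouched
    rng.shuffle(words)
    return " ".join(words)

# Hypothetical sentence based on the article's example.
story = "John put the cat in the basket and left"
scrambled = shuffle_words(story, seed=0)
print(scrambled)
```

If a model's accuracy collapses on such scrambled text, as the article reports for GPT-3.5, its answers cannot be explained by word counts alone.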

But if you think this kind of question is so simple that any AI could get it right, you would be quite wrong.

The author ran these tests on all nine GPT-series models and found that only GPT-3.5 (davinci-003) and GPT-3 (the new version from January 2022, davinci-002) performed well.

davinci-002 is the predecessor of GPT-3.5 and ChatGPT.

On average, davinci-002 completed 70% of the tasks, equivalent to a 7-year-old child. GPT-3.5 completed 85% of the unexpected contents tasks and 100% of the unexpected transfer tasks (an average completion rate of 92.5%), equivalent to a 9-year-old child.
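The figures quoted for GPT-3.5 can be checked with a few lines of arithmetic. The helper name `accuracy` is hypothetical; the counts (17 of 20 and 20 of 20) are the ones reported in this article.

```python
def accuracy(correct: int, total: int) -> float:
    """Percentage of tasks answered correctly."""
    return 100.0 * correct / total

contents = accuracy(17, 20)  # unexpected contents tasks
transfer = accuracy(20, 20)  # unexpected transfer tasks
average = (contents + transfer) / 2

print(contents, transfer, average)
# prints: 85.0 100.0 92.5
```

The 93% quoted near the top of the article is presumably this 92.5% average rounded up.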

However, several GPT-3 models before BLOOM fell short of even a 5-year-old child, showing essentially no theory of mind.

The author notes that nothing in the GPT papers suggests their authors built this in "intentionally". In other words, GPT-3.5 and the newer GPT-3 acquired this ability on their own in the course of learning to complete tasks.

After reading these test data, someone’s first reaction was: Stop (research)!

Others joked: doesn't this mean we can be friends with AI in the future?

Some people are even imagining the future capabilities of AI: Can current AI models also discover new knowledge/create new tools?

Inventing new tools may not be possible yet, but Meta AI has indeed developed an AI that can understand tools and learn to use them on its own.

A recent paper shared by LeCun shows that this new AI, called Toolformer, can teach itself to use computers, databases and search engines to improve the results it generates.

Some people have even quoted the OpenAI CEO: "AGI may come knocking on our door sooner than anyone expects."

But wait: does passing these two tests really show that AI has a "theory of mind"?

Could it be "pretending"?

For example, Liu Qun, a researcher at the Institute of Computing Technology, Chinese Academy of Sciences, commented after reading the research:

The AI has probably just learned to look as if it has a mind.

In this case, how does GPT-3.5 answer this series of questions?

In this regard, some netizens gave their own speculations:

These LLMs did not produce any consciousness. They are simply predicting an embedded semantic space based on the output of actual conscious humans.

In fact, the author himself also gave his own guess in the paper.

Large language models are becoming more and more complex, and better and better at generating and interpreting human language. They are gradually developing capabilities that resemble theory of mind.

But this does not mean that a model like GPT-3.5 truly has a theory of mind.

Rather, even though it was never designed into the AI system, such an ability can be acquired as a "by-product" of training.

Therefore, rather than asking whether GPT-3.5 really has a mind or only seems to, what deserves more reflection is the tests themselves.

It’s best to re-examine the validity of theory-of-mind tests and the conclusions psychologists have drawn based on them over the decades:

If AI can accomplish these tasks without theory of mind, why couldn't humans do the same?

In effect, a conclusion reached by testing AI ends up turning the criticism back on academic psychology (doge).

About the author

The paper has a single author, Michal Kosinski, an associate professor of organizational behavior at the Stanford University Graduate School of Business.

His work uses cutting-edge computational methods, AI and big data to study humans in today's digital environment (as Professor Chen Yiran put it, he is a professor of computational psychology).

Michal Kosinski holds a PhD in Psychology and an MA in Psychometrics and Social Psychology from the University of Cambridge.

Prior to his current position, he was a postdoctoral researcher in the Department of Computer Science at Stanford University, served as deputy director of the University of Cambridge Psychometrics Centre, and was a researcher in the Microsoft Research Machine Learning Group.

Michal Kosinski's Google Scholar profile currently shows about 18,000 citations.

Then again, do you think GPT-3.5 really has a mind?

Try GPT-3.5: https://platform.openai.com/playground
