"The question of whether a machine can think is about as relevant as the question of whether a submarine can swim." ——Dijkstra
Long before the release of ChatGPT, the industry had already sensed the changes that large models would bring.
On October 14 last year, Melanie Mitchell and David C. Krakauer of the Santa Fe Institute posted a review on arXiv that comprehensively surveys the debate over whether large-scale pre-trained language models can understand language. The article lays out the arguments for and against, as well as the broader questions about the science of intelligence that the debate raises.
Paper link: https://arxiv.org/pdf/2210.13966.pdf
Published in: Proceedings of the National Academy of Sciences (PNAS)
TL;DR:
The main argument in support of "understanding" is that large language models can complete many tasks that seemingly require understanding.
The main argument against "understanding" is that, judged by human standards, the understanding of large language models is very fragile: for example, they fail to track subtle changes between prompts. Moreover, language models have no real-world experience with which to validate their knowledge, although multimodal language models may alleviate this problem.
The most critical problem is that no one yet has a reliable definition of what "understanding" is, and no one knows how to test the understanding ability of language models: tests designed for humans are not necessarily suitable for probing the understanding of large language models.
In short, large language models can understand language, but perhaps in a different way than humans.
The researchers argue that a new science of intelligence can be developed, one that studies different types of understanding in depth, identifies the strengths and limitations of each mode of understanding, and integrates the cognitive differences produced by these different forms of understanding.
Melanie Mitchell, the first author of the paper, is a professor at the Santa Fe Institute. She received her Ph.D. from the University of Michigan in 1990, where her advisors were Douglas Hofstadter (author of "Gödel, Escher, Bach: An Eternal Golden Braid") and John Holland. Her main research interests are analogical reasoning, complex systems, genetic algorithms, and cellular automata.
"What is understanding" has always puzzled philosophers, cognitive scientists and educators. Researchers often use humans or other animals as a reference for "understanding ability".
Recently, with the rise of large-scale artificial intelligence systems, and especially the emergence of large language models (LLMs), a fierce debate has broken out in the AI community: can machines now be said to understand natural language, and thus to understand the physical and social situations that language describes?
This is not a purely academic debate. The degree to which, and the way in which, machines understand the world bears on how far humans can trust AI to act robustly and transparently in tasks such as driving cars, diagnosing diseases, caring for the elderly, and educating children.
The current debate shows that the academic community holds divergent views on understanding in intelligent systems, and the divergence is most pronounced between mental models that rely on "statistical correlations" and those that rely on "causal mechanisms".
Nevertheless, the AI research community has long held a general consensus about machine understanding: although AI systems exhibit seemingly intelligent behavior on many specific tasks, they do not understand the data they process the way humans do.
For example, facial recognition software does not understand that a face is part of a body, nor the role facial expressions play in social interactions, nor the nearly endless ways in which humans use the concept of a face.
Similarly, speech-to-text and machine translation programs don’t understand the language they process, and self-driving systems don’t understand the subtle eye contact or body language that drivers and pedestrians use to avoid accidents.
In fact, the oft-cited brittleness of these AI systems, namely their unpredictable errors and lack of robust generalization, is a key indicator used in assessing their understanding.
Over the past few years, large language models (LLMs) have surged in audience and influence within the field of artificial intelligence, and this has shifted some people's views on the prospects for machine understanding of language.
Large-scale pre-trained models, also called foundation models, are deep neural networks with billions to trillions of parameters (weights), obtained by "pre-training" on massive natural-language corpora (online text, online books, and so on).
During training, the model's task is to predict missing parts of an input sentence, which is why the method is called "self-supervised learning"; the resulting network is a complex statistical model of how the words and phrases in the training data relate to one another.
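To make that objective concrete, here is a minimal, hypothetical sketch of next-token prediction, the "predict the missing part of the sentence" task described above. It assumes PyTorch; the tiny recurrent model, vocabulary size, and random token batch are illustrative stand-ins, since real LLMs use Transformer architectures trained on web-scale corpora.

```python
# Minimal sketch (not the paper's code): the self-supervised objective behind
# LLM pre-training, i.e. predicting the next token of a sequence.
import torch
import torch.nn as nn

vocab_size, embed_dim = 1000, 64  # toy sizes, purely illustrative

class TinyLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, embed_dim, batch_first=True)
        self.head = nn.Linear(embed_dim, vocab_size)

    def forward(self, tokens):              # tokens: (batch, seq_len)
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)                 # logits: (batch, seq_len, vocab_size)

model = TinyLM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# A batch of random token-id sequences stands in for a text corpus.
tokens = torch.randint(0, vocab_size, (8, 32))
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # target = each next token

opt.zero_grad()
logits = model(inputs)
loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()
opt.step()
print(f"next-token prediction loss: {loss.item():.3f}")
```

Repeating this step over an enormous corpus is, in essence, all that "pre-training" means; everything the model learns about language comes from these token-level correlations.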
The resulting model can be used to generate natural language, fine-tuned for specific natural-language tasks, or further trained to better match "user intent"; yet exactly how language models accomplish these tasks remains a mystery, not only to laypeople but to scientists as well.
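As a small illustration of the generation side, the following sketch samples a continuation from a pre-trained causal language model. It assumes the Hugging Face transformers library is installed and uses the small public GPT-2 checkpoint as a stand-in for a far larger model; the prompt is arbitrary.

```python
# Minimal sketch, not from the paper: generating text with a pre-trained
# causal language model via the Hugging Face `transformers` library.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The question of whether a machine can think is"
inputs = tokenizer(prompt, return_tensors="pt")

# Autoregressive generation: the model repeatedly predicts the next token.
output_ids = model.generate(
    **inputs,
    max_new_tokens=30,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```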
The inner workings of neural networks are largely opaque, and even the researchers who build them have limited intuition for systems of this scale.
Neuroscientist Terrence Sejnowski describes the emergence of LLMs this way: after crossing a certain threshold, it is as if aliens have suddenly appeared that can communicate with us in an eerily human way. Only one thing is clear at the moment: large language models are not human. Some aspects of their behavior appear intelligent, but if it is not human intelligence, what is the nature of their intelligence?

## The pro-understanding camp vs. the anti-understanding camp

Although the performance of large language models is shocking, the most advanced LLMs remain prone to brittleness and non-human-like errors. However, network performance improves markedly as the number of parameters and the size of the training corpus grow, which has led some researchers in the field to claim that, given a large enough network and training dataset, a language model, or perhaps a multimodal version of one, will reach human-level intelligence and understanding. A new AI slogan has emerged: scale is all you need!

This claim reflects a debate within the AI research community about large language models.

One camp believes that language models can genuinely understand language and can reason in a general way (though not yet at a human level). For example, Google's LaMDA system is pre-trained on text and then fine-tuned on conversational tasks, which enables it to hold conversations with users across a very wide range of domains.

The other camp believes that large pre-trained models like GPT-3 or LaMDA, no matter how fluent their language output, cannot possess understanding, because these models have no lived experience and no mental model of the world. Language models are trained only to predict words in large text collections, so they learn the form of language, which is far from learning the meaning behind it. A system trained on language alone will never come close to human intelligence, even if it trains from now until the death of the universe; such systems are destined to achieve only shallow levels of understanding and will never approach the full-bodied thinking we see in humans.

Another scholar argues that intelligence, agency, and by extension understanding are the wrong frames for talking about these systems: language models are compressed repositories of human knowledge, more akin to a library or an encyclopedia than to an agent. For example, humans know what it means to be tickled into laughter because we have bodies; a language model can use the word "tickle", but it has obviously never had the sensation. Understanding "tickle" maps a word onto a feeling, not onto another word.

Those on the "LLMs don't understand" side argue that while the fluency of large language models is surprising, our surprise reflects a lack of intuition about what statistical correlations can produce at the scale of these models.

A 2022 survey of active researchers in the natural language processing community shows how sharply this debate divides the field. The 480 respondents were asked whether they agreed that LLMs could in principle understand language, that is, that "generative models trained only on text, given enough data and computational resources, could understand natural language in some non-trivial sense."
The results were almost evenly split, with 51% agreeing and 49% disagreeing.
While both sides of the "LLM understanding" debate have ample intuition to support their views, the currently available methods from cognitive science for probing understanding are not sufficient to answer such questions about LLMs.
In fact, some researchers have applied psychological tests originally designed to assess human understanding and reasoning mechanisms to LLMs and found that, in some cases, LLMs do show human-like responses on theory-of-mind tests and human-like abilities and biases on reasoning assessments.
While these tests are considered reliable proxies for assessing humans' ability to generalize, this may not be the case for artificial intelligence systems.
Large language models have a particular ability to learn correlations among the tokens in their training data and inputs, and they can use these correlations to solve problems; humans, by contrast, use compressed concepts that reflect their real-world experience.
When tests designed for humans are applied to LLMs, interpreting the results may rely on assumptions about human cognition that simply do not hold for these models.
To make progress, scientists will need to develop new benchmarks and probing methods to understand the mechanisms of different types of intelligence and understanding, including the new forms of "exotic, mind-like entities" we have created, and some related work is already underway.
As models grow larger and more capable systems are developed, the debate over understanding in LLMs highlights the need to "expand our science of intelligence" so that "understanding" is meaningful, whether applied to humans or to machines.
Neuroscientist Terrence Sejnowski points out that experts’ differing opinions on the intelligence of LLMs show that our old ideas based on natural intelligence are not enough.
If LLMs and related models succeed by exploiting statistical correlations at an unprecedented scale, perhaps this can be considered a "new form of understanding", one that enables extraordinary, superhuman predictive ability, as with DeepMind's AlphaZero and AlphaFold systems, which bring an "exotic" form of intuition to chess playing and protein structure prediction, respectively.
It can therefore be said that in recent years the field of artificial intelligence has created machines with new modes of understanding, most likely an entirely new category of concepts, and these concepts will continue to be enriched as we make progress toward the elusive nature of intelligence.
Problems that require vast amounts of encoded knowledge, and where performance matters most, will continue to favor large-scale statistical models, while problems for which we have limited knowledge but strong causal mechanisms will favor human intelligence.
The challenge for the future is to develop new scientific methods that can reveal in detail how different forms of intelligence understand, discern their strengths and limitations, and learn how to integrate these genuinely different modes of cognition.
References:
https://www.pnas.org/doi/10.1073/pnas.2215907120