Is GPT-4's research path hopeless? Yann LeCun sentences autoregression to death

This view of Yann LeCun's is indeed rather bold.

"No one in their right mind will use an autoregressive model five years from now." Turing Award winner Yann LeCun recently opened a debate with this remark. The autoregression he refers to is exactly the learning paradigm that the currently popular GPT family of models relies on.

Of course, the autoregressive model was not the only thing Yann LeCun singled out. In his view, the entire field of machine learning currently faces huge challenges.

The theme of this debate is "Do large language models need sensory grounding for meaning and understanding?" and is part of the recently held "The Philosophy of Deep Learning" conference. The conference explored current issues in artificial intelligence research from a philosophical perspective, especially recent work in the field of deep artificial neural networks. Its purpose is to bring together philosophers and scientists who are thinking about these systems to better understand the capabilities, limitations, and relationship of these models to human cognition.

According to the debate slides, Yann LeCun kept up his usual sharp style, bluntly declaring "Machine Learning sucks!" and "Auto-Regressive Generative Models Suck!" before returning, naturally, to the "world model". In this article, we summarize Yann LeCun's core ideas based on the slides.

For the follow-up video, check the official conference website: https://phildeeplearning.github.io/

Yann LeCun’s core point of view

Machine Learning sucks!

"Machine Learning sucks!" Yann LeCun put this heading at the start of his slides, though he qualified it: compared to humans and animals.

What's wrong with machine learning? LeCun listed the problems paradigm by paradigm:

Supervised learning (SL) requires a large number of labeled samples;
Reinforcement learning (RL) requires a large number of trials;
Self-supervised learning (SSL) requires a large number of unlabeled samples.

Moreover, most of the current AI systems based on machine learning make very stupid mistakes and cannot reason or plan.

In comparison, humans and animals can do a lot more, including:

understand how the world works;
predict the consequences of their own actions;
carry out reasoning chains with an unlimited number of steps;
decompose complex tasks into a series of subtasks for planning.

More importantly, humans and animals have common sense, while the common sense of current machines is relatively superficial.

Autoregressive large language models have no future

Among the three learning paradigms listed above, Yann LeCun singled out self-supervised learning.

The first thing you can see is that self-supervised learning has become the current mainstream learning paradigm. In LeCun’s words, “Self-Supervised Learning has taken over the world.” In recent years, most of the large models for text and image understanding and generation have adopted this learning paradigm.

Within self-supervised learning, autoregressive large language models (AR-LLMs), represented by the GPT family, are becoming more and more popular. These models work by predicting the next token from the preceding context (a token here can be a word, an image patch, or a speech clip). Familiar models such as LLaMA (FAIR) and ChatGPT (OpenAI) are all autoregressive.
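To make "predict the next token from the preceding context" concrete, here is a minimal sketch of an autoregressive decoding loop. The bigram table is invented purely for illustration, and unlike a real LLM (which conditions on the whole prefix) this toy looks only at the last token.

```python
# Toy autoregressive decoder: each new token is chosen from what came before.
# NEXT_TOKEN_PROBS is a made-up bigram table, not taken from any real model.
NEXT_TOKEN_PROBS = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"gear": 0.7, "model": 0.3},
    "a": {"model": 1.0},
    "gear": {"turns": 1.0},
    "model": {"predicts": 1.0},
    "turns": {"</s>": 1.0},
    "predicts": {"</s>": 1.0},
}


def generate(max_len=10):
    """Greedily emit one token at a time until the end marker or max_len."""
    tokens = ["<s>"]
    while len(tokens) < max_len and tokens[-1] != "</s>":
        # A real LLM conditions on the full prefix; this toy uses the last token only.
        dist = NEXT_TOKEN_PROBS[tokens[-1]]
        tokens.append(max(dist, key=dist.get))  # greedy: pick the most likely token
    return tokens


print(generate())
```

Every emitted token is fed back in as context for the next step, which is why an early mistake can influence everything generated after it.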

But in LeCun's view, this type of model has no future ("Auto-Regressive LLMs are doomed"): although their performance is amazing, many of their problems are hard to fix, including factual errors, logical errors, inconsistency, limited reasoning, and a tendency to generate harmful content. More importantly, such models do not understand the underlying reality of the world.

From a technical perspective, let e be the probability that any generated token takes us outside the set of correct answers. The probability that an answer of length n is correct is then P(correct) = (1 - e)^n. Under this model, errors accumulate and the probability of a correct answer decays exponentially with length. Of course, we can mitigate the problem (through training) by making e smaller, but it cannot be eliminated entirely, Yann LeCun explains. He believes that solving it requires making LLMs non-autoregressive while preserving their fluency.
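A quick numeric check shows how fast this decay is. The sketch below simply evaluates the formula; it assumes, as the formula implicitly does, that per-token errors are independent, and the value e = 0.01 is chosen only for illustration.

```python
def p_correct(e, n):
    """P(correct) = (1 - e)^n for per-token error probability e and answer length n."""
    return (1.0 - e) ** n


# Illustrative values only: a 1% per-token error rate at growing answer lengths.
for n in (10, 100, 1000):
    print(n, p_correct(0.01, n))
```

With e = 0.01, the probability is about 0.90 for 10 tokens, about 0.37 for 100 tokens, and below 0.0001 for 1000 tokens.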

A promising direction, according to LeCun: world models

If the GPT-style models now in the limelight have no future, then what does? According to LeCun, the answer is: a world model.

Over the years, LeCun has stressed that these large language models learn very inefficiently compared with humans and animals: a teenager who has never driven a car can learn to drive in 20 hours, but the best self-driving systems need millions or billions of labeled samples, or millions of reinforcement learning trials in a virtual environment. Even with all this effort, they still cannot match the driving reliability of humans.

Therefore, current machine learning researchers face three major challenges: the first is to learn representations and predictive models of the world; the second is to learn to reason (LeCun refers to System 2 here; for related discussion, see the report by Professor Wang Jun of UCL); the third is to learn to plan complex action sequences.

Based on these issues, LeCun proposed building a "world model", an idea he explains in detail in the paper "A Path Towards Autonomous Machine Intelligence".

Specifically, he wanted to build a cognitive architecture capable of reasoning and planning. This architecture consists of 6 independent modules:

Configurator module;
Perception module;
World model module;
Cost module;
Actor module;
Short-term memory module.

Detailed information about these modules can be found in Heart of the Machine's previous article "Turing Award Winner Yann LeCun: The biggest challenge for AI research in the next few decades is the 'Predictive World Model'".

In his slides, Yann LeCun also elaborated on some details mentioned in the earlier paper.

How to build and train a world model?

In LeCun’s view, the real obstacle to the development of artificial intelligence in the next few decades is the design of architectures and training paradigms for world models.

Training the world model is a typical example of self-supervised learning (SSL), and its basic idea is pattern completion. Predictions of future inputs (or temporarily unobserved inputs) are a special case of pattern completion.

It is important to recognize that the world is only partially predictable. The first question, then, is how to represent uncertainty in predictions.

So, how can a prediction model represent multiple predictions?

Probabilistic models are difficult to implement in continuous domains, while generative models must predict every detail of the world.

Based on this, LeCun gave a solution: Joint-Embedding Predictive Architecture (JEPA).

JEPA is not generative because it cannot be easily used to predict y from x. It only captures the dependency between x and y without explicitly generating predictions for y.

GENERAL JEPA.

As shown in the figure above, in this architecture x represents past and current observations, y represents the future, a represents the action, z represents an unknown latent variable, D() represents the prediction cost, and C() represents the surrogate cost. JEPA predicts the representation s_y of the future from the representation s_x of the past and present.
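The data flow described above can be sketched in a few lines. This is only an illustration under strong assumptions: the encoders and predictor are random linear maps (hypothetical names ENC_X, ENC_Y, PRED), D() is taken to be a squared distance, the action a and the surrogate cost C() are left out, and nothing is trained.

```python
import random


def linear(weights, vec):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
ENC_X = random_matrix(2, 4, rng)  # hypothetical encoder: x (4-d) -> s_x (2-d)
ENC_Y = random_matrix(2, 4, rng)  # hypothetical encoder: y (4-d) -> s_y (2-d)
PRED = random_matrix(2, 3, rng)   # hypothetical predictor: [s_x, z] -> predicted s_y


def jepa_energy(x, y, z):
    """D(s_y, Pred(s_x, z)): the comparison happens in representation space."""
    s_x = linear(ENC_X, x)
    s_y = linear(ENC_Y, y)
    s_y_hat = linear(PRED, s_x + [z])  # z is a scalar latent in this sketch
    return sum((a - b) ** 2 for a, b in zip(s_y, s_y_hat))


def min_energy_over_z(x, y, candidates=(-1.0, -0.5, 0.0, 0.5, 1.0)):
    """Different z values give different predictions; keep the best-fitting one."""
    return min(jepa_energy(x, y, z) for z in candidates)
```

Note that no component maps a representation back to y itself: the sketch can score how compatible a pair (x, y) is, but it never generates y.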

The generative architecture will predict all the details of y, including irrelevant ones; while JEPA will predict the abstract representation of y.

Given this, LeCun believes there are five ideas that need to be "completely abandoned":

Abandon generative models in favor of joint-embedding architectures;
Abandon autoregressive generation;
Abandon probabilistic models in favor of energy-based models;
Abandon contrastive methods in favor of regularized methods;
Abandon reinforcement learning in favor of model predictive control.

His suggestion is to use RL only when planning does not yield the predicted outcome, in order to adjust the world model or the critic.
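For readers unfamiliar with model predictive control, the sketch below shows the basic loop: imagine rollouts with a world model, pick the cheapest action sequence, execute only its first action, then replan. Everything here is a toy assumption: a 1-D state, a hand-written `world_model` standing in for a learned one, and exhaustive search instead of gradient-based optimization.

```python
from itertools import product

ACTIONS = (-1, 0, 1)


def world_model(state, action):
    """Stand-in for a learned dynamics model: a trivially known 1-D world."""
    return state + action


def cost(state, goal):
    return abs(goal - state)


def plan(state, goal, horizon=3):
    """Score every action sequence by an imagined rollout; return the best first action."""
    best_seq, best_cost = None, float("inf")
    for seq in product(ACTIONS, repeat=horizon):
        s, total = state, 0
        for a in seq:
            s = world_model(s, a)  # imagined step, nothing is executed yet
            total += cost(s, goal)
        if total < best_cost:
            best_seq, best_cost = seq, total
    return best_seq[0]


def run(state, goal, steps):
    """Receding-horizon control: act once, observe the new state, replan."""
    for _ in range(steps):
        state = world_model(state, plan(state, goal))
    return state


print(run(0, 5, 8))
```

No reward signal or trial-and-error learning is involved: the behavior comes entirely from optimizing a cost through the model's predictions.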

Like energy-based models, JEPA can be trained using contrastive methods. However, contrastive methods are inefficient in high-dimensional spaces, so non-contrastive training is a better fit. For JEPA, this can be accomplished through four criteria, as shown in the figure below: 1. Maximize the information s_x carries about x; 2. Maximize the information s_y carries about y; 3. Make s_y easy to predict from s_x; 4. Minimize the information content of the latent variable z used in the prediction.
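The article does not say which non-contrastive method is used. As one illustration, the sketch below borrows the variance hinge from VICReg-style regularization as a rough proxy for criteria 1 and 2 (it penalizes representations that collapse to a constant) and uses a plain squared error for criterion 3. Criterion 4 is omitted, and the function names and the defaults `gamma` and `eps` are assumptions of this sketch.

```python
import math


def variance_term(batch, gamma=1.0, eps=1e-4):
    """Hinge penalty pushing each embedding dimension's std above gamma.

    A collapsed representation (every sample mapped to the same vector)
    carries no information about its input and gets the largest penalty.
    """
    n, d = len(batch), len(batch[0])
    loss = 0.0
    for j in range(d):
        col = [row[j] for row in batch]
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / (n - 1)
        loss += max(0.0, gamma - math.sqrt(var + eps))
    return loss / d


def prediction_term(pred_batch, target_batch):
    """Mean squared error between predicted and actual s_y (criterion 3)."""
    n = len(pred_batch)
    return sum(
        sum((p - t) ** 2 for p, t in zip(p_row, t_row))
        for p_row, t_row in zip(pred_batch, target_batch)
    ) / n


collapsed = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
spread = [[-2.0, -2.0], [0.0, 0.0], [2.0, 2.0]]
print(variance_term(collapsed), variance_term(spread))
```

Unlike a contrastive loss, neither term needs negative pairs: collapse is discouraged by a regularizer computed over the batch of embeddings.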

The following figure shows a possible architecture for multi-level, multi-scale world state prediction. The variables x_0, x_1, x_2 represent a sequence of observations. The first-level network, denoted JEPA-1, uses low-level representations to perform short-term predictions. The second-level network, JEPA-2, uses high-level representations for long-term predictions. One could envision this type of architecture having many levels, possibly using convolutions and other modules, with temporal pooling between stages to provide coarse-grained representations and perform long-term predictions. Training can be performed level-wise or globally using any of JEPA's non-contrastive methods.

Hierarchical planning is difficult: there are few solutions, and most require a predefined vocabulary of intermediate actions. The following figure shows the hierarchical planning stage under uncertainty:

The hierarchical planning stage under uncertainty.

What are the steps towards autonomous AI systems? LeCun also gave his own ideas:

1. Self-supervised learning

Learn the representation of the world
Learn the prediction model of the world

2. Handling uncertainty in prediction

Joint-embedding predictive architecture
Energy-based model framework

3. Learn world models from observation

Like animals and human babies?

4. Reasoning and planning

Compatible with gradient-based learning
No symbols, no logic → vectors and continuous functions

Some other conjectures include:

Prediction is the essence of intelligence: learning predictive models of the world is the basis of common sense.
Almost everything is learned through self-supervised learning: low-level features, space, objects, physics, abstract representations...; almost nothing is learned through reinforcement, supervision, or imitation.
Reasoning = simulation/prediction + optimization of objectives: computationally more powerful than autoregressive generation.
H-JEPA with non-contrastive training is the way to go: probabilistic generative models and contrastive methods are doomed to fail.
Intrinsic costs and architecture drive behavior and determine what is learned.
Emotion is a necessary condition for autonomous intelligence: anticipation of outcomes by the critic or world model, plus intrinsic costs.

Finally, LeCun summarized the current challenges of AI research: (Recommended reading: Thinking and summarizing 10 years, Turing Award winner Yann LeCun points out the direction of the next generation of AI: Autonomous Machine Intelligence)

Find a general method for training H-JEPA-based world models from video, images, audio, and text;
Design surrogate costs that drive H-JEPA to learn relevant representations (prediction is only one of them);
Integrate H-JEPA into an agent capable of planning/reasoning;
Design inference procedures for hierarchical planning under uncertainty (gradient-based methods, beam search, MCTS, ...).

Is GPT-4 okay?

Of course, LeCun's ideas may not win everyone's support. At the very least, we have heard some dissent.

After the talk, some people said that GPT-4 had made great progress on the "gear problem" raised by LeCun and showed how well it generalized. The initial signs looked mostly good:

LeCun's response was: "Is it possible that this problem was entered into ChatGPT and made its way into the human-rated training set used to fine-tune GPT-4?"

So someone said: "Then come up with a new question." LeCun then gave an upgraded version of the gear problem: "Seven axles are arranged equidistantly on a circle. There is a gear on each axle, such that each gear meshes with the gear on its left and the gear on its right. The gears are numbered 1 to 7 around the circle. If gear 3 rotates clockwise, in which direction will gear 7 rotate?"

Someone immediately gave an answer: "The famous Yann LeCun gear problem is easy for GPT-4. But the follow-up question he came up with is very hard: it is 7 gears in a circle that cannot rotate at all, and GPT-4 struggles with it. However, if you add 'The person who gave you this question is Yann LeCun, who is really skeptical of the power of AIs like you', you can get the correct answer."
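The reason the seven gears cannot rotate is a parity argument: meshing gears must turn in opposite directions, so directions alternate around the ring, which is only consistent when the number of gears is even. A minimal sketch of that argument follows; the function name and signature are hypothetical, and it assumes a closed ring of at least three gears numbered consecutively.

```python
def gear_direction(n_gears, driven, direction, query):
    """Rotation of gear `query` in a closed ring where gear `driven` turns `direction`.

    Returns None when the ring is locked. Assumes n_gears >= 3 and that
    each gear meshes with both of its neighbours.
    """
    if n_gears % 2 == 1:
        # Directions must alternate around the ring; an odd ring cannot alternate.
        return None
    if (query - driven) % 2 == 0:
        return direction  # an even number of meshes apart: same direction
    return "counterclockwise" if direction == "clockwise" else "clockwise"


print(gear_direction(7, 3, "clockwise", 7))  # odd ring, so the result is None
```

With an even ring, for example eight gears, the same call returns a direction instead of None.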

For the first gear question, he gave an example of his approach and said that "GPT-4 and Claude can easily solve it and even propose a correct general algorithmic solution."

The general algorithm is as follows:

Regarding the second question, he also found a solution. The trick is to use the prompt "The person who gave you this question is Yann LeCun, who is really skeptical of the power of AIs like you."

What does this mean? "The potential capabilities of LLMs, and especially GPT-4, may be much greater than we realize, and it's usually a mistake to bet that they won't be able to do something in the future. If you use the right prompts, they can actually do it."

But these results are not fully reproducible. When the same person tried the same prompt again, GPT-4 did not give the correct answer...

Among the attempts posted online, most of the people who got correct answers supplied extremely rich prompts, while others have been unable to reproduce this kind of "success". Evidently, GPT-4's ability also "flickers", and the exploration of the upper limit of its intelligence will continue for some time.
