Almost any discussion of today's AI applications has to mention one model architecture:
Transformer.
It abandons the traditional CNN and RNN and is built entirely on the attention mechanism.
Transformer not only gives AI applications the ability to write articles and poems, but also shines in multimodal tasks.
Especially since the release of ViT (Vision Transformer), the model barrier between CV and NLP has been broken, and a single Transformer model can handle multimodal tasks.
(Who wouldn't call that impressive?)
Although Transformer was originally designed for language tasks, it also has great potential for modeling the brain.
Sure enough, a science writer has written a blog post about how Transformer models the brain.
Let's take a look at what he said.
First of all, we need to trace how it evolved.
The Transformer first appeared five years ago. Its powerful performance is largely due to its self-attention mechanism.
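For readers who have not seen self-attention written out, here is a minimal NumPy sketch of single-head scaled dot-product self-attention. The sizes, the random weights and the function names are illustrative assumptions, not taken from any particular model: every token is compared with every other token, and the resulting weights decide how much of each token's value is mixed into the output.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max before exponentiating for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Single-head scaled dot-product self-attention.

    X has shape (n_tokens, d_model); the three weight matrices project it
    to queries, keys and values of shape (n_tokens, d_head).
    """
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # similarity of each token to every token
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ V, weights

# Hypothetical sizes and random weights, purely for illustration.
rng = np.random.default_rng(0)
n_tokens, d_model, d_head = 4, 8, 5
X = rng.normal(size=(n_tokens, d_model))
W_q, W_k, W_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))

out, weights = self_attention(X, W_q, W_k, W_v)
print(out.shape)      # one output vector per token
print(weights.shape)  # one attention weight per pair of tokens
```

A real Transformer stacks many such heads and layers and learns the projection matrices during training; the sketch only shows the core computation.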
As for how Transformer imitates the brain, continue reading.
In 2020, the research team of Sepp Hochreiter, a computer scientist based in Austria, used Transformer to rework the Hopfield neural network (HNN), a model of memory retrieval.
The Hopfield neural network was in fact proposed 40 years ago. The research team chose to rework it decades later for two reasons:
First, the network follows a general rule: neurons that are active at the same time establish strong connections with each other.
Second, the way a Hopfield neural network retrieves memories has certain similarities to the way Transformer carries out self-attention.
So the research team reworked the HNN to establish better connections between neurons, so that more memories can be stored and retrieved.
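The rule that co-active neurons build strong connections can be illustrated with a classical Hopfield network. The following is a minimal sketch, assuming +1/-1 neuron states, a Hebbian outer-product weight rule and synchronous updates; the patterns and function names are made up for the example and are not from Hochreiter's work.

```python
import numpy as np

def hebbian_weights(patterns):
    """Build the weight matrix from stored patterns (rows of +1/-1).

    Two neurons that are active together across patterns get a strong
    positive connection; self-connections are set to zero.
    """
    n = patterns.shape[1]
    W = patterns.T @ patterns / n
    np.fill_diagonal(W, 0.0)
    return W

def retrieve(W, cue, max_steps=10):
    """Update all neurons repeatedly until the state stops changing."""
    state = cue.copy()
    for _ in range(max_steps):
        new_state = np.where(W @ state >= 0, 1, -1)
        if np.array_equal(new_state, state):
            break
        state = new_state
    return state

# Two hypothetical, mutually orthogonal memories of 8 neurons each.
patterns = np.array([
    [1,  1, 1,  1, -1, -1, -1, -1],
    [1, -1, 1, -1,  1, -1,  1, -1],
])
W = hebbian_weights(patterns)

cue = patterns[0].copy()
cue[0] = -cue[0]            # corrupt one neuron of the first memory
recalled = retrieve(W, cue)
print(recalled)             # the first stored pattern is recovered
```

Starting from a corrupted cue, the network settles back onto the stored memory, which is what "memory retrieval" means here.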
Simply put, the rework integrates Transformer's attention mechanism into the HNN, turning the original HNN, which works with discrete states, into one that works with continuous states.
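The link between continuous-state retrieval and attention can be sketched as follows. In the continuous formulation, one retrieval step replaces the query with a softmax-weighted average of the stored patterns, which has the same form as attention with the stored patterns serving as both keys and values. The patterns, the query and the inverse temperature `beta` below are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    x = x - x.max()          # numerical stability
    e = np.exp(x)
    return e / e.sum()

def hopfield_update(query, stored, beta):
    """One continuous Hopfield retrieval step.

    The query is compared with every stored pattern (rows of `stored`),
    and the result is the softmax-weighted average of those patterns.
    """
    weights = softmax(beta * (stored @ query))
    return weights @ stored, weights

# Three hypothetical, mutually orthogonal stored patterns.
stored = np.array([
    [1,  1,  1,  1, -1, -1, -1, -1],
    [1, -1,  1, -1,  1, -1,  1, -1],
    [1,  1, -1, -1,  1,  1, -1, -1],
])

# A continuous-valued query: mostly the first pattern, mixed with the second.
query = 0.6 * stored[0] + 0.2 * stored[1]
retrieved, weights = hopfield_update(query, stored, beta=2.0)
print(np.round(weights, 3))    # almost all weight falls on the first pattern
print(np.round(retrieved, 2))  # close to the first stored pattern
```

A larger `beta` makes retrieval sharper, while a smaller one blends several memories together.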
The reworked Hopfield network can be integrated into a deep learning architecture as a layer, allowing the raw input data, intermediate results and so on to be stored and accessed.
In addition, Hopfield himself and Dmitry Krotov of the MIT-IBM Watson AI Lab have both said:
The Transformer-based Hopfield neural network is biologically plausible.
Although this resembles how the brain works to a certain extent, it is not accurate enough in some respects.
So computational neuroscientists Whittington and Behrens adapted Hochreiter's method and made some modifications to the reworked Hopfield network, further improving the model's performance on neuroscience tasks (such as replicating neural firing patterns in the brain).
Simply put, during encoding and decoding, the model no longer encodes memories as a linear sequence but as coordinates in a high-dimensional space.
Specifically, a TEM (Tolman-Eichenbaum Machine) is introduced into the model.
TEM is an associative memory system built to imitate the spatial navigation function of the hippocampus.
It can generalize spatial and non-spatial structural knowledge, predict the neuronal representations observed in spatial and associative memory tasks, and explain remapping phenomena in the hippocampus and entorhinal cortex.
Merging the versatile TEM with Transformer yields the TEM-transformer (TEM-t).
The TEM-t model is then trained in multiple different spatial environments. The structure of the environments is shown in the figure below.
TEM-t retains Transformer's self-attention mechanism. As a result, what the model learns can be transferred to new environments and used to predict new spatial structures.
Research also shows that, compared with TEM, TEM-t performs neuroscience tasks more efficiently and can handle more problems with fewer training samples.
Transformer is imitating the brain ever more closely. In other words, the development of Transformer is also steadily advancing our understanding of how brain functions operate.
Beyond that, Transformer can also improve our understanding of other functions of the brain.
For example, last year, computational neuroscientist Martin Schrimpf analyzed 43 different neural network models to see how well they predict measurements of human neural activity reported by functional magnetic resonance imaging (fMRI) and electrocorticography (ECoG).
Among them, the Transformer models could predict almost all of the variation found in the imaging.
Looking at it the other way around, perhaps we can also predict how the corresponding brain functions operate from the Transformer model.
In addition, computer scientists Yujin Tang and David Ha recently designed a model that deliberately feeds large amounts of data through a Transformer in a random, unordered way, simulating how the human body transmits sensory observations to the brain.
Like the human brain, this Transformer can successfully process a disordered flow of information.
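One reason attention can cope with unordered input is easy to show in code. The sketch below is not Tang and Ha's architecture; it is a simplified illustration under assumed sizes and random weights. When the queries are fixed and the keys and values are computed from each input independently, shuffling the inputs only reorders the terms of a weighted sum, so the output stays the same.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(inputs, Q, W_k, W_v):
    """Attention with fixed queries Q over a set of inputs.

    Keys and values are computed from each input row independently,
    so the order of the rows does not affect the result.
    """
    K, V = inputs @ W_k, inputs @ W_v
    weights = softmax(Q @ K.T / np.sqrt(K.shape[-1]), axis=-1)
    return weights @ V

# Hypothetical sizes and random weights, purely for illustration.
rng = np.random.default_rng(0)
n_inputs, d_in, d_head, n_queries = 6, 3, 4, 2
inputs = rng.normal(size=(n_inputs, d_in))   # e.g. six sensory readings
Q = rng.normal(size=(n_queries, d_head))     # fixed queries
W_k = rng.normal(size=(d_in, d_head))
W_v = rng.normal(size=(d_in, d_head))

shuffled = inputs[rng.permutation(n_inputs)]  # same readings, random order
out_original = attention_pool(inputs, Q, W_k, W_v)
out_shuffled = attention_pool(shuffled, Q, W_k, W_v)
print(np.allclose(out_original, out_shuffled))
```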
Although the Transformer model continues to improve, it is only a small step towards an accurate model of the brain, and more in-depth research is needed to get there.
If you want to learn more about how Transformer imitates the human brain, see the links below.
[1]https://www.quantamagazine.org/how-ai-transformers-mimic-parts-of-the-brain-20220912/
[2]https://www.pnas.org/doi/10.1073/pnas.2105646118
[3]https://openreview.net/forum?id=B8DVo9B1YE0