We have put behind us another exciting year of deep learning, a branch of artificial intelligence (AI), filled with notable advances and, of course, controversy. As we wrap up 2022 and prepare to welcome 2023, here are the most notable overall trends in deep learning this year.
One theme that has remained constant in deep learning over the past few years is the drive to create larger and larger neural networks. This scaling has been made possible by the availability of compute resources, specialized AI hardware, large datasets, and scale-friendly architectures such as the transformer model.
So far, companies have been getting better results by scaling their neural networks up. In the past year, DeepMind released Gopher, a large language model (LLM) with 280 billion parameters; Google released the Pathways Language Model (PaLM) with 540 billion parameters and the Generalist Language Model (GLaM) with up to 1.2 trillion parameters; and Microsoft and NVIDIA released Megatron-Turing NLG, a 530-billion-parameter LLM.
One of the interesting aspects of scale is emergent abilities, where larger models succeed at tasks that are impossible for smaller ones. This phenomenon is particularly interesting in LLMs, which show promising results on a wider range of tasks and benchmarks as their scale grows.
However, it is worth noting that even in the largest models, some fundamental problems of deep learning remain unresolved (more on this later).
Many successful deep learning applications rely on humans labeling training examples, an approach known as supervised learning. But most of the data available on the internet does not come with the clean labels that supervised learning requires. Data annotation is expensive and slow, creating bottlenecks. That is why researchers have long sought advances in unsupervised learning, in which deep learning models are trained without human-annotated data.
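To make the distinction concrete, here is a minimal, hypothetical PyTorch sketch contrasting a supervised objective, which needs human-provided labels, with a self-supervised next-token objective, where the training signal comes from the raw text itself. The tiny model, dimensions, and random data are placeholders for illustration only, not the setup of any real system.

```python
import torch
import torch.nn as nn

vocab_size, embed_dim = 100, 32

# Supervised learning: every example needs a human-provided label.
texts = torch.randint(0, vocab_size, (8, 16))   # 8 token sequences (toy data)
labels = torch.randint(0, 2, (8,))              # human-annotated classes (expensive to collect)

# Self-supervised learning: the "label" is simply the next token in the raw
# text, so no human annotation is required.
inputs, targets = texts[:, :-1], texts[:, 1:]

model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),
    nn.Linear(embed_dim, vocab_size),           # predicts a distribution over the next token
)
logits = model(inputs)                           # (8, 15, vocab_size)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1)
)
loss.backward()
```

The point of the sketch is only that the self-supervised objective can be applied to any raw text scraped from the web, which is what lets LLMs train on datasets far larger than anything humans could annotate.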
This field has made tremendous progress in recent years, especially with LLMs, which are mostly trained on large sets of raw data collected from the internet. While LLMs continued to gain ground in 2022, we also saw other unsupervised learning techniques grow in popularity.
For example, text-to-image models made amazing progress this year. Models such as OpenAI's DALL-E 2, Google's Imagen, and Stability AI's Stable Diffusion demonstrate the power of unsupervised learning. Unlike older text-to-image models, which required well-annotated pairs of images and descriptions, these models use large datasets of loosely captioned images that already exist on the internet. The sheer size of their training datasets (possible only because no manual labeling is required) and the variability of the captioning schemes enable these models to find all kinds of intricate patterns between textual and visual information. As a result, they are far more flexible at generating images for a wide variety of descriptions.
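A common ingredient in this family of systems is learning a shared representation of images and their captions from loosely paired web data. The sketch below illustrates a CLIP-style contrastive objective, which DALL-E 2 and Stable Diffusion build on for their text understanding; the linear encoders, dimensions, and fixed temperature are placeholders, not the actual training procedure of any model named above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

batch, dim = 8, 64

# Stand-ins for real encoders over loosely captioned web images.
image_encoder = nn.Linear(2048, dim)   # e.g. pooled CNN/ViT features
text_encoder = nn.Linear(512, dim)     # e.g. pooled transformer features

image_features = F.normalize(image_encoder(torch.randn(batch, 2048)), dim=-1)
text_features = F.normalize(text_encoder(torch.randn(batch, 512)), dim=-1)

# Each image should match its own caption and no other caption in the batch.
logits = image_features @ text_features.t() / 0.07
targets = torch.arange(batch)
loss = (F.cross_entropy(logits, targets) +
        F.cross_entropy(logits.t(), targets)) / 2
loss.backward()
```

Because the objective only needs image-caption pairs as they are found on the web, noisy captions are acceptable, which is exactly why these models can be trained at the scale the paragraph above describes.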
Text-to-image generators have another interesting feature: they combine multiple data types in a single model. Being able to handle multiple modalities enables deep learning models to take on far more complex tasks.
Multimodality is very important for human and animal intelligence. For example, when you see a tree and hear the wind rustling in its branches, your brain can quickly connect them. Likewise, when you see the word "tree," you can quickly conjure up an image of a tree, remember the smell of pine trees after it rains, or recall other experiences you've had before.
Obviously, multimodality plays an important role in making deep learning systems more flexible. This is perhaps best demonstrated by DeepMind’s Gato, a deep learning model trained on a variety of data types, including images, text, and proprioceptive data. Gato excels at multiple tasks, including image captioning, interactive dialogue, controlling robotic arms, and playing games. This is in contrast to classic deep learning models that are designed to perform a single task.
Some researchers have gone so far as to suggest that systems like Gato are all we need to achieve artificial general intelligence (AGI). Although many scientists disagree with this view, what is certain is that multimodality has brought important achievements to deep learning.
Despite the impressive achievements of deep learning, some of the field's problems remain unsolved. These include causality, compositionality, common sense, reasoning, planning, intuitive physics, and abstraction and analogy.
These are some of the mysteries of intelligence that scientists across different fields are still studying. Purely scale- and data-based approaches to deep learning have helped make incremental progress on some of these problems, but have failed to provide definitive solutions.
For example, larger LLMs can maintain coherence and consistency over longer stretches of text. But they fail at tasks that require careful step-by-step reasoning and planning.
Similarly, text-to-image generators create stunning graphics but make basic mistakes when asked to draw images that require compositionality or follow complex descriptions.
These challenges are being discussed and explored by various scientists, including some of the pioneers of deep learning. The best known of them is Yann LeCun, the Turing Award-winning inventor of convolutional neural networks (CNNs), who recently wrote a lengthy essay on the limitations of LLMs that learn only from text. LeCun is working on a deep learning architecture that learns a model of the world and could address some of the challenges the field currently faces.
Deep learning has come a long way. But the more progress we make, the more aware we become of the challenges of creating truly intelligent systems. Next year will surely be just as exciting as this one.