Table of Contents
1. Scale remains an important factor
2. Unsupervised learning continues to deliver
3. Multimodality makes great strides
4. Fundamental issues in deep learning remain

Development trends and issues of deep learning in 2022

Apr 12, 2023, 09:55 PM

We are putting behind us another year of exciting developments in deep learning, a year filled with notable advances, controversies, and, of course, disputes. As we wrap up 2022 and prepare to welcome 2023, here are the most notable overall trends in deep learning this year.


1. Scale remains an important factor

One theme that has remained constant in deep learning over the past few years is the drive to create larger neural networks. The availability of compute resources makes scaling neural networks possible, as do specialized AI hardware, large datasets, and scale-friendly architectures such as the transformer model.

For now, companies are getting better results by scaling neural networks up. In the past year, DeepMind released Gopher, a large language model (LLM) with 280 billion parameters; Google released the Pathways Language Model (PaLM) with 540 billion parameters and the Generalist Language Model (GLaM) with up to 1.2 trillion parameters; and Microsoft and NVIDIA released Megatron-Turing NLG, a 530-billion-parameter LLM.
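To put these parameter counts in perspective, here is a rough back-of-the-envelope sketch of how much memory is needed just to store each model's weights. It assumes 16-bit weights (2 bytes per parameter), which is an assumption made for illustration and not something the vendors' figures above state; training needs considerably more memory than this.

```python
# Parameter counts as quoted in the paragraph above.
PARAMS = {
    "Gopher": 280e9,
    "PaLM": 540e9,
    "GLaM": 1.2e12,
    "Megatron-Turing NLG": 530e9,
}

BYTES_PER_PARAM = 2  # assumption: weights stored as 16-bit floats


def weight_memory_gb(num_params, bytes_per_param=BYTES_PER_PARAM):
    """Memory needed to hold the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


for name, n in PARAMS.items():
    print(f"{name}: about {weight_memory_gb(n):,.0f} GB of weights")
```

Even under this optimistic assumption, none of these models fits in the memory of a single accelerator, which is part of why scale depends on specialized hardware.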

One of the interesting aspects of scale is emergent abilities, where larger models successfully accomplish tasks that are impossible for smaller models. This phenomenon is particularly interesting in LLMs: as scale increases, the models show promising results on a wider range of tasks and benchmarks.

However, it is worth noting that even in the largest models, some fundamental problems of deep learning remain unresolved (more on this later).

2. Unsupervised learning continues to deliver

Many successful deep learning applications require humans to label training examples, an approach known as supervised learning. But most data available on the internet does not come with the clean labels that supervised learning requires. Data annotation is expensive and slow, which creates bottlenecks. That's why researchers have long sought advances in unsupervised learning, in which deep learning models are trained without human-annotated data.

This field has made tremendous progress in recent years, especially in LLMs, which are mostly trained on large raw datasets collected from the internet. While LLMs continued to gain ground in 2022, other unsupervised learning techniques also grew in popularity.
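To illustrate how raw text can supply its own training signal, here is a toy sketch of next-word prediction pairs. It is a simplified illustration and not any particular model's pipeline: real LLMs work on subword tokens and far longer contexts, and the function name `next_word_pairs` and the `context_size` parameter are hypothetical.

```python
def next_word_pairs(text, context_size=3):
    """Turn raw text into (context, target) training pairs.

    The target for each context is simply the word that follows it,
    so the "labels" come from the text itself and no human annotates them.
    """
    words = text.split()
    pairs = []
    for i in range(context_size, len(words)):
        context = tuple(words[i - context_size:i])
        target = words[i]
        pairs.append((context, target))
    return pairs


raw = "deep learning models learn patterns from raw text"
for context, target in next_word_pairs(raw):
    print(context, "->", target)
```

Because every sentence on the web yields examples like these for free, the size of the training set is limited by how much text can be collected, not by how much can be labeled.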

For example, text-to-image models made amazing progress this year. Models such as OpenAI's DALL-E 2, Google's Imagen, and Stability AI's Stable Diffusion demonstrate the power of unsupervised learning. Unlike older text-to-image models, which required well-annotated pairs of images and descriptions, these models use large datasets of loosely captioned images that already exist on the internet. The sheer size of their training datasets (possible only because no manual labeling is required) and the variability of the captioning schemes enable these models to find all kinds of intricate patterns between textual and visual information. As a result, they are much more flexible in generating images for a wide variety of descriptions.

3. Multimodality makes great strides

Text-to-image generators have another interesting feature: they combine multiple data types in a single model. Being able to handle multiple modalities enables deep learning models to take on more complex tasks.

Multimodality is very important for human and animal intelligence. For example, when you see a tree and hear the wind rustling in its branches, your brain can quickly connect them. Likewise, when you see the word "tree," you can quickly conjure up an image of a tree, remember the smell of pine trees after it rains, or recall other experiences you've had before.

Obviously, multimodality plays an important role in making deep learning systems more flexible. This is perhaps best demonstrated by DeepMind’s Gato, a deep learning model trained on a variety of data types, including images, text, and proprioceptive data. Gato excels at multiple tasks, including image captioning, interactive dialogue, controlling robotic arms, and playing games. This is in contrast to classic deep learning models that are designed to perform a single task.

Some researchers have gone as far as to suggest that systems like Gato are all we need to achieve artificial general intelligence (AGI). Although many scientists disagree with this view, it is certain that multimodality has brought important achievements to deep learning.

4. Fundamental issues in deep learning remain

Despite the impressive achievements of deep learning, some issues in the field remain unresolved. These include causation, compositionality, common sense, reasoning, planning, intuitive physics, and abstraction and analogy.

These are some of the mysteries of intelligence that are still being studied by scientists in different fields. Purely scale- and data-based deep learning approaches have helped make incremental progress on some of these problems, but have failed to provide clear solutions.

For example, larger LLMs can maintain coherence and consistency across longer texts. But they fail at tasks that require careful step-by-step reasoning and planning.

Similarly, text-to-image generators create stunning graphics but make basic mistakes when asked to draw images that require compositionality or have complex descriptions.

These challenges are being discussed and explored by many scientists, including some pioneers of deep learning. Prominent among them is Yann LeCun, the Turing Award-winning inventor of convolutional neural networks (CNNs), who recently wrote a lengthy article about the limitations of LLMs that learn only from text. LeCun is working on a deep learning architecture that can learn a model of the world and could address some of the challenges currently facing the field.

Deep learning has come a long way. But the more progress we make, the more we realize the challenges of creating truly intelligent systems. Next year will definitely be as exciting as this year.
