Paper: Recent Advances in Deep Learning: An Overview
Paper address: https://arxiv.org/pdf/1807.08169v1.pdf

Abstract: Deep learning is one of the latest trends in machine learning and artificial intelligence research, and one of the most popular research directions today. Deep learning methods have brought revolutionary advances in computer vision and machine learning. New deep learning techniques are constantly being created, surpassing state-of-the-art machine learning and even existing deep learning techniques. In recent years, many major breakthroughs have been made in this field around the world. Because deep learning is evolving so rapidly, its progress is difficult to follow, especially for new researchers. In this article, we briefly discuss recent advances in deep learning.

1. Introduction

The term "deep learning" (DL) was first introduced into machine learning (ML) in 1986 and was later applied to artificial neural networks (ANN) in 2000. Deep learning methods consist of multiple layers that learn data features at multiple levels of abstraction. DL methods allow computers to learn complex concepts from relatively simple ones. For artificial neural networks (ANN), deep learning (DL), also known as hierarchical learning, refers to the precise assignment of credit across multiple computational stages in order to transform the aggregate activations of the network. To learn complex functions, deep architectures with multiple levels of abstraction, i.e. non-linear operations, are used, such as ANNs with many hidden layers. To summarize precisely, deep learning is a subfield of machine learning that uses multiple levels of nonlinear information processing and abstraction for supervised or unsupervised feature learning, representation, classification, and pattern recognition.

Deep learning, i.e. representation learning, is a branch or subfield of machine learning. Most people agree that modern deep learning methods have been developed since 2006. This article is a review of the latest deep learning techniques and is mainly recommended to researchers who are about to enter this field. It covers the basic ideas, main methods, latest developments, and applications of DL.

Review papers are very beneficial, especially to new researchers in a particular field. A research field with great near-term value and broad application areas is usually difficult to track in real time. Scientific research is an attractive career these days because knowledge and education are easier to share and obtain than ever before. The only reasonable assumption about a technological research trend is that it will improve in all respects; an overview of a field written a few years ago may already be out of date.

Considering the popularity and growth of deep learning in recent years, we provide a brief overview of deep learning and neural networks (NN), as well as their main progress and major breakthroughs in recent years. We hope this article will help many novice researchers gain a comprehensive understanding of recent deep learning research and techniques, and guide them to get started in the right way. At the same time, we hope to pay tribute to the top DL and ANN researchers of this era: Geoffrey Hinton, Juergen Schmidhuber, Yann LeCun, Yoshua Bengio, and many other researchers whose work built modern artificial intelligence (AI).
It is also important for us to follow their work in order to track the best current advances in DL and ML research. In this paper, we first briefly describe past review papers on deep learning models and methods. We then describe recent advances in this area: deep learning (DL) methods, deep architectures (i.e., deep neural networks, DNN), and deep generative models (DGM), followed by important regularization and optimization methods. Two short sections summarize open-source DL frameworks and important DL applications. We discuss the current state and future of deep learning in the last two sections, Discussion and Conclusion.

2. Related Research

In the past few years, many review papers on deep learning have been published. They describe DL methods and methodologies in a good way, as well as their applications and directions for future research. Here, we briefly introduce some excellent review papers on deep learning.

Young et al. (2017) discuss DL models and architectures, primarily for natural language processing (NLP). They demonstrate DL applications in different NLP domains, compare DL models, and discuss possible future trends.

Zhang et al. (2017) discuss the current best deep learning techniques for front-end and back-end speech recognition systems.

Zhu et al. (2017) review recent progress of DL in remote sensing. They also discuss open-source DL frameworks and other technical details of deep learning.

Wang et al. (2017) describe the evolution of deep learning models in chronological order. This short article briefly introduces the models and their breakthroughs in DL research. It takes an evolutionary approach to understanding the origins of deep learning and explains the optimization of neural networks and future research.
Goodfellow et al. (2016) discuss deep networks and generative models in detail. Starting from the basics of machine learning (ML) and the advantages and disadvantages of deep architectures, they review recent DL research and development and summarize its applications.
LeCun et al. (2015) gave an overview of deep learning (DL) models from convolutional neural networks (CNN) and recurrent neural networks (RNN). They describe DL from a representation learning perspective, showing how DL techniques work, how they can be used successfully in various applications, and how they can learn to predict the future based on unsupervised learning (UL). They also point out the major advances in DL in the bibliography.
Schmidhuber (2015) gave an overview of deep learning from CNN, RNN and deep reinforcement learning (RL). He emphasizes RNNs for sequence processing, while pointing out the limitations of basic DL and NN, as well as tips for improving them.
Nielsen (2015) describes the details of neural networks with code and examples. He also discusses deep neural networks and deep learning to some extent.
Schmidhuber (2014) discusses the history and progress of time series-based neural networks, classification using machine learning methods, and the use of deep learning in neural networks.
Deng and Yu (2014) describe deep learning categories and techniques, as well as applications of DL in several areas.
Bengio (2013) provides a brief overview of DL algorithms from a representation learning perspective, i.e., supervised and unsupervised networks, optimization and training models. He focuses on many challenges of deep learning, such as: scaling algorithms for larger models and data, reducing optimization difficulties, designing efficient scaling methods, etc.
Bengio et al. (2013) discuss representation learning and feature learning, i.e., deep learning. They explore various methods and models from the perspectives of applications, techniques, and challenges.
Deng (2011) provides an overview of deep structured learning and its architecture from the perspective of information processing and related fields.
Arel et al. (2010) provide a brief overview of DL technology in recent years.
Bengio (2009) discusses deep architectures, namely neural networks and generative models for artificial intelligence.
Recent papers on deep learning (DL) discuss its key topics from multiple perspectives, which is very useful for DL researchers. However, DL is a booming field, and many new techniques and architectures have been proposed since the most recent overview papers. In addition, previous papers approach the subject from different angles. Our paper is aimed primarily at learners and novices who are new to the field. To that end, we strive to provide a foundation and clear concepts of deep learning for new researchers and anyone interested in the area.
In this section, we discuss the main deep learning (DL) methods, which are derived from machine learning and artificial neural networks (ANN); artificial neural networks are the most commonly used form of deep learning.
Artificial neural networks (ANN) have made great progress, which has also led to other deep models. The first generation of artificial neural networks consisted of simple perceptron layers that could only perform limited simple calculations. The second generation used backpropagation to update the weights of neurons based on the error rate. Then support vector machines (SVM) came to the fore and surpassed ANNs for a while. To overcome the limitations of backpropagation, restricted Boltzmann machines (RBM) were proposed to make learning easier. Other techniques and neural networks also emerged, such as feedforward neural networks (FNN), convolutional neural networks (CNN), and recurrent neural networks (RNN), as well as deep belief networks, autoencoders, and so on. Since then, ANNs have been improved and redesigned in various ways and for various purposes.
Schmidhuber (2014), Bengio (2009), Deng and Yu (2014), Goodfellow et al. (2016), and Wang et al. (2017) give detailed overviews of the evolution and history of deep neural networks (DNN) and deep learning (DL). In most cases, deep architectures are multi-layer non-linear repetitions of simple architectures, which allows highly complex functions to be computed from the input.
Deep neural networks have achieved great success in supervised learning. Additionally, deep learning models have been very successful in unsupervised, hybrid, and reinforcement learning.
Supervised learning is applied when the data are labeled and the task is classification or numerical prediction. LeCun et al. (2015) provide a streamlined explanation of supervised learning methods and the formation of deep structures. Deng and Yu (2014) mention and explain many deep networks for supervised and hybrid learning, such as the deep stacking network (DSN) and its variants. Schmidhuber (2014) covers all neural networks, from early neural networks to the more recent successes of convolutional neural networks (CNN), recurrent neural networks (RNN), long short-term memory (LSTM), and their improvements.
When the input data are unlabeled, unsupervised learning methods can be applied to extract features from the data and to cluster or label them. LeCun et al. (2015) predict that the future of deep learning lies in unsupervised learning. Schmidhuber (2014) also describes neural networks for unsupervised learning. Deng and Yu (2014) briefly introduce deep architectures for unsupervised learning and explain deep autoencoders in detail.
Reinforcement learning uses a reward-and-punishment system to predict the next action of a learning model. It is mainly used in games and robotics to solve sequential decision-making problems. Schmidhuber (2014) describes advances of deep learning in reinforcement learning (RL) and the application of deep feedforward neural networks (FNN) and recurrent neural networks (RNN) in RL. Li (2017) discusses deep reinforcement learning (DRL), its architectures (such as the Deep Q-Network, DQN), and its applications in various fields.
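To make the value-learning idea concrete, below is a minimal, hedged sketch of a DQN-style temporal-difference update written in PyTorch. The state dimension, network sizes, and hyperparameters are illustrative assumptions, not the configurations used in the works cited above.

```python
import torch
import torch.nn as nn

# Sketch of a DQN-style value network and its TD target (illustrative sizes:
# 4-dimensional state, 2 actions, small fully connected networks).
q_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
target_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
target_net.load_state_dict(q_net.state_dict())   # target network starts as a copy

gamma = 0.99                                      # discount factor (assumed)
state = torch.randn(1, 4)
next_state = torch.randn(1, 4)
action = torch.tensor([0])
reward = torch.tensor([1.0])

# Q(s, a) for the taken action, and the bootstrapped TD target r + gamma * max_a' Q_target(s', a')
q_value = q_net(state).gather(1, action.view(-1, 1)).squeeze(1)
with torch.no_grad():
    td_target = reward + gamma * target_net(next_state).max(dim=1).values
loss = nn.functional.mse_loss(q_value, td_target)  # minimized with SGD/Adam in practice
```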
Mnih et al. (2016) proposed a DRL framework for DNN optimization using asynchronous gradient descent.
van Hasselt et al. (2015) proposed a DRL architecture using a deep neural network (DNN).
5. Deep Neural Networks

In this section, we briefly discuss deep neural networks (DNN) and their recent improvements and breakthroughs. Neural networks function similarly to the human brain: they are mainly composed of neurons and the connections between them. When we speak of deep neural networks, we assume there are quite a few hidden layers that can be used to extract features and compute complex functions from the input. Bengio (2009) explains deep-structured neural networks such as convolutional neural networks (CNN), autoencoders (AE), and their variants. Deng and Yu (2014) provide a detailed introduction to some neural network architectures such as AE and its variants. Goodfellow et al. (2016) introduce and technically explain deep feedforward networks, convolutional networks, recurrent networks, and their improvements. Schmidhuber (2014) presents a complete history of neural networks, from early neural networks to recent successful techniques.
5.1 Autoencoder

Autoencoders (AE) are neural networks (NN) whose output is their input. An AE takes the raw input, encodes it into a compressed representation, and then decodes it to reconstruct the input. In a deep AE, the lower hidden layers are used for encoding, the higher hidden layers for decoding, and error backpropagation is used for training.
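As a minimal sketch (not any specific published architecture), the following PyTorch autoencoder shows the encode-compress-decode structure described above; the layer sizes are illustrative assumptions.

```python
import torch.nn as nn

# Minimal deep autoencoder: lower layers encode the input into a compressed
# code, upper layers decode it back; training backpropagates the reconstruction error.
class AutoEncoder(nn.Module):
    def __init__(self, in_dim=784, code_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, code_dim))
        self.decoder = nn.Sequential(
            nn.Linear(code_dim, 256), nn.ReLU(),
            nn.Linear(256, in_dim))

    def forward(self, x):
        code = self.encoder(x)      # compressed representation
        return self.decoder(code)   # reconstruction of the input

# Training minimizes e.g. nn.MSELoss()(model(x), x) over unlabeled data.
```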
5.1.1 Variational Autoencoder
Variational autoencoders (VAE) can be regarded as decoders. VAEs are built on standard neural networks and can be trained with stochastic gradient descent (Doersch, 2016).
5.1.2 Stacked Denoising Autoencoder
In early autoencoders (AE), the encoding layer has a smaller (narrower) dimension than the input layer. In stacked denoising autoencoders (SDAE), the encoding layer is wider than the input layer (Deng and Yu, 2014).
5.1.3 Transformational Autoencoders
Deep autoencoders (DAE) can be transformational, that is, the features extracted by multi-layer nonlinear processing can be changed according to the needs of the learner. Transforming autoencoders (TAE) can use both the input vector and the target output vector to apply transformation-invariance properties and guide the code in a desired direction (Deng and Yu, 2014).
5.2 Convolutional Neural Network

Four basic ideas underlie convolutional neural networks (CNN): local connections, shared weights, pooling, and the use of many layers. The first part of a CNN consists of convolutional and pooling layers, and the latter part is mainly fully connected layers. Convolutional layers detect local conjunctions of features, while pooling layers merge similar features into one. CNNs use convolution instead of matrix multiplication in the convolutional layers.
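The following minimal PyTorch sketch illustrates these four ideas: convolution (local connections with shared weights), pooling, stacking of layers, and a fully connected classifier at the end. The channel counts and the assumed 32x32 RGB input are illustrative, not taken from any particular paper.

```python
import torch.nn as nn

# Minimal CNN sketch: convolution + pooling blocks followed by a fully connected layer.
cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                 # pooling merges similar nearby features
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),       # fully connected classifier for 32x32 inputs, 10 classes
)
```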
Krizhevsky et al. (2012) proposed a deep convolutional neural network (CNN) architecture, also known as AlexNet, which was a major breakthrough in deep learning (DL). The network consists of 5 convolutional layers and 3 fully connected layers. It uses graphics processing units (GPU) for the convolution operations, the rectified linear unit (ReLU) as the activation function, and dropout to reduce overfitting.
Iandola et al. (2016) proposed a small CNN architecture called "SqueezeNet".
Szegedy et al. (2014) proposed a deep CNN architecture named Inception. Dai et al. (2017) proposed improvements to Inception-ResNet.
Redmon et al. (2015) proposed a CNN architecture called YOLO (You Only Look Once) for uniform and real-time object detection.
Zeiler and Fergus (2013) proposed a method to visualize activations within CNNs.
Gehring et al. (2017) proposed a CNN architecture for sequence-to-sequence learning.
Bansal et al. (2017) proposed PixelNet, which uses pixels as its representation.
Goodfellow et al. (2016) explain the basic architecture and ideas of CNN. Gu et al. (2015) provide a good overview of recent advances in CNNs, multiple variants of CNNs, architectures of CNNs, regularization methods and capabilities, and applications in various fields.
5.2.1 Deep Max Pooling Convolutional Neural Network
A max-pooling convolutional neural network (MPCNN) mainly operates with convolution and max pooling, especially in digital image processing. An MPCNN usually consists of three kinds of layers besides the input layer. Convolutional layers take input images and generate feature maps, then apply a nonlinear activation function. Max-pooling layers down-sample the image and keep the maximum value of each sub-region. Fully connected layers perform linear multiplications. In a deep MPCNN, convolution and max pooling are used periodically after the input layer, followed by fully connected layers.
5.2.2 Very deep convolutional neural network
Simonyan and Zisserman (2014) proposed a very deep convolutional neural network (VDCNN) architecture, also known as VGG Net. VGG Net uses very small convolutional filters with a depth of 16-19 weight layers. Conneau et al. (2016) proposed another VDCNN architecture for text classification that uses small convolutions and pooling. They claim this VDCNN architecture is the first to be used in text processing, and it works at the character level. The architecture consists of 29 convolutional layers.
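A hedged sketch of the VGG idea, stacks of very small 3x3 filters followed by pooling, is shown below; the channel counts and block depths are illustrative assumptions rather than the exact VGG-16/19 configuration.

```python
import torch.nn as nn

# VGG-style block: several 3x3 convolutions followed by 2x2 max pooling.
def vgg_block(in_ch, out_ch, n_convs):
    layers = []
    for i in range(n_convs):
        layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, kernel_size=3, padding=1),
                   nn.ReLU(inplace=True)]
    layers.append(nn.MaxPool2d(2))
    return nn.Sequential(*layers)

# Stacking such blocks is how the full network reaches 16-19 weight layers.
features = nn.Sequential(vgg_block(3, 64, 2), vgg_block(64, 128, 2), vgg_block(128, 256, 3))
```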
5.3 Network In Network

Lin et al. (2013) proposed Network In Network (NIN). NIN replaces the convolutional layers of a traditional convolutional neural network (CNN) with micro neural networks that have more complex structures. It uses multilayer perceptrons (MLPConv) as these micro networks and a global average pooling layer in place of the fully connected layers. A deep NIN architecture can be built by stacking several NIN structures.
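A minimal sketch of the NIN idea follows: 1x1 convolutions act as a small per-location MLP ("MLPConv"), and global average pooling replaces the fully connected classifier. The channel counts and the assumed 10-class output are illustrative.

```python
import torch.nn as nn

# MLPConv block: a normal convolution followed by 1x1 convolutions,
# which apply a small MLP at every spatial location.
def mlp_conv(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(),
        nn.Conv2d(out_ch, out_ch, kernel_size=1), nn.ReLU(),
        nn.Conv2d(out_ch, out_ch, kernel_size=1), nn.ReLU(),
    )

nin = nn.Sequential(
    mlp_conv(3, 96),
    mlp_conv(96, 10),            # last block outputs one feature map per class
    nn.AdaptiveAvgPool2d(1),     # global average pooling instead of fully connected layers
    nn.Flatten(),
)
```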
5.4 Region-based Convolutional Neural Network

Girshick et al. (2014) proposed the region-based convolutional neural network (R-CNN), which uses regions for recognition. R-CNN uses regions to localize and segment objects. The architecture consists of three modules: class-independent region proposals that define a set of candidate regions, a large convolutional neural network (CNN) that extracts features from each region, and a set of class-specific linear support vector machines (SVM).
5.4.1 Fast R-CNN
Girshick (2015) proposed the fast region-based convolutional network (Fast R-CNN). This method builds on the R-CNN architecture to produce results more quickly. Fast R-CNN consists of convolutional and pooling layers, region proposal layers, and a series of fully connected layers.
5.4.2 Faster R-CNN
Ren et al. (2015) proposed the faster region-based convolutional neural network (Faster R-CNN), which uses a region proposal network (RPN) for real-time object detection. The RPN is a fully convolutional network that generates region proposals accurately and efficiently (Ren et al., 2015).
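As a usage-level illustration only (not the authors' original implementation), torchvision provides a Faster R-CNN model with a ResNet-50 FPN backbone. The sketch below assumes a reasonably recent torchvision release and an arbitrary input size.

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

# Faster R-CNN = CNN backbone + RPN + RoI heads, wrapped in one torchvision model.
model = fasterrcnn_resnet50_fpn()        # constructor arguments vary slightly by version
model.eval()

images = [torch.rand(3, 480, 640)]       # list of 3xHxW tensors with values in [0, 1]
with torch.no_grad():
    outputs = model(images)              # per-image dicts: 'boxes', 'labels', 'scores'
print(outputs[0]["boxes"].shape)
```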
5.4.3 Mask R-CNN
He et al. (2017) proposed the region-based masked convolutional network (Mask R-CNN) for instance object segmentation. Mask R-CNN extends the R-CNN architecture and adds an additional branch for predicting object masks.
5.4.4 Multi-Expert R-CNN
Lee et al. (2017) proposed the multi-expert region-based convolutional neural network (ME R-CNN), which builds on the Fast R-CNN architecture. ME R-CNN generates regions of interest (RoI) from both selective and exhaustive search. It uses a per-RoI multi-expert network instead of a single per-RoI network, where each expert has the same architecture as the fully connected layers of Fast R-CNN.
5.5 Residual Network

The residual network (ResNet) proposed by He et al. (2015) consists of 152 layers. ResNet has low error and is easy to train thanks to residual learning, and deeper ResNets achieve better performance. In the field of deep learning, ResNet is considered an important advance.
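A minimal residual block sketch is given below: the stacked layers learn a residual function F(x) and the identity shortcut adds the input back, which is what makes very deep ResNets trainable. The channel count and exact layer arrangement are illustrative assumptions.

```python
import torch.nn as nn

# One residual block: output = ReLU(F(x) + x), with F built from two 3x3 convolutions.
class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)    # identity shortcut carries x straight through
```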
5.5.1 ResNet in ResNet
Targ et al. (2016) proposed ResNet in ResNet (RiR), which combines ResNets and standard convolutional neural networks (CNN) in a deep dual-stream architecture.
5.5.2 ResNeXt
Xie et al. (2016) proposed the ResNeXt architecture. ResNeXt exploits ResNets and reuses the split-transform-merge strategy.

5.6 Capsule Network

Sabour et al. (2017) proposed the capsule network (CapsNet), an architecture with two convolutional layers and one fully connected layer. A CapsNet usually contains several convolutional layers with a capsule layer at the end. CapsNet is considered one of the latest breakthroughs in deep learning because it is designed to address the limitations of convolutional neural networks. It uses layers of capsules instead of neurons: activated lower-level capsules make predictions, and a higher-level capsule becomes active when multiple predictions agree. A routing-by-agreement mechanism is used between these capsule layers. Hinton later proposed EM routing, which improves CapsNet using the expectation-maximization (EM) algorithm.

5.7 Recurrent Neural Network

Recurrent neural networks (RNN) are better suited to sequential inputs such as speech and text, and to generating sequences. A recurrent hidden unit, when unrolled in time, can be thought of as a very deep feedforward network with shared weights. RNNs used to be difficult to train because of vanishing and exploding gradient problems, and many improvements have since been proposed to address this. Goodfellow et al. (2016) analyze the details of recurrent and recursive neural networks and architectures, as well as related gating and memory networks. Karpathy et al. (2015) use character-level language models to analyze and visualize predictions, characterize training dynamics, and study the error types of RNNs and their variants (such as LSTM).
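To make the "same weights at every time step" point concrete, here is a minimal character-level RNN sketch in PyTorch; the vocabulary size, embedding size, and hidden size are illustrative assumptions.

```python
import torch
import torch.nn as nn

# The same recurrent unit (and weights) is reused at every time step,
# so an unrolled RNN behaves like a very deep feedforward network.
vocab_size, embed_dim, hidden_dim = 65, 32, 128
embed = nn.Embedding(vocab_size, embed_dim)
rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
to_logits = nn.Linear(hidden_dim, vocab_size)

chars = torch.randint(0, vocab_size, (1, 20))   # a batch of 20 character ids
hidden_states, _ = rnn(embed(chars))            # shared weights across all 20 steps
next_char_logits = to_logits(hidden_states)     # predict the next character at each step
```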
Józefowicz et al. (2016) explore the limitations of RNN models and language models.
5.7.1 RNN-EM
Peng and Yao (2015) proposed using an external memory (RNN-EM) to improve the memory capacity of RNNs. They claim to achieve state-of-the-art performance in language understanding, better than other RNNs.

5.7.2 GF-RNN
Chung et al. (2015) proposed the gated feedback recurrent neural network (GF-RNN), which extends standard RNNs by stacking multiple recurrent layers with global gating units.
5.7.3 CRF-RNN
Zheng et al. (2015) proposed conditional random fields as recurrent neural networks (CRF-RNN), which combines convolutional neural networks (CNN) and conditional random fields (CRF) for probabilistic graphical modeling.
5.7.4 Quasi-RNN
Bradbury et al. (2016) proposed the quasi-recurrent neural network (QRNN) for neural sequence modeling, which applies in parallel across time steps.
5.8 Memory Network
Weston et al. (2014) proposed memory networks for question answering (QA). A memory network consists of a memory, an input feature map, a generalization component, an output feature map, and a response component.
5.8.1 Dynamic Memory Network
Kumar et al. (2015) proposed the dynamic memory network (DMN) for QA tasks. A DMN has four modules: input, question, episodic memory, and output.
5.9 Augmented Neural Networks
Olah and Carter (2016) give a nice presentation of attention and augmented recurrent neural networks, i.e., neural Turing machines (NTM), attentional interfaces, neural programmers, and adaptive computation time. Neural networks are often augmented with additional properties, such as logic functions, on top of standard neural network architectures.
5.9.1 Neural Turing Machine
Graves et al. (2014) proposed the neural Turing machine (NTM) architecture, which consists of a neural network controller and a memory bank. An NTM typically combines an RNN with an external memory bank.
5.9.2 Neural GPU
Kaiser and Sutskever (2015) proposed the Neural GPU, which addresses the parallelization problem of the NTM.
5.9.3 Neural Random Access Machine
Kurach et al. (2015) proposed the neural random-access machine, which uses an external variable-size random-access memory.
5.9.4 Neural Programmer
Neelakantan et al. (2015) proposed the neural programmer, an augmented neural network with arithmetic and logic functions.
5.9.5 Neural Programmer-Interpreter
Reed and de Freitas (2015) proposed the neural programmer-interpreter (NPI), which can learn programs. NPI includes a recurrent core, program memory, and domain-specific encoders.

5.10 Long Short-Term Memory Network

Hochreiter and Schmidhuber (1997) proposed long short-term memory (LSTM), which overcomes the error back-flow problem of recurrent neural networks (RNN). LSTM is a gradient-based learning algorithm built on recurrent networks. LSTM introduces self-loop paths that let gradients flow for long durations. Greff et al. (2017) performed a large-scale analysis of the standard LSTM and 8 LSTM variants on speech recognition, handwriting recognition, and polyphonic music modeling. They report that the 8 variants show no significant improvement over the standard LSTM, which performs well on its own. Shi et al. (2016b) proposed the deep long short-term memory network (DLSTM), a stack of LSTM units, for learning feature-map representations.
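A minimal usage sketch of a standard (stacked) LSTM in PyTorch follows; the input size, hidden size, and sequence length are illustrative assumptions.

```python
import torch
import torch.nn as nn

# The gated LSTM cell keeps a self-looping cell state so gradients can flow
# over long sequences.
lstm = nn.LSTM(input_size=10, hidden_size=64, num_layers=2, batch_first=True)
x = torch.randn(4, 50, 10)           # batch of 4 sequences, 50 steps, 10 features
outputs, (h_n, c_n) = lstm(x)        # c_n is the cell state carried by the self-loop
print(outputs.shape)                 # (4, 50, 64)
```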
5.10.1 Batch-normalized LSTM
Cooijmans et al. (2016) proposed the batch-normalized LSTM (BN-LSTM), which applies batch normalization to the hidden states of recurrent neural networks.
5.10.2 Pixel RNN
van den Oord et al. (2016b) proposed the pixel recurrent neural network (PixelRNN), consisting of 12 two-dimensional LSTM layers.
5.10.3 Bidirectional LSTM
Wöllmer et al. (2010) proposed using bidirectional LSTM (BLSTM) recurrent networks together with a dynamic Bayesian network (DBN) for context-sensitive keyword detection.
5.10.4 Variational Bi-LSTM
Shabanian et al. (2017) proposed the variational bi-LSTM, a variant of the bidirectional LSTM architecture. It uses a variational autoencoder (VAE) to create an information-exchange channel between the LSTMs in order to learn better representations.
5.11 Google Neural Machine Translation

Wu et al. (2016) proposed an automatic translation system called Google Neural Machine Translation (GNMT), which combines an encoder network, a decoder network, and an attention network following the common sequence-to-sequence learning framework.

5.12 Fader Network

Lample et al. (2017) proposed the Fader Network, a new encoder-decoder architecture that generates realistic variations of an input image by changing attribute values.

5.13 Hyper Networks

The hyper networks proposed by Ha et al. (2016) generate the weights of other neural networks, for example static hypernetworks for convolutional networks and dynamic hypernetworks for recurrent networks.
Deutsch (2018) uses hypernetworks to generate neural networks.
5.14 Highway Networks

Srivastava et al. (2015) proposed highway networks, which use gating units to learn to regulate the flow of information. The flow of information across several layers is called an information highway.
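A hedged sketch of a single highway layer is shown below: a transform gate T mixes the transformed signal H(x) with the untouched input x, i.e. y = T(x) * H(x) + (1 - T(x)) * x. The layer width and the choice of ReLU for H are illustrative assumptions.

```python
import torch
import torch.nn as nn

# One highway layer: the gate decides how much information passes through
# the transformation and how much is carried forward unchanged.
class HighwayLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.transform = nn.Linear(dim, dim)
        self.gate = nn.Linear(dim, dim)

    def forward(self, x):
        h = torch.relu(self.transform(x))
        t = torch.sigmoid(self.gate(x))   # gating unit controlling information flow
        return t * h + (1.0 - t) * x      # ungated part passes straight through
```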
5.14.1 Recurrent Highway Networks
Zilly et al. (2017) proposed recurrent highway networks (RHN), which extend the long short-term memory (LSTM) architecture. RHNs use highway layers inside the recurrent transition.
Zhang et al. (2016) proposed the highway long short-term memory (HLSTM) RNN, which extends a deep LSTM network with gated direct connections (i.e., highways) between the memory cells of adjacent layers.
Donahue et al. (2014) proposed the long-term recurrent convolutional network (LRCN), which uses a CNN on the input and then an LSTM for recurrent sequence modeling and prediction.
Zhang et al. (2015) proposed the deep neural SVM (DNSVM), which uses a support vector machine (SVM) as the top layer of a deep neural network (DNN) for classification.
Moniz and Pal (2016) proposed convolutional residual memory networks, which incorporate a memory mechanism into convolutional neural networks (CNN). They augment a convolutional residual network with a long short-term memory mechanism.
Salimans et al. (2016) proposed several techniques for training generative adversarial networks (GANs).
6.5.1 Laplacian Generative Adversarial Network
Denton et al. (2015) proposed a deep generative model (DGM) called the Laplacian generative adversarial network (LAPGAN), which uses the generative adversarial network (GAN) approach. The model uses convolutional networks within a Laplacian pyramid framework.
Shi et al. (2016a) proposed the recurrent support vector machine (RSVM), which uses a recurrent neural network (RNN) to extract features from the input sequence and a standard support vector machine (SVM) for sequence-level object recognition.
In this section, we briefly outline some of the main techniques for regularizing and optimizing deep neural networks (DNN).
Srivastava et al. (2014) proposed dropout to prevent neural networks from overfitting. Dropout is a model-averaging regularization method that adds noise to the hidden units of a neural network. During training, it randomly drops units and their connections from the network. Dropout can be used in graphical models such as RBMs (Srivastava et al., 2014) or in any type of neural network. A recently proposed improvement on dropout is fraternal dropout for recurrent neural networks (RNN).
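A minimal usage sketch of dropout in PyTorch follows; the drop probability p = 0.5 and the layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

# During training each hidden unit is dropped with probability p (model averaging);
# at evaluation time dropout is disabled.
net = nn.Sequential(nn.Linear(100, 50), nn.ReLU(), nn.Dropout(p=0.5), nn.Linear(50, 10))

net.train()                      # dropout active: random units zeroed each forward pass
y_train = net(torch.randn(8, 100))
net.eval()                       # dropout disabled for inference
y_eval = net(torch.randn(8, 100))
```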
Goodfellow et al. (2013) proposed maxout, a new activation function designed to be used with dropout. The output of a maxout unit is the maximum of a set of inputs, which benefits dropout's model averaging.
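A minimal maxout sketch is given below: each unit outputs the maximum over k linear pieces, so the unit effectively learns its own piecewise-linear activation. The choice k = 3 and the dimensions are illustrative assumptions.

```python
import torch

# Maxout over k linear pieces per output unit.
def maxout(x, weight, bias, k):
    # weight: (in_dim, out_dim * k), bias: (out_dim * k,)
    z = x @ weight + bias
    z = z.view(x.shape[0], -1, k)     # (batch, out_dim, k) pieces
    return z.max(dim=-1).values       # keep the largest piece per unit

x = torch.randn(8, 100)
w = torch.randn(100, 50 * 3)
b = torch.zeros(50 * 3)
y = maxout(x, w, b, k=3)              # shape (8, 50)
```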
Krueger et al. (2016) proposed zoneout, a regularization method for recurrent neural networks (RNN). Zoneout injects noise at random during training, similar to dropout, but preserves hidden units instead of dropping them.
He et al. (2015) proposed a deep residual learning framework, known as ResNet, which achieves low training error.
Ioffe and Szegedy (2015) proposed batch normalization, a method for accelerating deep neural network training by reducing internal covariate shift. Ioffe (2017) proposed batch renormalization, which extends the earlier method.
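To illustrate the mechanism, here is a minimal sketch of the batch-normalization transform for fully connected activations; eps and the feature size are illustrative, and the running statistics used at inference time are omitted for brevity.

```python
import torch

# Activations are standardized with the statistics of the current mini-batch,
# then rescaled by learnable gamma and beta.
def batch_norm(x, gamma, beta, eps=1e-5):
    mean = x.mean(dim=0, keepdim=True)           # per-feature batch mean
    var = x.var(dim=0, unbiased=False, keepdim=True)
    x_hat = (x - mean) / torch.sqrt(var + eps)   # standardized activations
    return gamma * x_hat + beta                  # learnable scale and shift

x = torch.randn(32, 64)                          # mini-batch of 32 examples, 64 features
y = batch_norm(x, gamma=torch.ones(64), beta=torch.zeros(64))
```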
Hinton et al. (2015) proposed a method for distilling the knowledge of an ensemble of highly regularized models (i.e., neural networks) into a compressed, smaller model.
Ba et al. (2016) proposed layer normalization, which accelerates training especially for RNN-based deep neural networks and addresses the limitations of batch normalization.
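For contrast with batch normalization, below is a minimal layer-normalization sketch: statistics are computed per example over the feature dimension, so it does not depend on the batch and is straightforward to apply to RNN hidden states. eps and the sizes are illustrative assumptions.

```python
import torch

# Per-example normalization over the feature dimension.
def layer_norm(x, gamma, beta, eps=1e-5):
    mean = x.mean(dim=-1, keepdim=True)          # per-example mean over features
    var = x.var(dim=-1, unbiased=False, keepdim=True)
    return gamma * (x - mean) / torch.sqrt(var + eps) + beta

h = torch.randn(4, 128)                          # e.g. 4 RNN hidden states
h_norm = layer_norm(h, gamma=torch.ones(128), beta=torch.zeros(128))
```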
There are a large number of open-source libraries and frameworks available for deep learning. Most of them are built for the Python programming language, such as Theano, TensorFlow, PyTorch, PyBrain, Caffe, Blocks and Fuel, CuDNN, Honk, ChainerCV, PyLearn2, Chainer, and Torch.
In this section, we will briefly discuss some of the recent outstanding applications of deep learning. Since the beginning of deep learning (DL), DL methods have been widely used in various fields in the form of supervised, unsupervised, semi-supervised or reinforcement learning. Starting from classification and detection tasks, DL applications are rapidly expanding into every domain.
For example:
Image classification and recognition
Video classification
Sequence generation
Defect classification
Text, Speech, Image and Video Processing
Text Classification
Speech Processing
Speech Recognition and Spoken Language Understanding
Text-to-Speech Generation
Query classification
Sentence classification
Sentence modeling
Lexical processing
Pre-selection
Document and sentence processing
Generate image text description
Photo style transfer
Natural image manifold
Image coloring
Image Q&A
Generate textured and stylized images
Visual and text Q&A
Visual recognition and description
Object recognition
Document processing
People Action Synthesis and Editing
Song Synthesis
Identity Recognition
Face Recognition and Verification
Video Action Recognition
Human Action Recognition
Action Recognition
Classification and Visualization of Motion Capture Sequences
Handwriting Generation and Prediction
Automation and Machine Translation
Named Entity Recognition
Mobile Vision
Conversational Agent
Calling Genetic Variation
Cancer Detection
X-Ray CT Reconstruction
Seizure Prediction
Hardware Acceleration
Robotics
etc.
Deng and Yu (2014) provide a detailed list of DL applications in speech processing, information retrieval, object recognition, computer vision, multi-modal, multi-task learning and other fields.
Using deep reinforcement learning (DRL) to master games has become a hot topic. Every now and then, AI bots built with DNNs and DRL beat human world champions and grandmasters in strategy and other games, sometimes after only a few hours of training. Examples include AlphaGo and AlphaGo Zero for the game of Go.
Although deep learning has achieved great success in many fields, it still has a long way to go, and there are still many areas for improvement. As for limitations, the examples are quite numerous. For instance, Nguyen et al. showed that deep neural networks (DNN) are easily fooled when recognizing images. There are other issues, such as the transferability of learned features studied by Yosinski et al. Huang et al. proposed an architecture for defending against neural network attacks and argued that future work is needed to defend against such attacks. Zhang et al. proposed an experimental framework for understanding deep learning models; they argue that understanding deep learning requires rethinking generalization.
Marcus provided an important review in 2018 of the role, limitations, and nature of Deep Learning (DL). He strongly pointed out the limitations of DL methods, which require more data, have limited capacity, cannot handle hierarchical structures, cannot perform open reasoning, cannot be fully transparent, cannot integrate with prior knowledge, and cannot distinguish cause and effect. He also mentioned that DL assumes a stable world, is implemented in an approximate manner, is difficult to engineer, and has the potential risk of over-hyping. Marcus believes that DL needs to be reconceptualized and look for possibilities in unsupervised learning, symbolic manipulation and hybrid models, gain insights from cognitive science and psychology, and take on bolder challenges.
While deep learning (DL) is advancing the world faster than ever before, there are still many aspects worth studying. We still do not fully understand deep learning, how we can make machines smarter, closer to or smarter than humans, or how they can learn like humans. DL has been solving many problems as the technology is applied to everything. But humanity still faces many problems, such as people dying of hunger and food crises, cancer, and other fatal diseases. We hope that deep learning and artificial intelligence will become even more dedicated to improving the quality of human life by tackling the most difficult scientific challenges. Last but not least, may our world become a better place.