This article is reprinted from the WeChat public account "Living in the Information Age"; the author is Living in the Information Age. To reprint this article, please contact the Living in the Information Age public account.
A convolutional neural network (CNN) is a special kind of deep feed-forward network. It generally consists of a data input layer, convolutional layers, activation layers, downsampling (pooling) layers, and fully connected layers.
The convolutional layer is the core unit of a convolutional neural network. It consists of a series of convolution kernels that filter the data. In essence, a convolution is a linear superposition: a weighted sum over a local region of the image, with the weights given by the convolution kernel. Taking an image I as input and convolving it with a two-dimensional kernel K, the convolution process can be expressed as:

$$S(i,j) = (I * K)(i,j) = \sum_{m}\sum_{n} I(i-m,\, j-n)\, K(m,n)$$

Here I(i,j) is the value of the image at position (i,j), K(m,n) is the kernel weight at offset (m,n), and S(i,j) is the feature map obtained after the convolution operation.
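To make the weighted-sum view of convolution concrete, here is a minimal pure-Python sketch. The function name `conv2d_valid`, the stride of 1, the absence of padding, and the toy 3x3 image are all assumptions chosen for illustration, not anything specified in the article. The kernel is flipped so that the code performs a true convolution; many deep learning libraries skip the flip and compute cross-correlation instead, which makes no practical difference when the kernel weights are learned.

```python
def conv2d_valid(image, kernel):
    """2-D convolution over the 'valid' region (no padding, stride 1)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1

    # Flip the kernel along both axes. Without this flip the loop below
    # would compute cross-correlation rather than convolution.
    flipped = [row[::-1] for row in kernel[::-1]]

    out = [[0.0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            acc = 0.0
            # Weighted sum of the local image patch with the kernel weights.
            for m in range(kh):
                for n in range(kw):
                    acc += image[i + m][j + n] * flipped[m][n]
            out[i][j] = acc
    return out


# Toy input, purely illustrative.
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
kernel = [
    [1, 0],
    [0, -1],
]
feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[4.0, 4.0], [4.0, 4.0]]
```

A 3x3 image convolved with a 2x2 kernel yields a 2x2 feature map, because each output value needs a full 2x2 patch of the input.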
The activation layer introduces nonlinearity. The convolution operation is linear, so on its own it can only represent linear mappings and has limited expressive power. To handle nonlinear mapping problems, a nonlinear activation function must be introduced. Different problems call for different activation functions; commonly used ones include sigmoid, tanh, and ReLU.
The sigmoid function is:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

The tanh function is:

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$

The ReLU function is:

$$\mathrm{ReLU}(x) = \max(0, x)$$
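The three activation functions named above can be sketched in a few lines of Python using only the standard `math` module. The function names and sample inputs are illustrative assumptions. The sigmoid is written in a branch form that avoids overflow for large negative inputs; this is a common implementation choice, not something the article prescribes.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, output in (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, use the equivalent form e^x / (1 + e^x) so that
    # math.exp never receives a large positive argument.
    e = math.exp(x)
    return e / (1.0 + e)


def tanh(x):
    """Hyperbolic tangent, output in (-1, 1)."""
    return math.tanh(x)


def relu(x):
    """Rectified linear unit: passes positives, zeroes out negatives."""
    return max(0.0, x)


for value in (-2.0, 0.0, 2.0):
    print(value, sigmoid(value), tanh(value), relu(value))
```

In a CNN these functions are applied element by element to the feature map produced by the convolutional layer.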
The downsampling layer, also called the pooling layer, is usually placed after several convolutional layers to reduce the size of the feature maps. A pooling function replaces the network's output at a given position with a summary statistic of the neighboring outputs, such as their maximum or their average. In general, the pooling layer serves three purposes. First, it reduces the feature dimension: pooling acts as a further feature extraction step that removes redundant information and reduces the amount of data the next layer has to process. Second, it helps prevent overfitting: pooling yields more abstract information, which improves generalization. Third, it maintains feature invariance: pooling retains the most important features, so small shifts in the input change the output only slightly.
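As a rough illustration of how pooling summarizes neighboring outputs, the sketch below applies 2x2 pooling with stride 2 to a toy 4x4 feature map. The function name `pool2d`, the helper `mean`, the window size, the stride, and the input values are assumptions made for this example.

```python
def pool2d(feature_map, size=2, stride=2, reduce=max):
    """Slide a size x size window over the map and summarize each window."""
    h, w = len(feature_map), len(feature_map[0])
    out = []
    for i in range(0, h - size + 1, stride):
        row = []
        for j in range(0, w - size + 1, stride):
            # Collect the values inside the current window.
            window = [
                feature_map[i + m][j + n]
                for m in range(size)
                for n in range(size)
            ]
            row.append(reduce(window))
        out.append(row)
    return out


def mean(values):
    """Average of a non-empty list, used for average pooling."""
    return sum(values) / len(values)


# Toy feature map, purely illustrative.
fm = [
    [1, 3, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]
max_pooled = pool2d(fm)               # max pooling
avg_pooled = pool2d(fm, reduce=mean)  # average pooling
print(max_pooled)  # [[6, 8], [3, 4]]
print(avg_pooled)  # [[3.75, 5.25], [2.0, 2.0]]
```

The 4x4 input shrinks to 2x2, which is the dimension reduction described above: each output value stands in for four input values.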
The fully connected layer is usually placed at the end of the convolutional neural network; every neuron in one layer has a weighted connection to every neuron in the next. Its purpose is to map the features learned by the network to the label space of the samples so that a category decision can be made. The softmax function is usually used in the last layer of the network as the classifier output. Each value output by softmax lies in the interval (0, 1), and the values sum to 1, so they can be read as class probabilities.
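The mapping from features to class probabilities can be sketched as a fully connected layer followed by softmax. Everything concrete here is a hypothetical placeholder: the names `dense` and `softmax`, the two input features, the three classes, and the weight and bias values. Subtracting the maximum logit before exponentiating is a standard numerical-stability trick that leaves the result unchanged.

```python
import math


def dense(features, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output neuron."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max to avoid overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical flattened features and layer parameters (3 classes).
features = [1.0, 2.0]
weights = [
    [0.5, -0.5],
    [1.0, 0.0],
    [0.0, 1.0],
]
biases = [0.0, 0.0, 0.0]

logits = dense(features, weights, biases)  # [-0.5, 1.0, 2.0]
probs = softmax(logits)
predicted_class = probs.index(max(probs))
print(probs, predicted_class)
```

The predicted category is simply the index of the largest probability, which in this toy setup is class 2.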
Several classic and efficient CNN models, such as AlexNet, VGGNet, and ResNet, have been widely used in the field of image recognition.