This article is reproduced from the WeChat public account "AI Origin" (AI Yuanqi), written by Beishang. To reprint this article, please contact the AI Origin public account.
Can you recognize digits?
Speaking of AI, what most people think of is probably an intelligent machine like MOSS in "The Wandering Earth 2". It seems omniscient and omnipotent: once given access to the Internet, it would, like Ultron in "Avengers 2", use the network to spread itself everywhere, always plotting to eliminate humanity in order to achieve "world peace."
However, in reality, today's AI is still far from what people see and experience in film and television. Taking the recognition of digits in pictures as an example, let's explore in what form AI actually exists. The article is organized around a few key questions to help you understand it step by step. Follow me, let's go ~
This is a picture with a number in it. I believe you can react immediately on seeing it: this is a picture of the number "3" (even though it is rather blurry).
First question: Real intelligence - why can you clearly tell that this is the number "3"? What does this mean?
When you look at this picture, light reflected from it reaches your eyes, and the retina converts the optical signal into a biological signal that the brain can recognize, temporarily holding this information (a simplification to aid understanding, not a literal description). Once the brain receives the signal, your clever little brain quickly recognizes that this is the number "3". At this point you have fully understood the picture: it is a "3". Of course, the basis of all this is that you have been taught since childhood that a shape like this is "equivalent" to the number 3, not 5, 6, or any other number.
Second question: Eyes, retina - what form of input does the computer use to recognize the physical world?
What is the relationship between computers and AI? Put simply, AI is a pseudo-intelligent capability that depends on the computing power and architecture of a computer, just as we ourselves have intelligence and life but are, in essence, carbon-based organisms. As we all know, the computer world is a binary world. What is binary? Simply put, every digit is either 0 or 1. You probably have doubts at this point: can so many functions really be achieved with nothing but 0s and 1s? Can that deliver such powerful computing power? Don't worry. One concept needs to be made clear here: a number written in any base can also be represented in binary (take this as given for now; we can go into detail later if needed). For example, the number 13 in our everyday decimal system is 1101 in binary. Readers who want to look more closely can see the explanation in the picture below.
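For readers who prefer code to pictures, here is a minimal sketch of the decimal-to-binary conversion mentioned above, using repeated division by 2. The function name `to_binary` is just an illustrative choice, not something from the original article.

```python
def to_binary(n: int) -> str:
    """Convert a non-negative decimal integer to a string of binary digits."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # the remainder is the next binary digit, lowest first
        n //= 2                    # move on to the next higher digit
    return "".join(reversed(digits))


print(to_binary(13))   # 1101
print(int("1101", 2))  # 13, converting back with Python's built-in int()
```

Going both ways gives the same number, which is the point: 13 and 1101 are two spellings of one value.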
So we can see that the computer can "understand" a number through binary. If a picture can be converted into a string of numbers, can the computer go from a simpleton that only knows 0 and 1 to something that can take in information from pictures? (If this does not make sense yet, set it aside for now, just as a growing child must first learn to eat.) As shown in the figure below, each small area of the picture can be regarded as a pixel, and each pixel represents one color. As we all know, any color can be expressed as a mix of red (Red), green (Green), and blue (Blue), so we can arrange the pixels' values into a list of numbers, in order from left to right and top to bottom, and then send this list to the computer.
At this point, whether or not the computer understands it, we have converted the picture into a signal that the computer can accept. So how does the computer's "brain" recognize that the number in the picture is "3"?
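The idea of turning a picture into a list of numbers can be sketched roughly as follows. The 2x2 picture here is made up purely for illustration, and the assumption that each color channel ranges from 0 to 255 is a common convention rather than something stated in the article.

```python
# A hypothetical 2x2 picture: each pixel is (red, green, blue), each channel 0-255.
image = [
    [(255, 0, 0), (0, 255, 0)],      # top row: a red pixel, then a green pixel
    [(0, 0, 255), (255, 255, 255)],  # bottom row: a blue pixel, then a white pixel
]


def flatten_image(img):
    """Walk the pixels left to right, top to bottom, and collect one flat list."""
    numbers = []
    for row in img:
        for pixel in row:
            numbers.extend(pixel)  # append the red, green, and blue values in order
    return numbers


pixels = flatten_image(image)
print(pixels)       # [255, 0, 0, 0, 255, 0, 0, 0, 255, 255, 255, 255]
print(len(pixels))  # 12 numbers: 4 pixels x 3 channels
```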
Third question: the so-called AI - how should the computer determine that the number in this picture is "3"?
Give the computer two pictures like these. If it can tell you that the picture on the left is the number "3", do you think it has artificial intelligence? You may find that too naive; even a 3-year-old child knows this. But what if the picture on the right stands for 10,000 pictures of blue-footed boobies and other rare birds, and the computer needs only a few seconds to identify the various rare species with 99% accuracy? Doesn't that feel a bit like AI?
Traditional recognition method - specifically, we have already been able to convert the image into a matrix of numbers. The traditional image recognition method extracts features from the image, for example by using hard-coded rules as features. Take the number "3": when we see this shape, our brain subconsciously responds that it is "3", but to the computer it is just a string of numbers. So in the early days of such image classification tasks, engineers had to hand-process the number sequence that the digit "3" maps to, which was a real headache. How to build features is therefore a crucial but extremely cumbersome step in traditional image recognition and classification.
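To get a feel for what "hard rules as features" might look like, here is a toy sketch. The 5x3 digit grids, the three features, and the rule in `looks_like_three` are all invented for illustration; real systems of that era used far richer features, and this is not the method of any particular library.

```python
# Hypothetical 5x3 black-and-white digits: "1" is an inked pixel, "0" is blank.
THREE = ["111", "001", "111", "001", "111"]
EIGHT = ["111", "101", "111", "101", "111"]
ONE = ["010", "010", "010", "010", "010"]


def extract_features(grid):
    """Hand-written features: ink in the left column, ink in the right column, full rows."""
    return {
        "left": sum(row[0] == "1" for row in grid),
        "right": sum(row[-1] == "1" for row in grid),
        "full_rows": sum(all(c == "1" for c in row) for row in grid),
    }


def looks_like_three(grid):
    """Hard rule: three horizontal bars, a closed right side, and an open left side."""
    f = extract_features(grid)
    return f["full_rows"] == 3 and f["right"] == len(grid) and f["left"] == 3


print(looks_like_three(THREE))  # True
print(looks_like_three(EIGHT))  # False: the left side is closed
print(looks_like_three(ONE))    # False: no horizontal bars at all
```

Even in this tiny case, the rule had to be tuned by hand to separate "3" from "8", which hints at why feature engineering becomes such a chore on real pictures.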
The advantage of the traditional recognition method is that when a recognition result is wrong, you can roughly determine the cause of the error by inspecting the features. The disadvantage is that feature engineering is cumbersome. Is there a way to reduce the reliance on feature engineering (although feature engineering remains extremely important for many downstream tasks) and provide an end-to-end solution? The so-called end-to-end means that I only need to provide pictures of digits and their classification results, and let the computer learn the recognition method by itself (a bit like how humans learn, isn't it?). As times changed and computing power improved significantly, deep learning algorithms based on neural networks gradually came into use.
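The "give it pictures and answers, let it learn by itself" idea can be sketched with a single-layer perceptron. To be clear, this is a much simpler stand-in and not a deep neural network, and the 5x3 digit grids, the labels, and the 20 training rounds are assumptions made up for this illustration.

```python
# Hypothetical 5x3 digits; the label is 1 for "this is a 3" and 0 otherwise.
THREE = ["111", "001", "111", "001", "111"]
EIGHT = ["111", "101", "111", "101", "111"]
ONE = ["010", "010", "010", "010", "010"]


def to_pixels(grid):
    """Flatten a grid into a list of 0/1 numbers, left to right, top to bottom."""
    return [int(c) for row in grid for c in row]


def predict(weights, bias, pixels):
    """Answer 1 ("it is a 3") when the weighted sum of the pixels is positive."""
    score = sum(w * p for w, p in zip(weights, pixels)) + bias
    return 1 if score > 0 else 0


def train(samples, epochs=20):
    """Perceptron rule: after every wrong answer, nudge the weights toward the right one."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for pixels, label in samples:
            error = label - predict(weights, bias, pixels)  # 0 when the answer is right
            weights = [w + error * p for w, p in zip(weights, pixels)]
            bias += error
    return weights, bias


samples = [(to_pixels(THREE), 1), (to_pixels(EIGHT), 0), (to_pixels(ONE), 0)]
weights, bias = train(samples)
predictions = [predict(weights, bias, pixels) for pixels, _ in samples]
print(predictions)
```

Notice that nobody wrote a rule about bars or open sides here: the program only saw pixel lists and answers, and adjusted its own weights until the answers came out right.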
Deep neural network recognition method - these words sound very profound, and those who have not studied computing or algorithms may be scared off right away. Let me translate "deep neural network" in one sentence: there is some kind of nonlinear relationship between the input data and the specified labels, and a neural network uses multiple nonlinear functions to approximately fit that relationship. The figure below shows a simple deep neural network: on the far left is the input picture (the letter "A"), and on the right is the structure that converts the picture into numbers and performs "intelligent" operations on them, which can be understood as the "brain".
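As a small illustration of "fitting a nonlinear relationship with nonlinear functions", here is a sketch of a two-layer network whose weights are picked by hand rather than learned. The toy task (output 1 only when exactly one of two inputs is 1) and every number in `W1`, `B1`, `W2`, `B2` are assumptions chosen for this example; a real network would learn its weights from data and be far larger.

```python
def relu(values):
    """The nonlinear step: negative numbers become 0, the rest pass through."""
    return [max(0.0, v) for v in values]


def linear(weights, bias, x):
    """One layer's weighted sums: each row of weights produces one output number."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


# Hand-picked (not learned) weights, for illustration only.
W1 = [[1.0, 1.0], [1.0, 1.0]]  # hidden layer: two neurons, both summing the inputs
B1 = [0.0, -1.0]               # the second neuron only fires when both inputs are 1
W2 = [[1.0, -2.0]]             # output: first neuron minus twice the second
B2 = [0.0]


def network(x):
    """Stack the layers: linear step, nonlinear step, then a final linear step."""
    hidden = relu(linear(W1, B1, x))
    return linear(W2, B2, hidden)[0]


for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, network(x))  # 0.0, 1.0, 1.0, 0.0 in that order
```

No single straight-line rule over the two inputs gives these four answers, but one nonlinear step in the middle is enough. Stacking many more such layers is what puts the "deep" in deep neural network.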