One article to understand computer vision, full of useful information

Table of Contents

1. Introduction

2. Why is computer vision important

3. What is computer vision

4. Basic principles of computer vision

5. Typical tasks of computer vision

6. Application scenarios of computer vision in daily life

7. Challenges faced by computer vision

8. Conclusion

WBOY

May 16, 2023, 03:37 PM

1. Introduction

Computer vision, usually abbreviated as CV, is a research field that uses technology to help computers "see" and "understand" images, for example by enabling them to understand the content of photos or videos.

This article gives an overall introduction to computer vision. It is divided into six parts:

Why computer vision is important
What is computer vision
Basic principles of computer vision
Typical tasks of computer vision
Application scenarios of computer vision in daily life
Challenges facing computer vision

2. Why is computer vision important

Physiologically, vision begins with the excitation of the receptor cells of the visual organs and is formed after the visual nervous system processes the collected information. We humans use vision to intuitively understand the shape and state of the things in front of us. Most of us rely on vision to cook, avoid obstacles, read street signs, watch videos, and complete countless other tasks. Apart from special groups such as the blind, the vast majority of people obtain most of their external information through vision, and the proportion is said to be as high as 80%. This figure is not unfounded: according to the experimental psychologist Treicher, a large number of experiments showed that 83% of the information humans obtain comes from vision, 11% from hearing, and the remaining 6% from smell, touch, and taste. For humans, therefore, vision is undoubtedly the most important sense.

Humans are not the only "visual animals": vision also plays a very important role for most animals. Through vision, humans and animals perceive the size, brightness, color, and movement of external objects and obtain information that is important for survival. From this information they learn what the surrounding world is like and how to interact with it.

Before the advent of computer vision, images were a black box for computers. To a computer, an image was just a file, a string of data. The computer did not know what the picture showed; it only knew its dimensions, how much memory it occupied, what format it was in, and so on.

If computers and artificial intelligence want to play an important role in the real world, they must understand pictures! Therefore, for half a century, computer scientists have been trying to figure out how to make computers see, giving rise to the field of "computer vision."

The rapid development of the Internet has also made computer vision particularly important. The figure below is a trend chart of the amount of new data on the network since 2020. The gray areas represent structured data and the blue areas represent unstructured data (mostly pictures and videos). The number of pictures and videos is clearly growing at an exponential rate.

[Figure: trend chart of new structured and unstructured data on the network since 2020]

The Internet is made up of text and images. Searching text is relatively simple, but to search images, an algorithm needs to know what each image contains. For a long time, the technology to understand the content of images and videos did not exist, and descriptions could only be obtained through manual annotation. Enabling computers to better understand image information is a major challenge for today's computer technology. To make full use of image and video data, the computer must be able to "see" the image or video and understand its content.

3. What is computer vision

Computer vision is an important branch of artificial intelligence. Simply put, the problem it solves is letting computers understand the content of images or videos. For example: Is the pet in the picture a cat or a dog? Is the person in the picture Lao Zhang or Lao Wang? What are the people in the video doing?

More specifically, computer vision refers to using cameras and computers in place of human eyes to identify, track, and measure targets, and to further process the images so that they are better suited to human observation or to transmission to instruments for inspection. As a scientific discipline, computer vision studies the related theories and technologies and tries to build artificial intelligence systems that can obtain high-level information from images or multi-dimensional data. From an engineering perspective, it seeks to use automated systems to mimic the human visual system and complete tasks.

The ultimate goal of computer vision is to enable computers to observe and understand the world through vision as humans do, and to adapt to the environment autonomously. But it is very difficult to make a computer truly perceive the world through a camera. Although the images captured by the camera look the same as what we usually see, to the computer any image is just an arrangement of pixel values, a bunch of rigid numbers. How to let computers read meaningful visual cues from these rigid numbers is the problem that computer vision has to solve.
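To make the "rigid numbers" point concrete, here is a minimal sketch in plain Python. The 2x3 image and its pixel values are invented for illustration, and the `describe` helper is hypothetical. It shows that everything the computer can compute directly (size, average brightness) comes from the numbers alone, with no notion of what the picture depicts.

```python
# A tiny 2x3 "image": each pixel is an (R, G, B) triple of 0-255 intensities.
# The pixel values are made up for illustration.
image = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    [(255, 255, 255), (128, 128, 128), (0, 0, 0)],
]


def describe(image):
    """Return the height, width, and mean channel intensity of a pixel grid."""
    height = len(image)
    width = len(image[0])
    total = sum(sum(pixel) for row in image for pixel in row)
    mean_brightness = total / (height * width * 3)
    return height, width, mean_brightness


height, width, mean_brightness = describe(image)
print(f"{height}x{width} image, mean brightness {mean_brightness:.1f}")
```

Nothing in this output says whether the picture shows a cat or a dog; recovering that kind of meaning from the numbers is the job of computer vision.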

4. Basic principles of computer vision

Anyone who has used a camera or mobile phone knows that computers are good at capturing photos with amazing fidelity and detail. To a certain extent, a computer's artificial "vision" is much stronger than the natural visual ability of humans. But just as "hearing" does not mean "understanding", "seeing" does not mean "understanding" either. Getting a computer to truly "understand" images is not a simple matter.

An image is a large grid of pixels, and each pixel has a color that is a combination of three primary colors: red, green, and blue. By combining the intensities of these three colors, called RGB values, we can get any color.

The simplest computer vision algorithm, and the most suitable one for getting started, is tracking a colored object such as a pink ball. We first record the color of the ball by saving the RGB value of its center pixel, and then feed the image to a program that finds the pixels closest to this color. The algorithm can start from the upper left corner, examine each pixel, and calculate its difference from the target color. After every pixel has been checked, the closest pixels are likely to be where the ball is. The algorithm is not limited to a single image: we can run it on each frame of a video to track the position of the ball.

Of course, because of light, shadow, and other factors, the color of the ball will change. It will not be exactly the same as the RGB value we saved, but it will be very close. In some extreme cases, however, such as a football match at night, tracking may work very poorly, and if one team's jerseys are the same color as the ball, the algorithm will be completely confused. Therefore, unless the environment can be strictly controlled, such color tracking algorithms are rarely put into practical use.

Today, most computer vision algorithms in use rely on deep learning methods and technologies. Among them, the convolutional neural network (CNN) is the most widely used because of its superior performance. Deep learning is too broad a topic to cover in more detail here. If you want to learn more, you might take a look at the introductory AI course "Intel® OpenVINO™ Tool Suite Elementary Course". It starts with the basic concepts of AI, introduces relevant knowledge of artificial intelligence and vision applications, and helps users quickly understand the basic concepts and application scenarios of the Intel® OpenVINO™ tool suite. The course covers video processing, deep learning fundamentals, inference acceleration for artificial intelligence applications, and demos of the Intel® OpenVINO™ tool suite, taking you step by step from the basics to more advanced material.
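The color-tracking algorithm described in this section can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: frames are small nested lists of RGB tuples, the color difference is taken as the squared Euclidean distance (the article does not say which measure to use), only the single closest pixel is reported, and all names and pixel values are hypothetical.

```python
def color_distance(pixel, target):
    """Squared Euclidean distance between two (R, G, B) colors."""
    return sum((p - t) ** 2 for p, t in zip(pixel, target))


def find_closest_pixel(frame, target):
    """Scan the frame from the upper-left corner and return the (row, col)
    of the pixel whose color is closest to the target color."""
    best_pos, best_dist = None, None
    for r, row in enumerate(frame):
        for c, pixel in enumerate(row):
            dist = color_distance(pixel, target)
            if best_dist is None or dist < best_dist:
                best_pos, best_dist = (r, c), dist
    return best_pos


def track(frames, target):
    """Run the per-frame search on every frame of a video."""
    return [find_closest_pixel(frame, target) for frame in frames]


PINK = (255, 105, 180)  # saved color of the ball (illustrative value)
GRASS = (30, 120, 40)   # background color (illustrative value)

# Two hypothetical 3x3 frames; the ball's color shifts slightly with lighting.
frame1 = [[GRASS] * 3 for _ in range(3)]
frame1[0][1] = (250, 110, 175)
frame2 = [[GRASS] * 3 for _ in range(3)]
frame2[2][2] = (240, 100, 170)

print(track([frame1, frame2], PINK))  # [(0, 1), (2, 2)]
```

The sketch also makes the weakness visible: if another object in the frame had a color even closer to `PINK` than the ball, `find_closest_pixel` would report that object instead, which is the jersey problem described above.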

5. Typical tasks of computer vision

Image classification

Image classification distinguishes images of different categories based on their semantic information. It is a core problem in computer vision and the basis for other high-level visual tasks such as object detection, image segmentation, object tracking, behavior analysis, and face recognition. For example, in the picture below, through image classification the computer recognizes person, tree, grass, and sky in the image.

[Figure: image classification example in which person, tree, grass, and sky are recognized]

Image classification is widely used in many fields, such as face recognition and intelligent video analysis in the security field, traffic scene recognition in the transportation field, content-based image retrieval and automatic photo album classification on the Internet, and image recognition in the medical field.

Object detection

The goal of the object detection task is, given an image or a video frame, to have the computer find the positions of all objects in it and assign each one a specific category. As shown in the figure below, taking the detection of people as an example, bounding boxes are used to mark the positions of all people in the image.

[Figure: detected people marked with bounding boxes]

In multi-category object detection, bounding boxes of different colors are generally used to mark the positions of objects of different categories, as shown in the figure below.

[Figure: multi-category detection with bounding boxes of different colors]
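As a rough illustration of what a detector's output looks like, the sketch below represents each detection as a category plus a bounding box and pairs every box with a per-category color. The `Detection` class, the `score` field with its 0.5 threshold, the palette, and the sample values are all assumptions made for this example; they do not come from any particular detection library.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str    # category assigned to the object
    box: tuple    # (x_min, y_min, x_max, y_max) in pixel coordinates
    score: float  # detector confidence in [0, 1] (assumed field)


# Hypothetical palette mapping each category to a border color (RGB).
PALETTE = {"person": (255, 0, 0), "dog": (0, 255, 0), "car": (0, 0, 255)}


def boxes_to_draw(detections, min_score=0.5):
    """Keep confident detections and pair each box with its category color."""
    return [
        (d.box, PALETTE.get(d.label, (255, 255, 255)))
        for d in detections
        if d.score >= min_score
    ]


detections = [
    Detection("person", (10, 20, 50, 120), 0.92),
    Detection("dog", (60, 80, 110, 130), 0.81),
    Detection("car", (0, 0, 5, 5), 0.20),  # low confidence, filtered out
]
for box, color in boxes_to_draw(detections):
    print(box, color)
```

A drawing routine would then paint each box in its color, which produces the kind of multi-colored overlay described above.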

Semantic Segmentation

Semantic segmentation is a basic task in computer vision in which the visual input is divided into different semantically interpretable categories. It divides the entire image into groups of pixels, which are then labeled and classified. For example, we might want to pick out all pixels in an image that belong to cars and color those pixels blue. As shown below, the image is divided into people (red), trees (dark green), grass (light green), and sky (blue).

[Figure: semantic segmentation into people (red), trees (dark green), grass (light green), and sky (blue)]
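The output of semantic segmentation can be thought of as one class ID per pixel. The following sketch turns such a label map into a colored image using the color scheme from the example above. The class IDs, the exact RGB values, and the 3x4 label map are made up for illustration.

```python
# Class IDs are arbitrary; colors follow the example in the text.
CLASS_COLORS = {
    0: (0, 0, 255),      # sky: blue
    1: (0, 100, 0),      # tree: dark green
    2: (144, 238, 144),  # grass: light green
    3: (255, 0, 0),      # person: red
}


def colorize(label_map):
    """Turn a grid of per-pixel class IDs into a grid of RGB colors."""
    return [[CLASS_COLORS[class_id] for class_id in row] for row in label_map]


# A hypothetical 3x4 segmentation result: one class ID per pixel.
label_map = [
    [0, 0, 0, 1],
    [0, 3, 1, 1],
    [2, 3, 2, 2],
]
colored = colorize(label_map)
print(colored[1][1])  # (255, 0, 0): a "person" pixel is painted red
```

Producing the label map itself is the hard part and is normally done by a trained model; the coloring step shown here is only the visualization.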

Instance segmentation

Instance segmentation combines object detection and semantic segmentation: objects are detected in the image (object detection), and then each pixel is labeled (semantic segmentation). Comparing the figures above and below and taking people as the example, semantic segmentation does not distinguish different instances of the same category (all people are marked in red), while instance segmentation does (different colors are used for different people).

[Figure: instance segmentation with each person marked in a different color]
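The difference between the two outputs can be illustrated with a toy example. The sketch below starts from a semantic mask (one class ID per pixel) and assigns a separate instance ID to each connected region of "person" pixels. This is a deliberate simplification that assumes different people never touch in the image; real instance segmentation relies on learned models, not connected-component labeling. The function name, class ID, and mask are hypothetical.

```python
from collections import deque


def label_instances(semantic_mask, target_class):
    """Assign a distinct instance ID (1, 2, ...) to each 4-connected region
    of target_class pixels; all other pixels get 0."""
    height, width = len(semantic_mask), len(semantic_mask[0])
    instances = [[0] * width for _ in range(height)]
    next_id = 0
    for r in range(height):
        for c in range(width):
            if semantic_mask[r][c] != target_class or instances[r][c]:
                continue
            next_id += 1
            instances[r][c] = next_id
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and semantic_mask[ny][nx] == target_class
                            and not instances[ny][nx]):
                        instances[ny][nx] = next_id
                        queue.append((ny, nx))
    return instances, next_id


PERSON = 3  # arbitrary class ID for "person"
semantic_mask = [
    [0, 0, 0, 0, 0],
    [0, 3, 0, 3, 0],
    [0, 3, 0, 3, 0],
]
instances, count = label_instances(semantic_mask, PERSON)
print(count)      # 2
print(instances)  # [[0, 0, 0, 0, 0], [0, 1, 0, 2, 0], [0, 1, 0, 2, 0]]
```

In the semantic mask both people share the value 3, while in the instance map they receive the IDs 1 and 2, which is what allows each person to be drawn in a different color.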

Target tracking

Target tracking refers to detecting, extracting, identifying, and tracking moving targets in image sequences, obtaining their motion parameters, and processing and analyzing them to understand the behavior of the moving targets and so complete higher-level detection tasks.

6. Application scenarios of computer vision in daily life

Computer vision has a very wide range of applications. Here are a few common ones in daily life.

Face recognition for access control and Alipay

License plate recognition for parking lots and toll stations

Risk identification when uploading videos to websites or apps

Various selfie props on Douyin and other apps (the position of the face must be identified first)

7. Challenges faced by computer vision

Computer vision technology is currently developing rapidly and has reached a preliminary industrial scale. Its future development mainly faces the following challenges. The first is how to better combine it with other technologies in different application fields: computer vision can make extensive use of big data, and on some problems it has gradually matured and can surpass humans, while on others it still cannot achieve high accuracy. The second is how to reduce the development time and labor costs of computer vision algorithms: at present they require large amounts of data and manual annotation, and a long research and development cycle to reach the accuracy and processing time required by the application field. The third is how to speed up the design and development of new algorithms: with the emergence of new imaging hardware and artificial intelligence chips, designing and developing computer vision algorithms for different chips and data acquisition equipment is also a challenge.

8. Conclusion

Computer vision is one of the fastest growing and most widely used technologies in the field of artificial intelligence. It is like the "eyes" of artificial intelligence, capturing and analyzing more information for all walks of life. With advances in algorithms, upgrades in hardware computing power, the explosion of data, and the high-speed networks brought by 5G, computer vision will have even broader room for development in its applications. Let us wait and see!
