Worrying image quality interferes with visual recognition, DAMO Academy proposes a more robust framework


王林

Apr 14, 2023 pm 04:31 PM

This article introduces the paper "Improving Training and Inference of Face Recognition Models via Random Temperature Scaling", accepted at AAAI 2023, a top international conference on artificial intelligence. From a probabilistic perspective, the paper analyzes the relationship between the temperature adjustment parameter in the classification loss function and classification uncertainty, and shows that the temperature adjustment factor is the scale coefficient of an uncertainty variable that follows a Gumbel distribution. Building on this, it proposes a new training framework called RTS to model the reliability of feature extraction. Training with RTS is more stable and yields a more reliable recognition model. At deployment, the model also provides an uncertainty score for each sample, so that highly uncertain samples can be rejected and a more robust visual recognition system can be built. Extensive experiments show that RTS trains stably and outputs uncertainty measures that support a robust visual recognition system.

Paper address: https://arxiv.org/abs/2212.01015
Open source model: https://modelscope.cn/models/damo/cv_ir_face-recognition-ood_rts/summary

Background

Uncertainty problem: Visual recognition systems encounter many kinds of interference in real scenes, for example occlusion (accessories or a complex foreground), imaging blur (focus blur or motion blur), and extreme lighting (overexposure or underexposure). These interferences can be summarized as the influence of noise. In addition, there are misdetected pictures, typically cat faces or dog faces; such data is called out-of-distribution (OOD) data. For visual recognition, noise and OOD data together constitute the sources of uncertainty. Affected samples superimpose uncertainty on the features extracted by the deep model and interfere with the visual recognition system. For example, if the gallery images are contaminated by uncertain samples, a "feature black hole" forms, which poses a hidden danger to the system. The reliability of the representation therefore needs to be modeled.

Related work on representation reliability modeling

Traditional multi-model solution

The traditional way to control reliability in a visual recognition pipeline is to use an independent quality model. A typical image quality modeling approach is as follows:

1. Collect annotation data and annotate specific factors that affect quality, such as clarity, presence or absence of occlusion, and posture.

2. Map the quality score from 1 to 10 according to the label of the influencing factors. The higher the score, the better the quality. For specific examples, please refer to the example on the left side of the figure below.

3. After obtaining the quality score annotations from the first two steps, train an ordinal regression model to predict the quality score during the deployment phase, as shown in the example on the right side of the figure below.

[Figure: left, example of quality annotation; right, example of predicted quality scores]

The independent quality model approach introduces an additional model into the visual recognition pipeline, and its training depends on annotation information.

DUL

Among uncertainty modeling methods, "Data Uncertainty Learning in Face Recognition" (DUL) models each feature as the sum of a mean and a noise term whose scale is given by the variance of a Gaussian distribution, and feeds this uncertainty-bearing feature to the subsequent classifier for training. An uncertainty score related to image quality can then be obtained during the deployment stage.

DUL describes uncertainty through summation, so the scale of the estimated noise is closely tied to the feature distribution of a given class of data: if the data distribution is relatively tight, the noise scale estimated by DUL is also relatively small. Work in the OOD field has pointed out that the density of the data distribution is not a good metric for identifying OOD data.
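As a rough illustration of the summation model described above, the sketch below samples a feature as a mean plus scaled Gaussian noise. The function name, the log-variance input and the toy values are assumptions made for this example and are not taken from the DUL code.

```python
import numpy as np

def sample_uncertain_feature(mu, log_var, rng):
    """Draw one feature as mu + eps * sigma, with eps ~ N(0, I)."""
    mu = np.asarray(mu, dtype=float)
    # Standard deviation per dimension, recovered from the log-variance.
    sigma = np.exp(0.5 * np.asarray(log_var, dtype=float))
    eps = rng.standard_normal(mu.shape)
    return mu + eps * sigma

rng = np.random.default_rng(0)
mu = np.array([0.5, -1.0, 2.0])  # hypothetical mean feature
clean = sample_uncertain_feature(mu, np.full(3, -20.0), rng)  # tiny variance
noisy = sample_uncertain_feature(mu, np.full(3, 0.0), rng)    # unit variance
print(clean, noisy)
```

With a very small variance the sampled feature stays close to the mean, while a larger variance spreads it out, which is why the learned variance can serve as a quality-related signal.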

GODIN

In the OOD field, "Generalized ODIN: Detecting Out-of-Distribution Image without Learning from Out-of-Distribution Data" (GODIN) handles OOD data with a joint probability formulation, using two independent branches, h(x) and g(x), to estimate the classification probability and the temperature adjustment value, respectively.

Because the temperature value is modeled as a probability and is therefore confined to the range 0 to 1, it cannot represent the temperature well.
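The limitation can be seen in a small sketch of the two-branch idea, in which the class scores h are divided by a value g squashed into (0, 1). The function names and the scalar inputs are hypothetical stand-ins for the branch outputs; this is a simplified illustration, not the GODIN implementation.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

def godin_style_probs(h, g_raw):
    """Divide class scores h by g = sigmoid(g_raw), then apply softmax."""
    g = 1.0 / (1.0 + np.exp(-g_raw))  # sigmoid keeps g strictly inside (0, 1)
    return softmax(np.asarray(h, dtype=float) / g), g

h = [2.0, 1.0, 0.1]  # hypothetical output of the h(x) branch
probs, g = godin_style_probs(h, g_raw=0.0)
print(g, probs, softmax(h))
```

Because g is always below 1, dividing by it can only sharpen the distribution relative to a plain softmax; a temperature above 1, which would flatten the distribution for uncertain inputs, cannot be expressed.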

Method

In view of the above problems and related work, this paper starts from the probability perspective and studies the relationship between the temperature adjustment factor and uncertainty in the classification loss function. After analysis, the RTS training framework is proposed.

Analysis of the temperature adjustment factor from a probabilistic perspective

First, analyze the relationship between the temperature adjustment factor and uncertainty. Assume that the uncertainty $\epsilon$ is a random variable that follows the standard Gumbel distribution. Its probability density function can then be written as

$$f(\epsilon) = e^{-(\epsilon + e^{-\epsilon})},$$

and its cumulative distribution function is $F(\epsilon) = e^{-e^{-\epsilon}}$. Writing $s_i$ for the score of class $i$ and adding an independent uncertainty $\epsilon_i$ to each score, the probability of being classified into class $k$ is:

$$P(y = k) = P(s_k + \epsilon_k > s_i + \epsilon_i,\ \forall i \neq k) = \int_{-\infty}^{\infty} f(\epsilon_k) \prod_{i \neq k} F(s_k - s_i + \epsilon_k)\, d\epsilon_k$$

Substituting $f$ and $F$ into the above formula gives:

$$P(y = k) = \frac{e^{s_k}}{\sum_i e^{s_i}}$$

It can be seen that the probability of being classified into class $k$ is exactly the softmax score. We can also use a factor $t > 0$ to adjust the scale of the uncertainty, that is, the uncertainty is $t\epsilon$ where $\epsilon$ follows the standard Gumbel distribution:

$$P(y = k) = P(s_k + t\epsilon_k > s_i + t\epsilon_i,\ \forall i \neq k) = \frac{e^{s_k / t}}{\sum_i e^{s_i / t}}$$

It can be seen that the probability of being classified into class $k$ is now the softmax score with temperature adjustment value $t$.
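The link between Gumbel-distributed uncertainty and the temperature-scaled softmax can be checked numerically. The sketch below is a Monte Carlo illustration under assumed toy scores: it adds standard Gumbel noise scaled by t to each class score, takes the arg max, and compares the resulting class frequencies with softmax(s / t). The function names and the score values are chosen for this example only.

```python
import numpy as np

def softmax(scores, t=1.0):
    """Softmax of scores / t, computed stably."""
    z = np.asarray(scores, dtype=float) / t
    e = np.exp(z - z.max())
    return e / e.sum()

def gumbel_argmax_frequencies(scores, t=1.0, n_samples=200_000, seed=0):
    """Empirical frequencies of argmax_i (s_i + t * eps_i), eps_i ~ Gumbel(0, 1)."""
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    eps = rng.gumbel(loc=0.0, scale=1.0, size=(n_samples, scores.size))
    winners = np.argmax(scores + t * eps, axis=1)
    return np.bincount(winners, minlength=scores.size) / n_samples

scores = [2.0, 1.0, 0.1]  # hypothetical class scores
for t in (1.0, 2.0):
    print(t, softmax(scores, t).round(3), gumbel_argmax_frequencies(scores, t).round(3))
```

For each t the two printed vectors should agree up to sampling error, and a larger t visibly flattens the class probabilities.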

Modeling temperature

To reduce the impact of uncertainty estimation on classification, the temperature $t$ needs to stay near 1. We therefore model $t$ as a sum of independent gamma-distributed variables, so that $t$ follows a $\Gamma(\alpha, \beta = \frac{\alpha - 1}{v})$ distribution. The influence of $v$ and $\alpha$ on the distribution is shown below.
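To get a feel for how the two parameters shape the temperature distribution, the sketch below samples from a gamma distribution with shape alpha and beta = (alpha - 1) / v. Treating beta as a rate parameter is an assumption made here (the text does not state the parameterization), and the function name and parameter values are illustrative only.

```python
import numpy as np

def sample_temperature(alpha, v, size, rng):
    """Sample t ~ Gamma(alpha, beta), with beta = (alpha - 1) / v used as a RATE (assumption)."""
    beta = (alpha - 1.0) / v
    # NumPy parameterizes the gamma distribution by shape and scale = 1 / rate.
    return rng.gamma(shape=alpha, scale=1.0 / beta, size=size)

rng = np.random.default_rng(0)
for alpha in (4.0, 16.0, 64.0):
    t = sample_temperature(alpha, v=1.0, size=100_000, rng=rng)
    print(alpha, round(float(t.mean()), 3), round(float(t.std()), 3))
```

Under the rate assumption, the mode of the distribution is (alpha - 1) / beta = v, so v sets where the temperature concentrates and a larger alpha narrows the spread around it.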

[Figure: influence of $v$ and $\alpha$ on the temperature distribution]

During training, this constraint on the temperature is enforced through a regularization term; its exact form is given in the paper.

Training method

The overall algorithm, along with a more detailed analysis and the theoretical proofs, can be found in the paper.

Results

In the training phase, the training data contains only face data. Falsely detected cat and dog faces serve as OOD data at test time, both to verify how well OOD data is identified and to show how the uncertainty of OOD samples evolves over the course of training.

Training phase

We plot the uncertainty scores of in-distribution (ID) data (faces) and out-of-distribution data (cat and dog faces mistakenly detected as faces) at different epochs. As the figure below shows, in the initial stage the uncertainty scores of all samples are concentrated near larger values. As training progresses, the uncertainty of OOD samples gradually increases while that of face data gradually decreases, and the better the face quality, the lower the uncertainty. ID and OOD data can therefore be separated by setting a threshold, and the uncertainty score also reflects image quality.
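The thresholding step can be sketched as follows. Choosing the threshold as a quantile of the uncertainty scores on a held-out set of faces is an assumption made for this example, as are the function names and the toy scores; the article does not say how the threshold is set.

```python
import numpy as np

def pick_threshold(id_uncertainty, keep_rate=0.95):
    """Threshold that keeps roughly `keep_rate` of in-distribution samples."""
    return float(np.quantile(np.asarray(id_uncertainty, dtype=float), keep_rate))

def accept(uncertainty, threshold):
    """True where a sample is certain enough to be passed on to recognition."""
    return np.asarray(uncertainty, dtype=float) <= threshold

id_scores = [0.1, 0.2, 0.3, 0.4, 0.5]  # hypothetical uncertainty of validation faces
threshold = pick_threshold(id_scores, keep_rate=0.95)
decisions = accept([0.3, 0.9], threshold)  # a clear face and a likely OOD sample
print(threshold, decisions)
```

A higher keep rate rejects fewer genuine faces but lets more uncertain samples through, so the value is a deployment trade-off.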

[Figure: uncertainty scores of ID and OOD samples at different training epochs]

To illustrate robustness to noisy training data, the paper applies different proportions of noise to the training set. The recognition results of models trained on data with different noise proportions are shown below. RTS still achieves good recognition results when trained on noisy data.

[Figure: recognition results with different proportions of noisy training data]

Deployment phase

The figure below shows that the uncertainty score produced by the RTS framework during the deployment phase correlates strongly with face quality.

[Figure: correlation between the RTS uncertainty score and face quality]

The false match curve after removing low-quality samples is also plotted on the benchmark. Based on the uncertainty scores, samples in the benchmark are removed in order of uncertainty from high to low, and the false match curve of the remaining samples is then drawn. As the figure below shows, the more high-uncertainty samples are filtered out, the fewer false matches remain, and for the same number of removed samples, RTS yields fewer false matches than the compared methods.
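The filtering procedure behind such a curve can be sketched as below. The pair-level inputs, the fixed similarity threshold and the function name are assumptions for illustration; the benchmark protocol used in the paper may differ.

```python
import numpy as np

def false_matches_after_rejection(similarity, same_identity, uncertainty,
                                  sim_threshold, reject_fractions):
    """Count false matches left after dropping the most uncertain pairs."""
    similarity = np.asarray(similarity, dtype=float)
    same_identity = np.asarray(same_identity, dtype=bool)
    # Indices sorted from most to least uncertain.
    order = np.argsort(-np.asarray(uncertainty, dtype=float), kind="stable")
    counts = []
    for frac in reject_fractions:
        n_reject = int(round(frac * similarity.size))
        keep = order[n_reject:]
        # A false match: different identities, yet similarity passes the threshold.
        false_match = (similarity[keep] >= sim_threshold) & ~same_identity[keep]
        counts.append(int(false_match.sum()))
    return counts

similarity = [0.9, 0.8, 0.7, 0.2]         # hypothetical pair similarities
same_identity = [True, False, False, False]
uncertainty = [0.1, 0.9, 0.5, 0.2]        # hypothetical per-pair uncertainty
curve = false_matches_after_rejection(similarity, same_identity, uncertainty,
                                      sim_threshold=0.6,
                                      reject_fractions=[0.0, 0.25, 0.5])
print(curve)
```

In this toy data the two false matches are also the two most uncertain pairs, so the count drops as the rejection fraction grows.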

[Figure: false match curves after removing high-uncertainty samples]

To verify how well the uncertainty score identifies OOD samples, an in-distribution test set (faces) and an out-of-distribution test set (cat and dog faces mistakenly detected as faces) were constructed. Sample data is shown below.

[Figure: samples from the ID and OOD test sets]

We explain the effect of RTS from two aspects. First, we plot the distribution of uncertainty scores. As the figure below shows, the RTS method discriminates OOD data strongly.

[Figure: distribution of uncertainty scores for ID and OOD data]

Second, the ROC curve on the OOD test set was plotted and the area under the curve (AUC) computed. The results show that the RTS uncertainty score identifies OOD data better.
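For reference, the AUC of an uncertainty score used as an OOD detector can be computed directly from its rank interpretation: the probability that a randomly chosen OOD sample receives a higher uncertainty than a randomly chosen ID sample. The function name and toy scores below are illustrative.

```python
import numpy as np

def roc_auc(ood_scores, id_scores):
    """AUC as P(ood score > id score), counting ties as one half."""
    ood = np.asarray(ood_scores, dtype=float)[:, None]
    ind = np.asarray(id_scores, dtype=float)[None, :]
    wins = (ood > ind).sum() + 0.5 * (ood == ind).sum()
    return float(wins) / (ood.size * ind.size)

perfect = roc_auc([0.9, 0.8], [0.1, 0.2])   # every OOD score above every ID score
partial = roc_auc([0.9, 0.3], [0.1, 0.5])   # one of four pairs is ordered wrongly
print(perfect, partial)
```

This pairwise form is quadratic in the number of samples, which is fine for a small check; a library routine is preferable for large test sets.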

[Figure: ROC curves and AUC values on the OOD test set]

General recognition ability

General recognition ability was tested on the benchmark. RTS improves the ability to identify OOD data without harming face recognition performance, so the RTS algorithm strikes a balance between face recognition and OOD data identification.

[Figure: general recognition results on the benchmark]

Application

The model from this paper is open sourced on ModelScope. Other free, open-source models in the CV domain are also available there to try and download (they can be tried on most mobile phones):

1. https://modelscope.cn/models/damo/cv_resnet50_face-detection_retinaface/summary

2. https://modelscope.cn/models/damo/cv_resnet101_face-detection_cvpr22papermogface/summary

3. https://modelscope.cn/models/damo/cv_manual_face-detection_tinymog/summary

4. https://modelscope.cn/models/damo/cv_manual_face-detection_ulfd/summary

5. https://modelscope.cn/models/damo/cv_manual_face-detection_mtcnn/summary

6. https://modelscope.cn/models/damo/cv_resnet_face-recognition_facemask/summary

7. https://modelscope.cn/models/damo/cv_ir50_face-recognition_arcface/summary

8. https://modelscope.cn/models/damo/cv_manual_face-liveness_flir/summary

9. https://modelscope.cn/models/damo/cv_manual_face-liveness_flrgb/summary

10. https://modelscope.cn/models/damo/cv_manual_facial-landmark-confidence_flcm/summary

11. https://modelscope.cn/models/damo/cv_vgg19_facial-expression-recognition_fer/summary

12. https://modelscope.cn/models/damo/cv_resnet34_face-attribute-recognition_fairface/summary
