ACM MM 2023 | DiffBFR: A noise-suppressing face restoration method jointly proposed by Meitu and the University of Chinese Academy of Sciences

The goal of Blind Face Restoration (BFR) is to restore high-quality face images from low-quality face images. This is an important task in the field of computer vision and graphics, and is widely used in various scenarios such as surveillance image restoration, old photo restoration, and facial image super-resolution.

However, the task is very challenging by nature: the degradation is uncertain, and it damages image quality or even destroys image information through blur, noise, downsampling, and compression artifacts. Previous BFR methods usually rely on generative adversarial networks (GANs) and address these problems by designing face-specific priors, including generative priors, reference priors, and geometric priors. Although these methods have reached state-of-the-art performance, they still fall short of producing realistic textures while restoring details.

In image restoration, face image datasets are usually scattered across a high-dimensional space, and the distribution over feature dimensions is long-tailed. Unlike the long-tail distribution in image classification tasks, the long-tail features in image restoration are attributes that have little effect on identity but a large effect on visual quality, such as moles, wrinkles, and tones.

As the simple experiment in Figure 1 shows, earlier GAN-based methods have obvious trouble handling the head and tail samples of a long-tail distribution at the same time: the restored image may be over-smoothed and lose detail. Methods based on Diffusion Probabilistic Models (DPM) fit the long-tail distribution better, retaining tail features while fitting the real data distribution.

Figure 1: GAN-based and DPM-based methods tested on the long-tail problem

Researchers from Meitu Imaging Research Institute (MT Lab) and the University of Chinese Academy of Sciences jointly proposed DiffBFR, a new blind face restoration method based on DPM that restores low-quality (LQ) face images to clear, high-quality (HQ) images.

Paper link: https://arxiv.org/abs/2305.04517

This study explores how well two kinds of generative model, the generative adversarial network (GAN) and the diffusion probabilistic model (DPM), cope with the long-tail problem. By designing appropriate face restoration modules, it obtains more accurate detail, which reduces the over-smoothing of faces that generative methods can produce and improves restoration accuracy. The paper has been accepted by ACM MM 2023.

DiffBFR: a DPM-based blind face restoration method

The study found that diffusion models are good at avoiding mode collapse during training and fit long-tail distributions better than GAN-based methods. DiffBFR therefore adopts the DPM as its basic framework and uses it to strengthen the embedding of face prior information, because diffusion models have a strong ability to generate high-quality images over an arbitrary distribution.

To address the long-tail distribution of features in face datasets identified in the paper, and the over-smoothing of earlier GAN-based methods, the study looks for a design that fits the approximately long-tailed distribution better and overcomes over-smoothing during restoration. In a simple experiment on the MNIST dataset with a GAN and a DPM of the same parameter size (Figure 1), the DPM fits the long-tail distribution reasonably well, while the GAN pays too much attention to head features and ignores tail features, so tail features are not generated. DPM is therefore chosen as the solution for BFR.
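To make the head/tail terminology concrete, here is a minimal sketch of how a long-tailed class distribution could be built and how much probability mass a generator places on the tail classes. The article does not say how the imbalance in the MNIST experiment was constructed, so the exponential decay, the imbalance ratio of 100, the choice of three tail classes, and the function names `long_tail_counts`, `class_frequencies`, and `tail_mass` are all assumptions made for illustration.

```python
import numpy as np


def long_tail_counts(n_classes=10, head_count=5000, imbalance=100.0):
    """Per-class sample counts that decay exponentially from head to tail.

    `imbalance` is the assumed ratio between the largest and smallest class.
    """
    exponents = np.arange(n_classes) / (n_classes - 1)
    return np.round(head_count * imbalance ** (-exponents)).astype(int)


def class_frequencies(labels, n_classes=10):
    """Empirical class frequencies of an array of integer labels."""
    counts = np.bincount(labels, minlength=n_classes)
    return counts / counts.sum()


def tail_mass(freqs, n_tail=3):
    """Probability mass on the last n_tail (rarest) classes."""
    return float(freqs[-n_tail:].sum())


# Reference long-tailed distribution over ten digit classes (assumed shape).
counts = long_tail_counts()
real_freqs = counts / counts.sum()

# Labels of samples from a hypothetical generator that only covers head classes.
collapsed_labels = np.array([0, 0, 0, 1, 1, 2])
collapsed_freqs = class_frequencies(collapsed_labels)

print("tail mass, real data:     ", tail_mass(real_freqs))
print("tail mass, collapsed model:", tail_mass(collapsed_freqs))
```

A generator that ignores tail features, as the article describes for the GAN, would show a tail mass near zero, while a model that fits the long tail would stay close to the reference value.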

By introducing two intermediate variables, DiffBFR proposes two dedicated restoration modules. The design is two-stage: it first recovers identity information from the LQ image, then enhances texture details according to the distribution of real faces. It consists of two key parts:

(1) Identity Restoration Module (IRM):

This module preserves the face details in the result. It also introduces a truncated sampling method: instead of starting the reverse process from a pure Gaussian random distribution, sampling starts from the low-quality image with part of the noise added. The paper proves theoretically that this change shrinks the evidence lower bound (ELBO) of the DPM and thereby restores more of the original details. Building on that proof, two cascaded conditional diffusion models with different input sizes are introduced to strengthen the sampling effect and to reduce the training difficulty of generating high-resolution images directly. The paper further proves that the higher the quality of the conditional input, the closer the result is to the real data distribution and the more accurate the restored image. This is why DiffBFR restores a low-resolution image first.

(2) Texture Enhancement Module (TEM):

To polish image texture, this module introduces an unconditional diffusion model. The model is completely independent of the low-quality image, which pushes the restored result closer to real image data. The paper proves theoretically that an unconditional diffusion model trained purely on high-quality images helps correct the distribution of the output image in pixel-level space. That is, the distribution of restored images has a lower FID after this model is applied than before, and is overall more similar to the distribution of high-quality images. Specifically, truncating the sampling at a chosen time step preserves identity information while polishing pixel-level texture.
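The truncated sampling used by both modules can be sketched with the standard DDPM equations: noise the input up to an intermediate step, then run the reverse process from that step instead of from pure Gaussian noise. This is only an illustrative sketch, not the authors' implementation. The linear noise schedule, the truncation steps 400 and 50, the variance choice sigma_t^2 = beta_t, and the names `noise_to_step`, `truncated_sample`, and `dummy_eps_model` are assumptions. The conditioning on the LQ image and the two-model cascade described for the IRM are omitted.

```python
import numpy as np


def noise_to_step(x0, t, alpha_bar, rng):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I)."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps


def truncated_sample(x_init, t_start, betas, eps_model, rng):
    """Partially noise x_init to step t_start, then denoise back to step 0.

    eps_model(x, t) is a hypothetical noise predictor standing in for a
    trained network. The reverse variance is assumed to be beta_t.
    """
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)
    x = noise_to_step(x_init, t_start, alpha_bar, rng)
    for t in range(t_start, -1, -1):
        eps_hat = eps_model(x, t)
        # Posterior mean of the DDPM reverse step.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps_hat) / np.sqrt(alphas[t])
        # No fresh noise is added on the final step.
        noise = rng.standard_normal(x.shape) if t > 0 else np.zeros_like(x)
        x = mean + np.sqrt(betas[t]) * noise
    return x


rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)            # assumed linear schedule
lq_image = rng.uniform(-1.0, 1.0, (3, 32, 32))   # stand-in for an LQ face

calls = []


def dummy_eps_model(x, t):
    """Placeholder network: records the step and predicts zero noise."""
    calls.append(t)
    return np.zeros_like(x)


# IRM-style pass: start from a partially noised LQ image, not pure noise.
identity_estimate = truncated_sample(lq_image, 400, betas, dummy_eps_model, rng)
# TEM-style pass: a much shorter truncated pass over the IRM output.
polished = truncated_sample(identity_estimate, 50, betas, dummy_eps_model, rng)
```

The smaller the truncation step, the less of the input is overwritten by noise, which matches the article's point that a short truncated pass can polish texture while keeping identity information. In a real system the placeholder would be replaced by trained conditional models for the IRM and an unconditional model for the TEM.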

The sampling inference steps of DiffBFR are shown in Figure 2, and a schematic diagram of the sampling inference process is shown in Figure 3.

Figure 2: Sampling inference steps of the DiffBFR method

Figure 3: Schematic diagram of the sampling inference process of the DiffBFR method

Experimental results

Figure 4: Visual comparison of GAN-based and DPM-based BFR methods

Figure 5: Performance comparison with SOTA BFR methods

Figure 6: Visual comparison of BFR methods

Visual comparison of the IRM and TEM modules within the model

Figure 8: Performance comparison of the IRM and TEM modules within the model

Figure 9: IRM performance under different parameters

Figure 10: Performance comparison under different parameters

Figure 11: Parameter settings of each DiffBFR module

Summary

This paper proposes DiffBFR, a diffusion-based model for blind restoration of degraded face images, to address the mode collapse during training and the vanishing long tail of earlier GAN-based methods. By embedding prior knowledge into the diffusion model, it can generate clear, high-quality restored images from randomly and severely degraded face images. Specifically, the study proposes two modules, IRM and TEM, which restore identity and enhance texture details respectively. Theoretical derivation and experimental image results demonstrate the advantages of the model, which is compared qualitatively and quantitatively with existing state-of-the-art methods.

Research Team

This paper was jointly proposed by researchers from Meitu Imaging Research Institute (MT Lab) and the University of Chinese Academy of Sciences. MT Lab, established in 2010, is Meitu's team for algorithm research, engineering development, and product implementation in computer vision, deep learning, augmented reality, and related fields. Since its founding the team has focused on computer vision research, and it began working on deep learning in 2013 to provide technical support for Meitu's software and hardware products. It also provides targeted SaaS services for several vertical segments of the imaging industry and uses cutting-edge imaging technology to drive the development of Meitu's AI product ecosystem. The team has taken part in competitions at top international computer vision conferences such as CVPR, ICCV, and ECCV, winning more than ten first- and second-place finishes, and has published more than 48 papers at top international academic conferences. Through its long-term research and development in imaging, MT Lab has built up substantial technical reserves and extensive deployment experience in images, video, design, and digital humans.
