Produced by Peking University: The latest SOTA in texture quality and multi-view consistency, converting a single image to 3D in 2 minutes


WBOY

Jan 10, 2024 pm 11:09 PM


It only takes two minutes to convert pictures into 3D!

And the result has both high texture quality and strong consistency across viewing angles.

Whatever the subject, the input is just a single-view image.

Two minutes later, the 3D version is done:

△ Top: Repaint123 (NeRF); bottom: Repaint123 (GS)

The new method is called Repaint123. Its core idea is to combine the powerful image generation capability of 2D diffusion models with the texture alignment capability of a repainting strategy to generate high-quality, consistent multi-view images.

In addition, this research also introduces a visibility-aware adaptive repaint intensity method for overlapping areas.

Repaint123 solves the problems of previous methods such as large multi-view deviation, texture degradation, and slow generation in one fell swoop.

The project code has not yet been published on GitHub, but 100 people have already starred the repository.

What does Repaint123 look like?

Previous image-to-3D methods usually relied on Score Distillation Sampling (SDS). Although their results are impressive, they suffer from multi-view inconsistency, over-saturation, over-smoothed textures, and slow generation.

△ From top to bottom: input, Zero123-XL, Magic123, DreamGaussian

To solve these problems, researchers from Peking University, Pengcheng Laboratory, the National University of Singapore, and Wuhan University proposed Repaint123.

In general, Repaint123 makes the following contributions:

(1) Repaint123 comprehensively considers a controllable repainting process for image-to-3D generation, which lets it produce high-quality image sequences that are consistent across multiple viewing angles.

(2) Repaint123 proposed a simple baseline method for single-view 3D generation.

In the coarse stage, it uses Zero123 as the 3D prior, combined with the SDS loss, and optimizes a Gaussian Splatting geometry to quickly generate a coarse 3D model (in only 1 minute).

In the fine stage, it uses Stable Diffusion as the 2D prior, combined with a mean squared error (MSE) loss, and quickly refines the mesh texture to generate a high-quality 3D model (also in only 1 minute).

(3) Extensive experiments demonstrate the effectiveness of Repaint123. It generates high-quality 3D content that matches 2D generation quality from a single image in just 2 minutes.

△ Fast single-view 3D generation that is both 3D-consistent and high-quality

Let’s look at the specific methods.

Repaint123 focuses on optimizing the mesh refinement stage, and its main improvement directions cover two aspects: generating high-quality image sequences with multi-view consistency and achieving fast and high-quality 3D reconstruction.

1. Generating a high-quality image sequence with multi-view consistency

Generating a high-quality image sequence with multi-view consistency is divided into the following three parts:

△Consistent image generation process from multiple perspectives

DDIM inversion

To retain the 3D-consistent low-frequency texture information generated in the coarse stage, the authors use DDIM inversion to map the image into a deterministic latent space. This lays the foundation for the subsequent denoising process, which can then generate faithful and consistent images.
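To make the idea of a deterministic latent concrete, here is a minimal sketch of one DDIM inversion step in plain Python. It is not the authors' code: the function names, the toy values, and the fixed noise estimate `eps` are assumptions for illustration. In a real pipeline `eps` would come from the diffusion network and depend on the current latent, so the inversion is only approximately reversible.

```python
import math


def ddim_invert_step(x_t, eps, alpha_bar_t, alpha_bar_next):
    """Move a latent from noise level t to the noisier level t+1, deterministically.

    x_t and eps are lists of floats; alpha_bar_* are cumulative noise-schedule
    products in (0, 1], with alpha_bar_next < alpha_bar_t for inversion.
    """
    # Clean sample implied by the current latent and the noise estimate.
    x0 = [(x - math.sqrt(1.0 - alpha_bar_t) * e) / math.sqrt(alpha_bar_t)
          for x, e in zip(x_t, eps)]
    # Re-noise to the next level with the same estimate; no random noise is added.
    return [math.sqrt(alpha_bar_next) * p + math.sqrt(1.0 - alpha_bar_next) * e
            for p, e in zip(x0, eps)]


def ddim_denoise_step(x_next, eps, alpha_bar_next, alpha_bar_t):
    """Deterministic DDIM denoising step: the same update run in the other direction."""
    return ddim_invert_step(x_next, eps, alpha_bar_next, alpha_bar_t)


# Toy round trip with a fixed (hypothetical) noise estimate.
latent = [0.2, -0.4, 1.0]
eps = [0.1, 0.0, -0.3]
noisier = ddim_invert_step(latent, eps, alpha_bar_t=0.9, alpha_bar_next=0.5)
restored = ddim_denoise_step(noisier, eps, alpha_bar_next=0.5, alpha_bar_t=0.9)
```

Because no random noise enters either step, denoising from the inverted latent can stay close to the original image, which is what makes inversion useful for preserving the coarse texture.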

Controllable denoising

To control geometric consistency and long-range texture consistency during denoising, the authors introduce ControlNet, using the depth map rendered from the coarse model as a geometric prior, and at the same time inject the attention features of the reference image for texture transfer.

In addition, to enable classifier-free guidance and improve image quality, the paper uses CLIP to encode the reference image into an image prompt that guides the denoising network.
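Classifier-free guidance combines a conditional and an unconditional noise prediction at each denoising step. The sketch below shows the standard combination rule in plain Python; the function name, the toy values, and the guidance scale are illustrative assumptions, not values taken from the paper.

```python
def classifier_free_guidance(eps_uncond, eps_cond, scale):
    """Standard guidance rule: eps = eps_uncond + scale * (eps_cond - eps_uncond).

    eps_uncond: noise predicted without the conditioning (here, without the image prompt).
    eps_cond:   noise predicted with the conditioning.
    scale:      guidance scale; 1.0 reproduces eps_cond, larger values push
                the prediction further toward the condition.
    """
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy values; in practice both predictions come from the same denoising network.
guided = classifier_free_guidance([0.0, 0.5], [1.0, 1.5], scale=2.0)
```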

Repainting

Progressive repainting of occluded and overlapping regions: to ensure that the overlapping areas of adjacent images in the sequence are aligned at the pixel level, the authors use a progressive local repainting strategy.

While keeping the overlapping areas unchanged, it generates harmonious neighboring areas and gradually extends from the reference view to the full 360°.

However, as shown in the figure below, the authors found that the overlapping areas also need refinement: a region previously seen only at an oblique angle covers more pixels when viewed head-on, so more high-frequency information has to be added.

To choose a refinement strength that preserves fidelity while improving quality, the authors draw on the projection theorem and the idea of image super-resolution and propose a simple, direct visibility-aware repainting strategy for refining the overlapping areas.

The refinement strength is set to 1 - cos θ*, where θ* is the maximum value of the angle θ between all previous camera views and the normal vector of the viewed surface, so that overlapping areas are repainted adaptively.

△ The relationship between camera angle and refinement strength
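The strength rule 1 - cos θ* is easy to evaluate directly. The sketch below is an illustration only: the function name and the clamping to the range 0° to 90° are assumptions, and θ* is taken as a given input rather than computed from camera poses and surface normals.

```python
import math


def refinement_strength(theta_star_deg):
    """Return 1 - cos(theta*) for an angle given in degrees.

    The angle is clamped to [0, 90] (an assumption of this sketch) so the
    strength stays within [0, 1].
    """
    theta = math.radians(min(max(theta_star_deg, 0.0), 90.0))
    return 1.0 - math.cos(theta)


# Strength grows as the earlier view becomes more oblique.
for angle in (0, 30, 60, 90):
    print(angle, round(refinement_strength(angle), 3))
```

A region already seen head-on (θ* = 0°) gets strength 0 and is left as is, while a region seen only at a grazing angle gets a strength near 1 and is almost fully regenerated.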

2. Fast and high-quality 3D reconstruction

As shown in the figure below, the authors use a two-stage approach for fast, high-quality 3D reconstruction.

△Repaint123 two-stage single-view 3D generation framework

First, they use a Gaussian Splatting representation to quickly generate a reasonable geometric structure and a coarse texture.

Then, with the help of the previously generated multi-view-consistent, high-quality image sequence, the authors can use a simple mean squared error (MSE) loss for fast 3D texture reconstruction.
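Because the target images are already consistent with each other, a plain pixel-wise MSE loss is enough to pull the texture toward them. The toy sketch below shows this with gradient descent in plain Python. It assumes rendering is the identity, so each texel maps to exactly one target pixel, which a real mesh renderer does not do; all names and hyperparameters are hypothetical.

```python
def mse(pred, target):
    """Mean squared error between two equally sized lists of pixel values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def refine_texture(texture, target, lr=0.5, steps=100):
    """Gradient descent on the MSE between texels and target pixels.

    Assumption: rendering is the identity, so the gradient of the loss with
    respect to each texel is simply 2 * (texel - target_pixel) / n.
    """
    tex = list(texture)
    n = len(tex)
    for _ in range(steps):
        grad = [2.0 * (p - t) / n for p, t in zip(tex, target)]
        tex = [p - lr * g for p, g in zip(tex, grad)]
    return tex


coarse = [0.2, 0.8, 0.5]       # coarse texture values (toy data)
target = [0.25, 0.7, 0.55]     # pixels from the repainted views (toy data)
refined = refine_texture(coarse, target)
```

With consistent targets the loss has a single well-defined minimum, which is one reason a direct MSE fit can converge faster than an SDS-style objective.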

Optimum for Consistency, Quality and Speed

Researchers compared multiple approaches for single-view generation tasks.

△Single-view 3D generation visualization comparison

On the RealFusion15 and Test-alpha datasets, Repaint123 achieved state-of-the-art results in all three respects: consistency, quality, and speed.

The authors also ran ablation experiments on the effectiveness of each module used in the paper and on the view-rotation increment.

They found that performance peaks at a viewing-angle interval of 60 degrees, but an overly large interval reduces the overlapping area and makes multi-face problems more likely, so 40 degrees can be used as the optimal viewing-angle interval.

Paper address: https://arxiv.org/pdf/2312.13271.pdf
Code address: https://pku-yuangroup.github.io/repaint123/
Project address: https://pku-yuangroup.github.io/repaint123/
