
CVPR 2024 high-scoring paper: New generative editing framework GenN2N, unifying NeRF conversion tasks

Apr 19, 2024 09:40 PM


The AIxiv column of our website publishes academic and technical content. In the past few years, the AIxiv column has received more than 2,000 submissions covering top laboratories at major universities and companies around the world, helping to promote academic exchange and dissemination. If you have excellent work to share, please feel free to submit it or contact us for coverage. Submission email: liyazhou@jiqizhixin.com; zhaoyunfeng@jiqizhixin.com.


Researchers from the Hong Kong University of Science and Technology and Tsinghua University propose "GenN2N", a unified generative NeRF-to-NeRF translation framework. It is applicable to a variety of NeRF translation tasks, such as text-driven NeRF editing, colorization, super-resolution, and inpainting, with excellent performance.


  • Paper address: https://arxiv.org/abs/2404.02788
  • Paper homepage: https://xiangyueliu.github.io/GenN2N/
  • GitHub address: https://github.com/Lxiangyue/GenN2N
  • Paper title: GenN2N: Generative NeRF2NeRF Translation

In recent years, Neural Radiance Fields (NeRF) have attracted widespread attention in 3D reconstruction, 3D generation, and novel view synthesis due to their compactness, high quality, and versatility. However, once a NeRF scene is created, these methods often lack further control over the resulting geometry and appearance. NeRF editing has therefore recently become a research focus worthy of attention.

Current NeRF editing methods are usually task-specific, for example text-driven NeRF editing, super-resolution, inpainting, and colorization, and they require a large amount of task-specific domain knowledge. In the field of 2D image editing, developing universal image-to-image translation methods has become a trend; for example, the 2D generative model Stable Diffusion supports multi-functional image editing. We therefore propose universal NeRF editing built on underlying 2D generative models.

A challenge that comes with this is the representation gap between NeRF and 2D images: image editors often generate multiple inconsistent edits for different viewpoints. A recent text-based NeRF editing method, Instruct-NeRF2NeRF, explores this. It adopts a "render-edit-aggregate" process that gradually updates the NeRF scene by rendering multi-view images, editing these images, and aggregating the edited images back into the NeRF. However, after extensive optimization for a specific editing request, this approach can produce only a single editing result; if the user is not satisfied, the whole attempt must be repeated.

Therefore, we propose "GenN2N", a general NeRF-to-NeRF framework suitable for a variety of NeRF editing tasks. Its core is to use a generative method to model the multi-solution nature of the editing process, so that generative editing can easily produce a large number of editing results that meet the requirements, for users to choose from.

At the core of GenN2N: 1) a generative 3D VAE-GAN framework is introduced, in which a VAE represents the entire editing space to learn the distribution of all possible 3D NeRF edits corresponding to a set of input 2D edited images, and a GAN provides reasonable supervision across different views of the edited NeRF to ensure the realism of the editing results; 2) contrastive learning is used to decouple editing content and viewpoint, ensuring consistency of the editing content between different viewpoints; 3) at inference time, the user simply samples multiple editing codes from the conditional generative model to generate various 3D editing results corresponding to the editing target.

Compared with SOTA methods for various NeRF editing tasks (e.g., ICCV 2023 Oral work), GenN2N is superior to existing methods in editing quality, diversity, and efficiency.

Method introduction

We first perform 2D image editing, and then lift these edited 2D images into a 3D NeRF, achieving generative NeRF-to-NeRF translation.


A. Latent Distill

We use the Latent Distill Module as the VAE encoder to learn, for each edited image, an implicit editing code that controls the generated content during NeRF-to-NeRF translation. Under a KL-loss constraint, all editing codes obey a well-behaved normal distribution for easier sampling. To decouple editing content and viewpoint, we carefully design a contrastive learning scheme that encourages the editing codes of images with the same editing style but different viewpoints to be similar, and the editing codes of images with different editing styles but the same viewpoint to be far apart.
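The two constraints above (a KL term toward a standard normal, plus a contrastive term over editing codes) can be sketched in NumPy. This is a minimal illustration, not the paper's implementation: the function names, the InfoNCE-style formulation, and the temperature value are our own assumptions.

```python
import numpy as np

def kl_to_standard_normal(mu, logvar):
    # KL(N(mu, diag(exp(logvar))) || N(0, I)) for one diagonal Gaussian;
    # this is the term that keeps editing codes easy to sample later.
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)

def contrastive_loss(codes, style_ids, temperature=0.1):
    # InfoNCE-style loss over cosine similarity: codes that share an editing
    # style (but come from different viewpoints) are pulled together, codes
    # of different styles are pushed apart.
    z = codes / np.linalg.norm(codes, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(style_ids)
    total, count = 0.0, 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        log_denom = np.log(np.sum(np.exp(sim[i, others])))
        for j in others:
            if style_ids[j] == style_ids[i]:
                total += log_denom - sim[i, j]  # -log softmax prob of positive
                count += 1
    return total / count
```

As a sanity check, codes clustered by editing style yield a lower contrastive loss than the same codes with styles assigned across clusters.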

B. NeRF-to-NeRF Translation (Translated NeRF)

We use NeRF-to-NeRF Translation as the VAE decoder, which takes the editing code as input and modifies the original NeRF into a translated NeRF. We add residual layers between the hidden layers of the original NeRF network; these residual layers take the editing code as input to modulate the hidden-layer neurons, so that the translated NeRF not only retains the original NeRF's information but also controls the 3D translation content based on the editing code. At the same time, NeRF-to-NeRF Translation serves as the generator in generative adversarial training. By generating rather than optimizing, we can obtain multiple translation results at once, significantly improving NeRF translation efficiency and result diversity.
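The idea of a code-modulated residual branch can be sketched as follows. This is a minimal NumPy illustration under our own assumptions (a single hidden activation, a tanh residual branch, illustrative dimensions); the paper's actual layer design may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_modulation(h, edit_code, W_h, W_z, b):
    # Residual branch conditioned on the editing code: the original NeRF
    # hidden activation h is preserved, and a code-dependent offset is added.
    delta = np.tanh(h @ W_h + edit_code @ W_z + b)
    return h + delta

hidden_dim, code_dim = 8, 4          # illustrative sizes
W_h = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
W_z = rng.normal(scale=0.1, size=(code_dim, hidden_dim))
b = np.zeros(hidden_dim)

h = rng.normal(size=hidden_dim)      # a hidden activation of the original NeRF
z0 = np.zeros(code_dim)              # one editing code
z1 = rng.normal(size=code_dim)       # a different editing code
out0 = residual_modulation(h, z0, W_h, W_z, b)
out1 = residual_modulation(h, z1, W_h, W_z, b)
# Different editing codes modulate the same scene features into different
# edited activations, while the residual connection retains the original h.
```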

C. Conditional Discriminator

Rendered images of the translated NeRF constitute the generation space to be judged. The editing styles and rendering viewpoints of these images vary, making this generation space very complex, so we provide a condition as additional information for the discriminator. Specifically, when the discriminator examines a rendered image from the generator (negative sample) or an edited image from the training data (positive sample), we select an edited image of the same viewpoint from the training data as the condition, which prevents the discriminator from being distracted by viewpoint factors when distinguishing positive and negative samples.
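One common way to condition a discriminator, sketched here under our own assumptions (channel-wise concatenation of the candidate with the same-view condition image; the paper may condition differently):

```python
import numpy as np

def discriminator_input(candidate, condition):
    # Concatenate along the channel axis: because the condition is an edited
    # training image from the SAME viewpoint as the candidate, the
    # real-vs-fake judgment is not confounded by viewpoint differences.
    return np.concatenate([candidate, condition], axis=0)

candidate = np.zeros((3, 8, 8))  # rendered (negative) or edited (positive) image
condition = np.ones((3, 8, 8))   # same-viewpoint edited image from training data
x = discriminator_input(candidate, condition)  # 6-channel discriminator input
```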

D. Inference

After GenN2N optimization, users can randomly sample editing codes from the normal distribution and feed them to the translated NeRF to generate edited, high-quality, multi-view-consistent 3D NeRF scenes.
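Because the KL constraint pushes the editing-code distribution toward a standard normal, drawing diverse edits at inference reduces to plain Gaussian sampling. A tiny sketch (the code dimensionality is illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(42)
code_dim, num_samples = 4, 5   # illustrative sizes
# Sample several editing codes from N(0, I); feeding each one to the
# translated NeRF yields a distinct 3D editing result to choose from.
edit_codes = rng.standard_normal((num_samples, code_dim))
```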

Experiments

We conducted extensive experiments on various NeRF-to-NeRF tasks, including text-driven NeRF editing, colorization, super-resolution, inpainting, and more. Experimental results demonstrate GenN2N's superior editing quality, multi-view consistency, generative diversity, and editing efficiency.

A. Text-based NeRF editing
B. NeRF colorization
C. NeRF super-resolution
D. NeRF inpainting
Comparative experiments

Our method is compared qualitatively and quantitatively with SOTA methods for various specific NeRF tasks (including text-driven editing, colorization, super-resolution, and inpainting). The results show that GenN2N, as a general framework, performs as well as or better than task-specific SOTA methods, while its editing results have greater diversity (below is a comparison between GenN2N and Instruct-NeRF2NeRF on the text-based NeRF editing task).

A. Text-based NeRF editing
For more experiments and method details, please refer to the paper homepage.

Team introduction

This paper comes from Prof. Tan Ping's team at the Hong Kong University of Science and Technology, the 3DVICI Lab at Tsinghua University, the Shanghai Artificial Intelligence Laboratory, and the Shanghai Qi Zhi Institute. The authors of the paper are Liu Xiangyue (HKUST), Xue Han (Tsinghua University), and Luo Kunming (HKUST), advised by Prof. Yi Li of Tsinghua University and Prof. Tan Ping of HKUST.


