With generative artificial intelligence and computer graphics advancing rapidly, research in both fields is attracting growing attention, and industries such as film and television production and game development face major challenges and opportunities. This article introduces DreamFace, a research project in the field of 3D generation. It is the first text-guided progressive 3D generation framework that supports Production-Ready 3D asset generation, producing hyper-realistic 3D digital humans from text.
The work has been accepted by ACM Transactions on Graphics, a top international journal in computer graphics, and will be presented at SIGGRAPH 2023, a top international conference in the field.
Project website: https://sites.google.com/view/dreamface
Preprint version of the paper: https://arxiv.org/abs/2304.03117
Web demo: https://hyperhuman.top
HuggingFace Space: https://huggingface.co/spaces/DEEMOSTECH/ChatAvatar
Introduction
Since the major breakthroughs in text and image generation, 3D generation has increasingly become a focus of both research and industry. However, the 3D generation technologies currently available still face many challenges, including CG pipeline compatibility, accuracy, and running speed.
To address these problems, the R&D team from Deemos Technology and ShanghaiTech University proposed DreamFace, a text-guided progressive 3D generation framework. The framework directly generates 3D assets that comply with CG production standards, with higher accuracy, faster running speed, and better CG pipeline compatibility. This article describes the main functions of DreamFace and explores its application prospects in film and television production, game development, and other industries.
DreamFace Framework Overview
The DreamFace framework consists of three main modules: geometry generation, physically based material diffusion generation, and animation capability generation. The three modules complement one another to make 3D generation efficient and reliable.
Geometry generation
The core task of the geometry generation module is to generate a geometric model consistent with the text prompt. DreamFace adopts a selection framework based on CLIP (Contrastive Language-Image Pre-Training): it first selects the best coarse geometric model from candidates randomly sampled within the face geometry parameter space, then sculpts geometric details with a Latent Diffusion Model (LDM) so that the head model matches the text prompt more closely. The framework also supports generating hair style and color from text prompts.
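The CLIP-based selection step in geometry generation can be illustrated with a minimal sketch. It assumes that each sampled candidate and the text prompt have already been turned into embedding vectors; the vectors below are hypothetical stand-ins, not real CLIP outputs, and the function names are illustrative rather than part of DreamFace. The sketch only shows the idea of scoring candidates by similarity to the text and keeping the best one.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def select_best_candidate(text_embedding, candidate_embeddings):
    """Return (index, score) of the candidate most similar to the text."""
    scores = [cosine_similarity(text_embedding, c) for c in candidate_embeddings]
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return best_index, scores[best_index]


# Hypothetical stand-ins: in a real pipeline these vectors would come from a
# CLIP text encoder and from CLIP image embeddings of rendered candidates.
text_embedding = [1.0, 0.0, 1.0]
candidate_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 1.1],
    [-1.0, 0.0, -1.0],
]

best_index, best_score = select_best_candidate(text_embedding, candidate_embeddings)
print(best_index, round(best_score, 3))
```

Here the third candidate (index 2) is selected because its direction is closest to that of the text embedding. In the framework described above, the selected coarse model would then be passed on for detail sculpting.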
Physically based material diffusion generation
The physically based material diffusion generation module predicts facial textures that are consistent with the predicted geometry and the text prompt. DreamFace first fine-tunes a pre-trained LDM on a collected large-scale UV material dataset to obtain two LDM diffusion models. A joint training scheme then coordinates the two diffusion processes: one directly denoises UV texture maps, and the other is supervised by rendered images.
To ensure that the generated texture maps are free of undesirable features and unwanted lighting while still remaining diverse, the team designed a prompt learning strategy. Two methods are used to generate high-quality diffuse maps:
(1) Prompt tuning. Instead of relying on hand-crafted domain-specific text prompts, DreamFace combines two domain-specific continuous text prompts, Cd and Cu, with the corresponding text prompts. These are optimized during U-Net denoiser training, which avoids unstable and time-consuming manual prompt writing.
(2) Masking of non-face areas. The LDM denoising process is additionally constrained by a non-face area mask so that the resulting diffuse map contains no unwanted elements.
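One common way to constrain an iterative denoising process with a mask is to re-impose a fixed value outside the mask after every step. The sketch below illustrates that idea on a tiny grid. It is an assumption made for illustration, not necessarily how DreamFace applies its non-face mask, and `toy_denoise_step` is a hypothetical stand-in for a real denoiser.

```python
def masked_blend(denoised, fill, face_mask):
    """Keep denoised values where mask is 1, fixed fill values where it is 0."""
    return [
        [m * d + (1.0 - m) * f for d, f, m in zip(d_row, f_row, m_row)]
        for d_row, f_row, m_row in zip(denoised, fill, face_mask)
    ]


def toy_denoise_step(x, target, rate=0.5):
    """Hypothetical denoiser: move every pixel part of the way to a target."""
    return [
        [v + rate * (t - v) for v, t in zip(x_row, t_row)]
        for x_row, t_row in zip(x, target)
    ]


# 2x3 toy "UV map": the last column is treated as a non-face area.
face_mask = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
target = [[0.8, 0.6, 0.3], [0.7, 0.5, 0.9]]   # what the denoiser drifts toward
fill = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]     # fixed content for non-face areas
x = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]        # noisy starting point

for _ in range(10):
    x = toy_denoise_step(x, target)
    x = masked_blend(x, fill, face_mask)  # constraint applied at every step

print(x)
```

After the loop, the face pixels have converged toward the denoiser's target, while the non-face column stays at the fill value regardless of what the denoiser proposed there.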
Finally, 4K physically based textures are generated via the super-resolution module for high-quality rendering.
Animation capability generation
Models generated by DreamFace have animation capabilities. The framework predicts unique deformations and uses them to animate the generated neutral model, producing personalized animations. Compared with approaches that use generic BlendShapes for expression control, DreamFace's neural facial animation approach delivers finer expression detail and captures the fine detail of a performance.
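For readers unfamiliar with blendshape-style animation, the following minimal sketch shows the standard linear formula: each animated vertex is the neutral vertex plus a weighted sum of per-expression offsets. It assumes the deformations can be expressed as per-vertex offsets; the mesh, expression names, and offsets are hypothetical, and how DreamFace predicts its personalized deformations is not covered here.

```python
def apply_blendshapes(neutral, deltas, weights):
    """Return neutral vertices displaced by a weighted sum of offsets.

    neutral: list of (x, y, z) vertex positions of the neutral model
    deltas:  dict mapping expression name -> list of (dx, dy, dz) offsets
    weights: dict mapping expression name -> activation weight
    """
    result = [list(v) for v in neutral]
    for name, weight in weights.items():
        for vertex, offset in zip(result, deltas[name]):
            for axis in range(3):
                vertex[axis] += weight * offset[axis]
    return [tuple(v) for v in result]


# Hypothetical two-vertex "mesh" with character-specific offsets.
neutral = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
deltas = {
    "smile": [(0.0, 0.1, 0.0), (0.0, 0.2, 0.0)],
    "jaw_open": [(0.0, -0.3, 0.0), (0.0, 0.0, 0.0)],
}

animated = apply_blendshapes(neutral, deltas, {"smile": 0.5, "jaw_open": 1.0})
print(animated)
```

With generic blendshapes, the same `deltas` would be shared across all characters. Personalized animation corresponds to using offsets specific to each generated model.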
Applications and Outlook
The DreamFace framework achieves excellent results both in celebrity generation and in generating characters from descriptions. It also supports texture editing with prompts and sketches, enabling global editing effects such as aging and makeup. Combined further with masks or sketches, it can create effects such as tattoos, beards, and birthmarks.
DreamFace's progressive generation framework offers an effective solution to complex 3D generation tasks and is expected to encourage more research and technical development along similar lines. In addition, physically based material diffusion generation and animation capability generation should promote the adoption of 3D generation technology in film and television production, game development, and related industries. Its future development and applications are worth watching.