In recent years, the virtual digital human industry has boomed, and companies across many sectors are launching their own digital avatars. High-fidelity 3D hair models can significantly enhance the realism of these virtual humans. Unlike other parts of the human body, hair is hard to describe and extract because its intertwined strands form an extremely complex structure, which makes reconstructing a high-fidelity 3D hair model from a single view very difficult. Existing methods generally solve this problem in two steps: first estimating a 3D orientation field from the 2D orientation map extracted from the input image, and then synthesizing hair strands from the 3D orientation field. However, this mechanism still has some problems in practice.
Based on these observations, the researchers set out to find a fully automatic and efficient hair modeling method that can reconstruct a 3D hair model with fine-grained features from a single image (Figure 1) while remaining highly flexible, e.g. reconstructing a hair model requires only one forward pass of the network.
To solve these problems, researchers from Zhejiang University, ETH Zurich, and City University of Hong Kong proposed IRHairNet, which uses a coarse-to-fine strategy to generate high-fidelity 3D orientation fields. Specifically, they introduced a novel voxel-aligned implicit function (VIFu) to extract information from the 2D orientation map in the coarse module. At the same time, to compensate for the local details lost in the 2D orientation map, the researchers used the high-resolution luminance map to extract local features and combined them with the global features in the fine module for high-fidelity hair modeling.
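To make the idea of a voxel-aligned query concrete, the following is a minimal sketch, not the authors' implementation. It assumes that features have already been arranged in a 3D volume of shape (D, H, W, C) and that a continuous 3D query point reads its feature by trilinear interpolation; the function name `sample_voxel_features` and the volume layout are hypothetical. In a full model the sampled feature would be passed to a learned decoder that predicts occupancy and orientation at the point.

```python
import numpy as np

def sample_voxel_features(volume, point):
    """Trilinearly interpolate a feature volume at a continuous point.

    volume: array of shape (D, H, W, C), each spatial dimension >= 2 (assumed layout).
    point:  (z, y, x) in voxel-index coordinates; clamped to the volume.
    """
    dims = np.array(volume.shape[:3])
    p = np.clip(np.asarray(point, dtype=float), 0.0, dims - 1)
    # Lower corner of the enclosing cell, kept inside the volume.
    lo = np.clip(np.floor(p).astype(int), 0, dims - 2)
    t = p - lo  # fractional position inside the cell, each in [0, 1]

    out = np.zeros(volume.shape[3], dtype=float)
    for dz in (0, 1):
        for dy in (0, 1):
            for dx in (0, 1):
                # Weight of this corner is the product of per-axis weights.
                w = ((t[0] if dz else 1.0 - t[0])
                     * (t[1] if dy else 1.0 - t[1])
                     * (t[2] if dx else 1.0 - t[2]))
                out += w * volume[lo[0] + dz, lo[1] + dy, lo[2] + dx]
    return out

# Usage: a tiny 2x2x2 volume with one feature channel.
volume = np.arange(8, dtype=float).reshape(2, 2, 2, 1)
center_feature = sample_voxel_features(volume, (0.5, 0.5, 0.5))
```

Because the query point indexes the feature volume directly, nearby 3D points that differ in depth can receive different features, which is the property the term "voxel-aligned" suggests.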
To synthesize hair models efficiently from 3D orientation fields, the researchers introduced GrowingNet, a deep-learning-based hair growth method that uses a local implicit grid representation. It rests on a key observation: although the geometry and growth direction of hair differ globally, they share similar characteristics at a specific local scale. Therefore, a high-level latent code can be extracted for each local 3D orientation patch, and a neural implicit function (a decoder) is then trained to grow hair strands within the patch based on this latent code. After each growth step, a new local patch centered on the end of the hair strand is used to continue growing. Once trained, the network can be applied to 3D orientation fields at any resolution.
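The step-by-step growth described above can be illustrated with a small sketch. Note that this is not GrowingNet: the learned latent code and decoder are replaced here by a direct nearest-voxel lookup in the orientation field, which is closer in spirit to a traditional growth algorithm. The function name `grow_strand`, the (D, H, W, 3) field layout, and the step size are assumptions made for illustration.

```python
import numpy as np

def grow_strand(field, root, step=0.5, max_steps=200):
    """Trace one strand through a 3D orientation field.

    field: array of shape (D, H, W, 3) holding a direction (z, y, x) per voxel.
    root:  starting point (z, y, x) in voxel coordinates, assumed inside the volume.
    Returns an array of shape (N, 3) with the strand's points.
    """
    dims = np.array(field.shape[:3])
    points = [np.asarray(root, dtype=float)]
    for _ in range(max_steps):
        tip = points[-1]
        # Look up the direction at the voxel nearest to the strand tip.
        direction = field[tuple(np.rint(tip).astype(int))]
        norm = np.linalg.norm(direction)
        if norm < 1e-8:
            break  # no orientation here (e.g. outside the hair region)
        nxt = tip + step * direction / norm
        if np.any(nxt < 0) or np.any(nxt > dims - 1):
            break  # the strand would leave the volume
        points.append(nxt)
    return np.stack(points)

# Usage: a uniform field pointing along +x produces a straight strand.
field = np.zeros((4, 4, 8, 3))
field[..., 2] = 1.0
strand = grow_strand(field, root=(1, 1, 0), step=1.0)
```

In the method described in the article, each iteration would instead re-center a local patch on the strand tip and let the decoder predict the next points from that patch's latent code.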
Paper: https://arxiv.org/pdf/2205.04175.pdf
IRHairNet and GrowingNet form the core of NeuralHDHair. Specifically, the main contributions of this research include:
Figure 2 shows the pipeline of NeuralHDHair. For a portrait image, its 2D orientation map is first calculated and its luminance map is extracted. In addition, they are automatically aligned to the same bust reference model to obtain the bust depth map. These three maps are then fed into IRHairNet.
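As a rough illustration of the first two inputs, the sketch below derives a luminance map and a per-pixel 2D orientation map from an RGB image. The article does not say how the authors compute the orientation map, so a simple unsmoothed structure-tensor estimate is swapped in here as a stand-in; the function names are hypothetical, and the luminance weights are the standard Rec. 601 coefficients.

```python
import numpy as np

def luminance_map(rgb):
    """Convert an (H, W, 3) RGB image in [0, 1] to luminance (Rec. 601 weights)."""
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]

def orientation_map(lum):
    """Estimate a per-pixel strand orientation in [0, pi) from a luminance image.

    Stand-in technique: the dominant gradient direction is taken from the
    structure tensor, and the strand is assumed to run perpendicular to it.
    No smoothing is applied, so real images would give a noisy result.
    """
    gy, gx = np.gradient(lum)          # derivatives along rows (y) and columns (x)
    jxx, jyy, jxy = gx * gx, gy * gy, gx * gy
    grad_angle = 0.5 * np.arctan2(2.0 * jxy, jxx - jyy)
    return np.mod(grad_angle + np.pi / 2.0, np.pi)

# Usage: vertical stripes (intensity varies along x only).
x = np.linspace(0.0, 4.0 * np.pi, 32)
stripes = np.tile(0.5 + 0.5 * np.sin(x), (16, 1))
rgb = np.stack([stripes] * 3, axis=-1)
angles = orientation_map(luminance_map(rgb))
```

For the striped test image, the intensity changes only horizontally, so the estimated strand orientation is vertical (an angle of pi/2) at every pixel.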
For more method details, please refer to the original paper.
In this part, the researchers evaluate the effectiveness and necessity of each algorithm component through ablation studies (Section 4.1), and then compare the proposed method with current SOTA methods (Section 4.2). Implementation details and more experimental results can be found in the supplementary material.
The researchers evaluated the fidelity and efficiency of GrowingNet both qualitatively and quantitatively. First, three sets of experiments were conducted on synthetic data: 1) the traditional hair growth algorithm, 2) GrowingNet without the overlapping latent patch scheme, 3) the complete model.
As shown in Figure 4 and Table 1, compared with the traditional hair growth algorithm, GrowingNet has a clear advantage in running time while maintaining the same growth quality visually. In addition, comparing the third and fourth columns of Figure 4 shows that without the overlapping latent patch scheme, hair strands may be discontinuous at patch boundaries, and the problem is even more serious where the growth direction of the strands changes drastically. However, it is worth noting that this approach greatly improves efficiency at the expense of slightly reduced accuracy. Improving efficiency is of great significance for convenient and efficient application in human body digitization.
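One simple way to picture an overlapping patch layout is to compute patch start indices along each axis so that neighboring patches share a band of voxels. The sketch below is an illustration under assumed parameters, not the paper's scheme; the function name `patch_starts` and the choice to add a final patch flush with the boundary are hypothetical.

```python
def patch_starts(length, size, overlap):
    """Start indices of patches of `size` voxels covering an axis of `length` voxels.

    Neighboring patches share `overlap` voxels. A final patch flush with the
    end of the axis is added when the regular stride does not reach it.
    """
    if size > length:
        raise ValueError("patch size must not exceed the axis length")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    stride = size - overlap
    starts = list(range(0, length - size + 1, stride))
    if starts[-1] != length - size:
        starts.append(length - size)
    return starts

# Usage: a 16-voxel axis split into 8-voxel patches.
no_overlap = patch_starts(16, 8, 0)    # patches meet exactly at voxel 8
with_overlap = patch_starts(16, 8, 4)  # voxel 8 now lies inside a patch
```

With no overlap, a strand crossing voxel 8 switches abruptly from one patch to the next; with an overlap of 4, the middle patch covers that boundary, which is the kind of shared context that can keep strands continuous.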
To evaluate the performance of NeuralHDHair, the researchers compared it with several SOTA methods [6, 28, 30, 36, 40]. Among them, Autohair [6] takes a data-driven approach to hair synthesis, while HairNet [40] skips the hair growth process to achieve end-to-end hair modeling. In contrast, [28, 36] follow a two-step strategy, first estimating a 3D orientation field and then synthesizing hair strands from it. PIFuHD [30] is a monocular high-resolution 3D modeling method based on a coarse-to-fine strategy, which can also be used for 3D hair modeling.
As shown in Figure 6, the results of HairNet are unsatisfactory: the local details and even the overall shape are inconsistent with the hair in the input image. This is because the method synthesizes hair in a simple and crude way, recovering disordered hair strands directly from a single image.
The reconstruction results are also compared with Autohair [6] and Saito [28]. As shown in Figure 7, although Autohair can synthesize realistic results, they do not match the structure of the input image well because the database contains a limited set of hairstyles. Saito's results, on the other hand, lack local details and have shapes inconsistent with the input image. In contrast, the results of this method better preserve the global structure and local details of the hair while keeping the hair shape consistent with the input.
PIFuHD [30] and Dynamic Hair [36] both aim to estimate high-fidelity 3D hair geometry in order to generate realistic hair strand models. Figure 8 shows two representative comparison results. The pixel-aligned implicit function used in PIFuHD cannot fully depict complex hair, so its results are overly smooth, lack local details, and do not even have a reasonable global structure. Dynamic Hair produces more reasonable results but with less detail: the hair growth trend in its results matches the input image well, but many local structural details (such as hierarchy) are not captured, especially for complex hairstyles. In contrast, the proposed method adapts to different hairstyles, even extremely complex structures, and makes full use of global features and local details to generate high-fidelity, high-resolution 3D hair models with more detail.