A three-dimensional hair reconstruction method with adaptive Gaussian ellipsoids that is robust to complex lighting

By combining adaptive Gaussian ellipsoid density control with a reconstruction pipeline that is robust to complex lighting, the method addresses the robustness and accuracy problems of 3D hair reconstruction under non-ideal lighting. It achieves efficient, highly adaptable hair-level reconstruction whose output can be used in mainstream graphics engines.

CN122289553A · Pending · Publication Date: 2026-06-26 · TIANJIN UNIV


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: TIANJIN UNIV
Filing Date: 2026-04-02
Publication Date: 2026-06-26


IPC: G06T17/00; G06N3/09; G06T15/50; G06N5/04

AI Tagging

Technology Topics

Algorithm Computer graphics

Technical Efficacy Phrases

Improve rebuild speed; No human intervention required


Abstract

This invention discloses a robust 3D hair reconstruction method with adaptive Gaussian ellipsoids and complex lighting, relating to the fields of 3D vision and computer graphics. Addressing the problems of sparse Gaussian ellipsoid distribution, unreliable orientation supervision signals, and geometric and texture coupling interference in existing multi-view hair reconstruction methods under non-ideal lighting, this invention employs a two-stage approach. It utilizes anisotropic Gaussian units to implicitly encode hair orientation and directly transfers the photometric constraint gradient to hair nodes through hair-aligned Gaussian dual representation, achieving efficient and high-precision hair-level reconstruction. Simultaneously, an adaptive lighting preprocessing flow is constructed, using a visual language model to locate low-quality viewpoints and dynamically setting the upper limit of Gaussian density. A three-stage adaptive density strategy is used to fill geometric gaps. Decoupled two-stage optimization is employed to eliminate lighting bias and restore true hair color. Furthermore, confidence-weighted orientation loss and 3D interpolation-guided upsampling are used to improve the orientation field quality and constraint coverage.


Description

Technical Field

[0001] This invention belongs to the field of 3D vision and computer graphics, and relates to a 3D hair reconstruction method with adaptive Gaussian ellipsoids that is robust to complex lighting. Specifically, it relates to a method for reconstructing a 3D hair model with hair-level accuracy from a multi-view image sequence under both ideal and non-ideal lighting conditions.

Background Technology

[0002] With the metaverse being incorporated into the key development direction of the future information field, high-fidelity digital human modeling has become a crucial breakthrough. Among these, 3D hair reconstruction, as the core of digital human appearance features, aims to recover the 3D geometry and appearance of hair with hair-level precision from multi-view image sequences. It is widely used in film and television special effects, game development, virtual makeup try-on, and digital cultural heritage preservation. Currently, although related technologies have made some progress, they still face many bottlenecks in practical applications, and a mature solution adaptable to complex scenarios has not yet been formed.

[0003] The core shortcomings of existing 3D hair reconstruction technologies lie in the accuracy and efficiency of their implementation. Regarding the construction of the 3D orientation field, current methods generally rely on 2D orientation maps calculated using Gabor filters as a source of supervision. However, these maps are susceptible to interference from complex factors such as specular highlights, shadows, and hair occlusion, resulting in limited accuracy and inherent ambiguity. This makes it difficult to accurately distinguish directional information, leading to an inability to meet the demands of detailed hair reconstruction. In terms of differentiable rendering and reconstruction efficiency, existing technologies either convert hair strands into fixed-width quadrilaterals for soft rasterization, failing to accurately simulate the realistic appearance of hair in arbitrary spatial orientations and reducing the accuracy of photometric constraints in geometric supervision; or the reconstruction process based on NeuS volumetric rendering is excessively time-consuming, requiring approximately three days for a single object reconstruction, severely hindering practical deployment.

[0004] Beyond accuracy and efficiency, existing technologies are not robust under non-ideal lighting conditions and exhibit numerous derivative defects. In low-light, overexposed, or backlit scenes, the Gaussian ellipsoid distribution in 3D Gaussian Splatting (3DGS) is sparse, and the supervision signal from the 2D orientation map becomes unreliable. Furthermore, color deviations introduced by illumination enhancement operations cause interference between geometric optimization and texture modeling, making it difficult to maintain high-quality reconstruction results. Existing technologies also suffer from reliance on external prior decoders, insufficient geometric accuracy for complex hairstyles, insufficient pseudo-ground-truth supervision signals under non-ideal lighting, an inability to resolve geometric holes in end-to-end reconstruction, and shape ambiguity in single-view methods due to insufficient input information. Together, these defects limit the extension of 3D hair reconstruction technology to complex, realistic scenes.

Summary of the Invention

[0005] (I) The technical problems to be solved by the present invention:

[0006] (1) The robustness of 3D hair reconstruction under non-ideal lighting conditions is insufficient. In non-ideal lighting scenarios such as low light, overexposure, and backlight, existing technologies suffer from sparse Gaussian ellipsoid distribution and unreliable supervision signals from the 2D hair orientation map. At the same time, the color deviation introduced by lighting enhancement couples geometric optimization with texture modeling so that the two interfere with each other, making it difficult to maintain high-quality hair reconstruction under complex lighting conditions. Derivative problems such as geometric holes and distorted hair color restoration also arise.

[0007] (2) It is difficult to balance accuracy and efficiency in three-dimensional hair reconstruction, and its adaptability is poor. Existing technologies either suffer from insufficient expression of hair geometric details and low accuracy of photometric constraint supervision due to the use of soft rasterization and volumetric rendering, or the reconstruction process is too time-consuming, which restricts practical deployment; or they rely on external prior decoders, which have insufficient geometric accuracy under complex hairstyles. At the same time, single-view methods have shape ambiguity and lack the ability to adapt to different lighting conditions and hairstyle complexity, resulting in high costs for manual intervention.

[0008] The purpose of this invention is to address the core problems of existing 3D hair reconstruction methods under non-ideal lighting conditions: sparse Gaussian ellipsoid distribution, unreliable orientation map supervision, and mutual interference between geometry and texture optimization. To this end, the invention proposes a 3D hair reconstruction method with adaptive Gaussian ellipsoids that is robust to complex lighting. It does so by constructing an adaptive lighting preprocessing workflow, introducing a visual language model to guide three-stage density control, and designing a decoupled geometry-texture optimization strategy. The method extends multi-view hair reconstruction from ideal, controlled lighting environments to real, complex lighting scenes, and from simple hairstyles to complex curly hair, achieving high-precision, high-efficiency hair-level 3D reconstruction.

[0009] (II) To achieve the above objectives, the present invention adopts the following technical solution: a 3D hair reconstruction method with adaptive Gaussian ellipsoids that is robust to complex lighting, comprising the following steps:
S1. Preprocess the input multi-view video sequence, including frame extraction and quality screening, camera parameter initialization, hair segmentation mask generation, and hair orientation map calculation, to obtain a standardized training view set.
S2. Build an illumination-adaptive preprocessing workflow based on low-light enhancement, histogram equalization, and image quality scoring to enhance unevenly lit viewpoints and select 64 to 256 high-quality training frames, preferably 128.
S3. Introduce a visual language model, preferably Qianwen 2.5-VL, to evaluate the illumination level and hairstyle complexity of the input images, automatically locate the worst viewpoint, and dynamically determine the upper limit of Gaussian ellipsoid density.
S4. Perform three-stage adaptive density control on unstructured 3D Gaussian primitives: the first stage globally densifies the unreconstructed space, the second stage locally densifies the matched Gaussian ellipsoids, and the third stage selectively densifies the hair region at the worst-quality viewpoint.
S5. Use a hair-prior variational autoencoder (VAE) model based on the diffusion model (EDM) to constrain the geometric distribution of hair strands, and expand sparse guide hair strands into dense rendered hair strands through K-nearest-neighbor weighted 3D coordinate interpolation.
S6. Perform two-stage geometry optimization: the coarse optimization stage jointly optimizes the hair geometry and color on the illumination-enhanced images, and the fine optimization stage refines the explicit hair node coordinates and constrains the hair outline through edge loss.
S7. Perform decoupled geometry-texture optimization: with the geometry parameters fixed, only the spherical harmonic color coefficients are optimized on the original images, eliminating the color deviation introduced by lighting enhancement, and the final 3D hair model is output.

[0010] Preferably, the preprocessing in S1 mainly includes the following steps:
S101. Extract video frames at a rate of 3 frames per second (FPS): use an image quality assessment network to assign a no-reference quality score to the candidate frames within each 1/3-second interval, and select the highest-scoring frame in each interval.
S102. Use the COLMAP structure-from-motion (SfM) algorithm to extract features from the selected images and perform incremental reconstruction, estimating the sparse point cloud and the intrinsic and extrinsic parameters of each camera at the same time. In the first stage, a learnable camera parameterization strategy is introduced, and residual refinement is performed with the SfM result as the initial value.
S103. Adopt a prompt-based image matting system. Three prompt words, "hair", "face" and "human", generate the corresponding hair, face and human body masks, and low-quality frames in which the intersection area of the hair and face masks exceeds 10% of the human body mask area are filtered out.
S104. Calculate the 2D hair orientation map of the training image using a Gabor filter bank. Let the response of the Gabor filter bank at pixel $p$ in the orientation $\theta_n$ be $r_n(p)$. Then the orientation map value of that pixel is:

$$\hat{\theta}(p) = \frac{1}{2}\,\arg\left(\sum_{n=1}^{N} r_n(p)\, e^{\,i 2 \theta_n}\right)$$

[0011] In the formula, $\hat{\theta}(p)$ represents the predicted hair direction angle at pixel $p$; $\arg(\cdot)$ represents the argument operation of complex numbers; $n$ indicates the direction index ($n = 1, \dots, N$); $N$ indicates the total number of directions of the Gabor filter bank; $r_n(p)$ represents the Gabor filter response of pixel $p$ in the $n$-th direction; $\theta_n$ indicates the $n$-th sampling direction angle; and $e^{i 2 \theta_n}$ is a complex exponential term whose introduction gives the direction estimate a periodicity of $\pi$, matching the inherent ambiguity of hair direction.
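As an illustration of S104, the following is a minimal per-pixel sketch of the doubled-angle aggregation. It assumes the sampling angles are uniformly spaced as n·π/N with a zero-based index and that the responses are non-negative magnitudes; the function name `orientation_from_gabor` is hypothetical and none of these choices are stated in the source.

```python
import cmath
import math


def orientation_from_gabor(responses):
    """Return a pi-periodic orientation angle in [0, pi) for one pixel.

    responses[n] is the Gabor response magnitude for the n-th sampling
    direction. ASSUMPTION: directions are uniform, theta_n = n * pi / N.
    """
    n_dirs = len(responses)
    acc = 0j
    for n, r in enumerate(responses):
        theta_n = n * math.pi / n_dirs
        # Doubling the angle makes theta and theta + pi vote identically.
        acc += r * cmath.exp(2j * theta_n)
    if acc == 0:
        return 0.0  # no dominant direction at this pixel
    # Halve the argument to undo the doubling, then wrap into [0, pi).
    return (0.5 * cmath.phase(acc)) % math.pi
```

For example, a response concentrated in the third of four directions yields an angle of about π/2, and equal responses in the first two directions yield about π/8, halfway between them.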

[0012] Preferably, the illumination-adaptive preprocessing procedure in S2 includes:
S201, HVI Low-Light Enhancement: Convert the input RGB image to the HVI (Horizontal/Vertical-Intensity) color space, separating the luminance channel from the color channels, and adaptively enhance the luminance channel through a learnable gain network.

[0013] Here, the learnable gain network, with its learnable parameters, takes the luminance channel of the original image and produces the adaptively enhanced luminance channel. The enhancement result is then converted back to RGB space, preserving color information while improving detail in dark areas.
S202, Histogram Equalization: Contrast-limited adaptive histogram equalization (CLAHE) is applied to the HVI-enhanced image to further stretch the image contrast and mitigate the loss of detail caused by local overexposure or underexposure.
S203, Q-Align Quality Score: The enhanced image is fed into the Q-Align image quality assessment model, and a comprehensive quality score is calculated for each frame:

$$s_i = Q(I_i)$$

[0014] In the formula, $s_i$ indicates the overall quality score of the $i$-th frame image, $Q(\cdot)$ represents the Q-Align image quality assessment model, and $I_i$ indicates the $i$-th frame image. After sorting by score from high to low, 128 high-quality views are selected using a uniform-interval strategy for the subsequent reconstruction process.
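The source does not spell out the uniform-interval strategy. One possible reading, sketched below, ranks frames by score and then walks the ranked list at a constant stride. The function name `select_views` and the stride rule are assumptions for illustration only, not the patent's stated procedure.

```python
def select_views(scores, k=128):
    """Pick k frame indices from a list of per-frame quality scores.

    ASSUMPTION: "uniform interval" is read as a constant stride over the
    frames ranked from highest to lowest score.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if k >= len(ranked):
        return ranked  # fewer frames than requested: keep them all
    stride = len(ranked) / k
    return [ranked[int(j * stride)] for j in range(k)]
```

With four frames scored 0.1, 0.9, 0.5 and 0.7 and k = 2, this returns the best frame (index 1) and the third-ranked frame (index 2).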

[0015] Preferably, the illumination evaluation and complexity evaluation based on the visual language model in S3 include:
S301, Illumination Level Assessment: Design prompts containing an 18-level illumination quality scale. Input the multi-view images frame by frame into Qianwen 2.5-VL to obtain an illumination quality score for each frame, and locate the worst-quality viewpoint as the one with the lowest score:

$$v^{*} = \arg\min_{i}\; S_{\mathrm{VLM}}(I_i)$$

[0016] In the formula, $v^{*}$ indicates the worst viewpoint, $\arg\min_{i}$ represents the index that minimizes the objective function, and $S_{\mathrm{VLM}}(I_i)$ represents the illumination quality score given by the visual language model to the $i$-th frame image.
S302, Hairstyle Complexity Assessment: Design prompts containing an 8-level hairstyle complexity scale. Feed the multi-view images into Qianwen 2.5-VL for overall hairstyle analysis, and automatically map the complexity rating to the corresponding upper limit of Gaussian ellipsoid density, $N_{\max}$. The higher the complexity, the larger $N_{\max}$, to accommodate the different levels of geometric detail required by various hairstyles.
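The two S3 decisions can be sketched as follows. The scores are assumed to have already been returned by the visual language model, and the linear mapping with `base` and `per_level` values is purely hypothetical: the source only states that a higher complexity rating maps to a larger density limit.

```python
def worst_view(illumination_scores):
    """Index of the frame with the lowest illumination quality score."""
    return min(range(len(illumination_scores)),
               key=lambda i: illumination_scores[i])


def density_cap(complexity_level, base=100_000, per_level=50_000):
    """Map an 8-level complexity rating to a Gaussian count upper limit.

    ASSUMPTION: a linear, monotonically increasing mapping with made-up
    constants; the source gives no concrete values.
    """
    if not 1 <= complexity_level <= 8:
        raise ValueError("complexity_level must be in 1..8")
    return base + (complexity_level - 1) * per_level
```

Any monotonic lookup table would serve equally well here; the linear form is chosen only to keep the sketch short.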

[0017] Preferably, the three-stage adaptive density control strategy in S4 includes:
S401, First Stage (iteration steps 500 to 15000): Adopt the standard 3DGS adaptive density control strategy: clone or split Gaussian ellipsoids whose view-space gradient exceeds the threshold, prioritizing the geometric voids in the scene that have not yet been reconstructed, and perform opacity pruning every 3000 steps to remove redundant primitives.
S402, Second Stage (iteration steps 15000 to 30000): Pause the global pruning operation, and every 100 steps perform local cloning and splitting on the Gaussian ellipsoids that have been matched to hair strands. Refining the Gaussian distribution of the matched regions improves geometric detail while avoiding the noise introduced by excessive densification.
S403, Third Stage: At the worst viewpoint $v^{*}$, perform targeted densification on the region within the hair segmentation mask, raising the number of Gaussian ellipsoids in the hair region at that viewpoint up to the density upper limit $N_{\max}$. The orientation of the newly added primitives is kept plausible by the directional loss.
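The stage boundaries above can be expressed as a small scheduler. This is a sketch under several assumptions that the source does not state: step 15000 is assigned to the second stage, the first-stage densification interval defaults to 100 steps, and the third stage is taken to begin at step 30000. The action labels are hypothetical names.

```python
def density_actions(step, densify_every=100, stage_three_start=30000):
    """Return the density-control actions to run at a training step.

    ASSUMPTIONS: densify_every and stage_three_start are illustrative
    defaults; the source fixes only the 500/15000/30000 boundaries, the
    3000-step pruning interval and the 100-step second-stage interval.
    """
    actions = []
    if 500 <= step < 15000:
        if step % densify_every == 0:
            actions.append("global_clone_or_split")
        if step % 3000 == 0:
            actions.append("opacity_prune")
    elif 15000 <= step < stage_three_start:
        # Global pruning is paused; only hair-matched Gaussians are refined.
        if step % 100 == 0:
            actions.append("local_clone_or_split_matched")
    elif step >= stage_three_start:
        actions.append("densify_hair_mask_at_worst_view")
    return actions
```

A training loop would call `density_actions(step)` once per iteration and dispatch on the returned labels.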

[0018] As a preferred embodiment, the hair VAE prior and guide-hair upsampling described in S5 are as follows:
S501, Hair Strand Parameterization: The scalp UV texture parameterization scheme proposed by Neural Strands is adopted to encode the hairstyle as a latent geometric texture. The hair decoder employs a modulated SIREN architecture, takes the normalized arc length parameter as input, and recovers the 3D node coordinate sequence through cumulative integration:

$$p_l = p_0 + \sum_{j=1}^{l} d_j, \qquad l = 0, \dots, L-1$$

[0019] In the formula, $p_l$ indicates the 3D coordinates of the $l$-th node of the hair strand, $p_0$ indicates the coordinates of the hair root node, $d_j$ indicates the directional offset vector of the $j$-th hair segment, $l$ indicates the node index, and $L$ represents the total number of nodes in each hair strand.
S502, Curvature-Enhanced VAE Training: Define the curvature of the $l$-th hair segment as the magnitude of the cross product of adjacent unit direction vectors:

$$\kappa_l = \left\lVert t_l \times t_{l+1} \right\rVert_2$$

[0020] In the formula, $\kappa_l$ indicates the curvature value of the $l$-th hair segment, $t_l$ indicates the unit direction vector of the $l$-th hair segment, $t_{l+1}$ indicates the unit direction vector of the $(l+1)$-th hair segment, $\times$ represents the cross product of vectors, and $\lVert\cdot\rVert_2$ represents the L2 norm of a vector. The complete VAE data loss combines a node coordinate reconstruction term with a weighted orientation loss and a weighted curvature loss.

[0021] In this loss, the VAE-reconstructed coordinates of each node are compared with the true node coordinates, the reconstructed unit direction vector of each segment with the true unit direction vector, and the reconstructed curvature of each segment with the true curvature; two weighting coefficients scale the orientation loss and the curvature loss, respectively.
S503, K-Nearest-Neighbor-Guided Hair Upsampling: For a target hair strand with scalp root UV coordinates $u$, a KD-tree is used to search the 1000 decoded guide hairs for its $K$ nearest neighbors, which are weighted by the inverse distance of their UV coordinates:

$$w_k = \frac{1/\delta_k}{\sum_{m=1}^{K} 1/\delta_m}, \qquad \delta_k = \lVert u - u_k \rVert_2$$

[0022] In the formula, $w_k$ indicates the interpolation weight of the $k$-th nearest-neighbor guide hair, $\delta_k$ indicates the distance between the target hair strand and the $k$-th nearest neighbor in the scalp UV coordinate space, $u$ indicates the UV coordinates of the scalp root of the target hair strand, $u_k$ indicates the UV coordinates of the scalp root of the $k$-th nearest-neighbor guide hair, $K$ represents the number of nearest neighbors, and $\lVert\cdot\rVert_2$ indicates Euclidean distance. The node coordinates of the guide hairs are then blended with these weights in 3D coordinate space to generate the node sequence of the target hair strand:

$$p_l = \sum_{k=1}^{K} w_k\, p_l^{(k)}, \qquad l = 0, \dots, L-1$$

[0023] In the formula, $p_l$ indicates the coordinates of the $l$-th node of the target hair strand, $w_k$ indicates the interpolation weight of the $k$-th nearest neighbor, $p_l^{(k)}$ indicates the coordinates of the $l$-th node of the $k$-th nearest-neighbor guide hair, $K$ represents the number of nearest neighbors, and $L$ represents the total number of nodes in each hair strand. This module expands 1,000 sparse guide hairs into 10,000 dense rendering hairs, increasing rendering coverage by approximately 10 times.
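The three S5 computations (cumulative integration of segment offsets, cross-product curvature, and inverse-distance blending of guide strands) can be sketched in plain Python. This is an illustrative sketch only: it uses a brute-force neighbor search in place of the KD-tree named in the text, normalizes the weights to sum to one, and adds a small `eps` to avoid division by zero. The function names and those three choices are assumptions.

```python
import math


def decode_strand(root, offsets):
    """Cumulative integration: each node is the root plus all prior offsets."""
    nodes = [tuple(root)]
    for d in offsets:
        prev = nodes[-1]
        nodes.append((prev[0] + d[0], prev[1] + d[1], prev[2] + d[2]))
    return nodes


def _unit(v):
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def curvature(d_a, d_b):
    """Magnitude of the cross product of two adjacent unit directions."""
    a, b = _unit(d_a), _unit(d_b)
    cross = (a[1] * b[2] - a[2] * b[1],
             a[2] * b[0] - a[0] * b[2],
             a[0] * b[1] - a[1] * b[0])
    return math.sqrt(sum(c * c for c in cross))


def upsample_strand(target_uv, guide_uvs, guide_strands, k=3, eps=1e-8):
    """Blend the k nearest guide strands with inverse-distance weights.

    ASSUMPTIONS: brute-force search instead of a KD-tree, normalized
    weights, and an eps guard; all strands share the same node count.
    """
    nearest = sorted((math.dist(target_uv, uv), i)
                     for i, uv in enumerate(guide_uvs))[:k]
    inv = [1.0 / (dist + eps) for dist, _ in nearest]
    total = sum(inv)
    weights = [w / total for w in inv]
    n_nodes = len(guide_strands[0])
    return [tuple(sum(w * guide_strands[i][l][c]
                      for w, (_, i) in zip(weights, nearest))
                  for c in range(3))
            for l in range(n_nodes)]
```

A target root placed midway between two guide roots receives weights of one half each, so its nodes are the midpoints of the corresponding guide nodes.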

[0024] Preferably, the two-stage geometry optimization in S6 specifically includes:
S601, Coarse Optimization Stage: This stage takes the latent hairstyle texture as the optimization variable. 1000 guide hair strands are generated by the VAE decoder and then expanded to 10000 densely rendered hair strands by the guide-hair upsampling module. The optimization runs for 20000 steps: the first 15000 steps use a BARF strategy to jointly optimize the camera parameters and the latent hair encoding, and the camera is frozen for the last 5000 steps. The learning rate is set to a preset value.
S602, Fine Optimization Stage: The 3D node coordinates of the hair strands obtained from coarse optimization are converted into explicit optimization variables that receive direct gradient updates, for a total of 10,000 fine optimization steps and 5,000 edge-specific refinement steps. During the edge-specific refinement steps, all parameters are frozen except the Gaussian ellipsoid shape parameters, namely the scaling factors and rotation quaternions, and the accuracy of the hairstyle outline is constrained only by edge loss.
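The step counts in S601 and S602 can be laid out on a single timeline. The sketch below assumes the phases run back to back on one global step counter, which the source does not state explicitly; the phase labels are hypothetical.

```python
def optimization_phase(step):
    """Map a global step index to the S6 phase it falls in.

    ASSUMPTION: coarse (20000), fine (10000) and edge refinement (5000)
    steps are consecutive on one shared counter.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    if step < 15000:
        return "coarse_joint_camera_and_latent"  # BARF-style joint optimization
    if step < 20000:
        return "coarse_camera_frozen"
    if step < 30000:
        return "fine_explicit_nodes"
    if step < 35000:
        return "edge_refinement"
    return "done"
```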

[0025] Preferably, the decoupled geometry-texture optimization strategy in S7 specifically includes:
S701, Geometry-Color Joint Optimization Stage: Using the illumination-enhanced image as the supervision target, the parameters of the Gaussian ellipsoids, including position, scaling, rotation, opacity, and the spherical harmonic color coefficients f, are optimized simultaneously, so that accurate hair geometry is obtained under a stable lighting reference.
S702, Color Fine-Tuning Stage: Freeze all geometric parameters and switch to the original, unenhanced image as the supervision target. Only the spherical harmonic color coefficients f are optimized, which eliminates the color deviation introduced by illumination enhancement and restores color and lighting effects consistent with the real scene.
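The decoupling in S701 and S702 amounts to changing two things between stages: which parameter groups receive gradients and which images supervise them. A minimal sketch, with hypothetical group names and with opacity counted among the frozen geometric parameters as the text implies:

```python
GEOMETRY_PARAMS = ("position", "scaling", "rotation", "opacity")
COLOR_PARAMS = ("sh_color",)  # spherical harmonic color coefficients f


def trainable_params(stage):
    """Parameter groups that are optimized in the given S7 stage."""
    if stage == "S701":
        return set(GEOMETRY_PARAMS + COLOR_PARAMS)
    if stage == "S702":
        return set(COLOR_PARAMS)  # geometry is frozen
    raise ValueError(f"unknown stage: {stage}")


def supervision_target(stage):
    """Which image set supervises the given stage."""
    return {"S701": "illumination_enhanced", "S702": "original"}[stage]
```

Because geometry never sees gradients from the original images, color errors introduced by enhancement cannot feed back into the strand shape.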

[0026] As a preferred approach, the loss functions used in each stage are as follows:
S901, Color Loss: A linear combination of the pixel-wise L1 error and the SSIM structural similarity loss.

[0027] Here, the color loss compares the color of the rendered image with the true color of the image using the L1 norm, and an SSIM loss weight scales the structural similarity loss function.
S902, Segmentation Loss: An L1 loss simultaneously constrains the rendered silhouette and the rendered hair label mask.

[0028] Here, the segmentation loss is evaluated per pixel from the rendered silhouette value, the rendered hair label mask, and the ground-truth hair segmentation mask at that pixel.
S903, Directional Loss: A confidence-weighted directional loss that is robust to ambiguity is designed.

[0029] Here, the directional loss is accumulated over the pixels within the hair segmentation mask region. At each pixel, a learnable confidence weights an angle difference function between the ground-truth direction angle of the hair strands and the direction angle of the rendered hair strands, and this function accounts for the $\pi$-periodic ambiguity of the direction.
S904, Perceptual Loss: Extract features from the intermediate layers of a pre-trained VGG network and calculate the L2 distance in the feature space:

$$\mathcal{L}_{\mathrm{perc}} = \frac{1}{N_{\phi}} \left\lVert \phi(\hat{I}) - \phi(I) \right\rVert_2^2$$

[0030] In the formula, $\mathcal{L}_{\mathrm{perc}}$ indicates the perceptual loss, $\phi(\cdot)$ represents the feature extraction function of the intermediate layers of the pre-trained VGG network, $\hat{I}$ indicates the rendered image, $I$ represents the real image, $N_{\phi}$ represents the total number of elements in the feature map, and $\lVert\cdot\rVert_2^2$ represents the squared distance in feature space.
S905, Edge Loss: Calculate the Canny edge maps of both the rendered image and the ground-truth image, and calculate the L1 loss between the edge maps:

$$\mathcal{L}_{\mathrm{edge}} = \frac{1}{N} \sum_{p} \left| E_{\hat{I}}(p) - E_{I}(p) \right|$$

[0031] In the formula, $\mathcal{L}_{\mathrm{edge}}$ indicates the edge loss, $N$ indicates the total number of pixels, $E_{\hat{I}}(p)$ indicates the Canny edge response of the rendered image at position $p$, and $E_{I}(p)$ represents the Canny edge response of the real image at the corresponding location. The total loss is the weighted sum of the above terms.

[0032] In the total loss, the color loss, segmentation loss, directional loss, perceptual loss, and edge loss are combined using the weighting coefficients of each loss term. Compared to ideal lighting scenarios, the SDS diffusion loss is removed in non-ideal lighting scenarios, because the diffusion prior is trained on USC-HairSalon synthetic data, whose distribution deviates from real-scene hair geometry. The SDS gradient would pull the reconstruction toward the synthetic distribution and reduce the accuracy of the real geometry.
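Three of the building blocks named above are simple enough to sketch without a rendering framework: a π-periodic angle difference for undirected orientations, a mean L1 distance between two flattened edge maps, and a weighted sum of named loss terms. The function names, the flat-list input format, and the default weight of 1 for unlisted terms are all assumptions; the source does not give the exact form of the angle difference function or the weight values.

```python
import math


def angle_diff_pi(a, b):
    """Smallest difference between two undirected angles (period pi)."""
    d = abs(a - b) % math.pi
    return min(d, math.pi - d)


def l1_edge_loss(edges_pred, edges_gt):
    """Mean absolute difference between two flattened edge maps."""
    total = sum(abs(p - g) for p, g in zip(edges_pred, edges_gt))
    return total / len(edges_pred)


def total_loss(terms, weights):
    """Weighted sum of loss terms; ASSUMPTION: missing weights count as 1."""
    return sum(weights.get(name, 1.0) * value for name, value in terms.items())
```

Under this angle difference, orientations of 0.1 and π − 0.1 radians are only 0.2 radians apart, which is the behavior an undirected strand direction requires.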

[0033] Preferably, the method further includes a hair post-processing step: performing root penetration detection on the reconstructed hair strands to remove abnormal hair strand nodes that penetrate the scalp mesh; applying smoothing filtering to the hair strand node coordinates to remove geometric noise; and exporting the hair strands in a standard format that can be used directly in mainstream graphics engines such as Unreal Engine, supporting editing, relighting, and physical dynamic simulation.

[0034] (III) The beneficial effects of the present invention include the following points: (1) This invention proposes a complete multi-view 3D hair reconstruction method that achieves a reconstruction speed improvement of more than ten times compared to existing technologies under ideal lighting conditions. On real-world scene datasets, it outperforms state-of-the-art methods in terms of PSNR (Peak Signal-to-Noise Ratio), SSIM (Structural Similarity), and LPIPS (Learned Perceptual Image Patch Similarity).

[0035] (2) This invention introduces a visual language model into the three-dimensional hair reconstruction density control process for the first time. It achieves automated perception and reasoning of lighting quality and hairstyle complexity through natural language prompts, and can adaptively handle different shooting conditions and hairstyle types without human intervention.

[0036] (3) The three-stage adaptive density control strategy proposed in this invention effectively solves the problem of sparsity of Gaussian ellipsoids under non-ideal lighting, and supplements geometric information in a targeted manner at the worst quality viewpoint, thus ensuring the quality of the supervision signal in the subsequent hair fitting stage.

[0037] (4) The decoupled geometry-texture optimization strategy designed in this invention effectively eliminates the color deviation introduced by the lighting enhancement, and achieves high-fidelity restoration of the real hair color while ensuring geometric accuracy. The output hair model can be directly used for editing, rendering and physical dynamic simulation of mainstream graphics engines such as Unreal Engine.

[0038] (5) The confidence-weighted, ambiguity-robust orientation loss and the K-nearest-neighbor 3D coordinate interpolation guide-hair upsampling module introduced in this invention improve the quality of the 3D orientation field and effectively overcome the insufficient photometric constraint coverage caused by sparse guide hairs.

Description of the Drawings

[0039] Figure 1 is a schematic diagram of the overall process of the 3D hair reconstruction method with adaptive Gaussian ellipsoids and robustness to complex lighting proposed in this invention.
Figure 2 shows the qualitative comparison between this invention and Gaussian Haircut on a self-built real-world dataset, including the reconstruction results for three typical hairstyles: straight hair, curly hair, and short hair.
Figure 3 shows the qualitative comparison between this invention and Gaussian Haircut on the Neural Haircut public dataset, including the reconstruction results for three typical hairstyles: straight hair, curly hair, and short hair.
Figure 4 shows the qualitative comparison between the present invention and existing technologies on a self-built real-world dataset, including the reconstruction results for three typical hairstyles: straight hair, curly hair, and short hair.
Figure 5 shows the qualitative comparison between the present invention and existing technologies on the Neural Haircut public dataset, including the reconstruction results for three typical hairstyles: straight hair, curly hair, and short hair.
Figure 6 is a comparison chart of the rendering quality of the present invention and Gaussian Haircut.
Figure 7 is a schematic diagram of the geometric reconstruction ablation results of the present invention, showing the influence on reconstruction quality of ablation conditions such as removing illumination enhancement and density control, and removing edge loss.
Figure 8 is a schematic diagram of the texture restoration ablation results of the present invention, illustrating the impact of texture decoupling on reconstruction quality.

Detailed Implementation

[0040] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0041] Example 1: Referring to Figure 1, this invention proposes a 3D hair reconstruction method with adaptive Gaussian ellipsoids that is robust to complex lighting, comprising the following steps:
S1. Preprocess the input multi-view video sequence, including frame extraction and quality screening, camera parameter initialization, hair segmentation mask generation, and hair orientation map calculation, to obtain a standardized training view set.
S2. Build an illumination-adaptive preprocessing workflow based on HVI low-light enhancement, histogram equalization, and Q-Align image quality scoring to enhance unevenly lit viewpoints and select 128 high-quality training frames.
S3. Introduce the visual language model Qianwen 2.5-VL to evaluate the illumination level and hairstyle complexity of the input images, automatically locate the worst viewpoint, and dynamically determine the upper limit of Gaussian ellipsoid density.
S4. Perform three-stage adaptive density control on unstructured 3D Gaussian primitives: the first stage globally densifies the unreconstructed space, the second stage locally densifies the matched Gaussian ellipsoids, and the third stage selectively densifies the hair region at the worst-quality viewpoint.
S5. Use the hair-prior variational autoencoder (VAE) model based on the diffusion model (EDM) to constrain the hair geometric distribution, and expand the sparse guide hairs into dense rendered hairs through K-nearest-neighbor weighted 3D coordinate interpolation.
S6. Perform two-stage geometry optimization: the coarse optimization stage jointly optimizes the hair geometry and color on the illumination-enhanced images, and the fine optimization stage refines the explicit hair node coordinates and constrains the hair outline through edge loss.
S7. Perform decoupled geometry-texture optimization: with the geometry parameters fixed, only the spherical harmonic color coefficients are optimized on the original images, eliminating the color deviation introduced by lighting enhancement, and the final 3D hair model is output.

[0042] The preprocessing described in S1 mainly includes the following steps:
S101. Extract video frames at a rate of 3 frames per second (FPS): an existing image quality assessment network assigns a no-reference quality score to the candidate frames within each 1/3-second interval, and the highest-scoring frame in each interval is selected.
S102. The COLMAP structure-from-motion (SfM) algorithm is used to extract features from the selected images and perform incremental reconstruction, estimating the sparse point cloud together with the intrinsic and extrinsic parameters of each camera. In the first stage, a learnable camera parameterization strategy is introduced, and residual refinement is performed with the SfM result as the initial value.
S103. A prompt-based image matting system is adopted. Hair, face, and human body masks are generated from the three prompts "hair", "face", and "human", respectively. Low-quality frames in which the intersection area of the hair and face masks exceeds 10% of the human body mask area are filtered out.
S104. Calculate the 2D hair orientation map of each training image using a Gabor filter bank. Let $r_n(p)$ be the response of the Gabor filter bank at pixel $p$ in orientation $\theta_n$; the orientation map value at that pixel is then:

[0043]
$$\theta(p) = \frac{1}{2}\arg\left(\sum_{n=1}^{N} r_n(p)\, e^{j 2\theta_n}\right)$$
where $\theta(p)$ denotes the predicted hair direction angle at pixel $p$, $\arg(\cdot)$ denotes the argument of a complex number, $n$ is the direction index ($n = 1, \dots, N$), $N$ is the total number of Gabor filter directions, $r_n(p)$ is the Gabor filter response at pixel $p$ in the $n$-th direction, and $\theta_n$ is the $n$-th sampling direction angle. The term $e^{j 2\theta_n}$ is a complex exponential; doubling the angle gives the direction estimate a period of $\pi$, matching the inherent ambiguity of hair direction.
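The orientation fusion of S104 can be illustrated for a single pixel with the sketch below. It assumes the doubled-angle form, in which each response is multiplied by a complex exponential of twice its sampling angle and half the argument of the sum is taken. The function name `fuse_orientation`, the choice of 8 directions, and the uniform spacing of the sampling angles are illustrative assumptions, not values stated in this document.

```python
import cmath
import math


def fuse_orientation(responses, angles):
    """Fuse per-direction filter responses at one pixel into an angle in [0, pi).

    responses[n] is the filter response in direction angles[n] (radians).
    Doubling the angle makes theta and theta + pi indistinguishable.
    """
    acc = sum(r * cmath.exp(2j * a) for r, a in zip(responses, angles))
    return (0.5 * cmath.phase(acc)) % math.pi


# Assumed setup: N = 8 directions spaced uniformly over [0, pi).
N = 8
angles = [n * math.pi / N for n in range(N)]
responses = [0.0] * N
responses[2] = 1.0  # only the pi/4 direction responds
theta = fuse_orientation(responses, angles)
```

With a single active direction the fused angle equals that direction's sampling angle, and shifting any sampling angle by pi leaves the result unchanged.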

[0044] The illumination-adaptive preprocessing procedure described in S2 is as follows:
S201, HVI Low-Light Enhancement: Convert the input RGB image to the HVI (Horizontal/Vertical-Intensity) color space, separating the luminance channel $I$ from the color channels, and adaptively enhance the luminance channel through a learnable gain network $\mathcal{G}$:

[0045]
$$\hat{I} = \mathcal{G}(I;\, \phi)$$
where $\hat{I}$ denotes the luminance channel after adaptive enhancement, $\mathcal{G}$ denotes the learnable gain network, $I$ denotes the luminance channel of the original image, and $\phi$ denotes the learnable parameters of the network. The enhancement result is then converted back to RGB space, preserving color information while improving detail in dark areas.
S202, Histogram Equalization: Contrast-limited adaptive histogram equalization (CLAHE) is applied to the HVI-enhanced image to further stretch the image contrast and mitigate the loss of detail caused by local overexposure or underexposure.
S203, Q-Align Quality Score: The enhanced image is fed into the Q-Align image quality assessment model, and an overall quality score is calculated for each frame:

[0046]
$$s_i = \mathcal{Q}(x_i)$$
where $s_i$ denotes the overall quality score of the $i$-th frame image, $\mathcal{Q}$ denotes the Q-Align image quality assessment model, and $x_i$ denotes the $i$-th frame image. After sorting by score from high to low, 128 high-quality views are selected using a uniform-interval strategy for the subsequent reconstruction process.
The illumination evaluation and complexity evaluation based on the vision-language model described in S3 are as follows:
S301, Illumination Level Assessment: Design a prompt containing an 18-level illumination quality scale, and input the multi-view images frame by frame into Qwen2.5-VL to obtain an illumination quality score for each frame. The worst-quality viewpoint is located as the frame with the lowest score:

[0047]
$$v^{*} = \arg\min_{i}\, q_i$$
where $v^{*}$ denotes the worst viewpoint, $\arg\min$ returns the index that minimizes the objective, and $q_i$ denotes the illumination quality score assigned by the vision-language model to the $i$-th frame image.
S302, Hairstyle Complexity Assessment: Design a prompt containing an 8-level hairstyle complexity scale, feed the multi-view images into Qwen2.5-VL for overall hairstyle analysis, and automatically map the complexity rating to the corresponding upper limit $N_{\max}$ of Gaussian ellipsoid density. The higher the complexity, the larger $N_{\max}$, so as to accommodate the different levels of geometric detail required by various hairstyles.
The three-stage adaptive density control strategy described in S4 is as follows:
S401, First Stage (iteration steps 500 to 15000): Adopt the standard 3DGS adaptive density control strategy: clone or split Gaussian ellipsoids whose view-space gradient exceeds the threshold, prioritizing the filling of geometric holes in regions of the scene that have not yet been reconstructed, and perform opacity pruning every 3000 steps to remove redundant primitives.
S402, Second Stage (iteration steps 15000 to 30000): Pause global pruning, and every 100 steps perform local cloning and splitting on the Gaussian ellipsoids that have been matched to hair strands, improving geometric detail by refining the Gaussian distribution of the matched regions while avoiding the noise introduced by excessive densification.
S403, Third Stage: At the worst viewpoint $v^{*}$, targeted densification is performed on the region within the hair segmentation mask, increasing the number of Gaussian ellipsoids in the hair region at that viewpoint up to the density upper limit $N_{\max}$. In addition, the orientations of the newly added primitives are constrained by the direction loss.
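The control logic of S3 and S4 can be sketched as follows. The worst-viewpoint selection is a plain argmin and the stage boundaries are the iteration counts given above; everything else is an assumption made for illustration. In particular, the `DENSITY_CAP` values are hypothetical (the text only states that a higher complexity level maps to a larger limit), the intervals are treated as half-open, stage three is assumed to begin at step 30000, and the action names are placeholders.

```python
def worst_view(illumination_scores):
    """S301: index of the frame with the lowest illumination score (argmin)."""
    return min(range(len(illumination_scores)),
               key=illumination_scores.__getitem__)


# Hypothetical caps for the 8 complexity levels; only the monotonic
# "higher level -> larger cap" relationship comes from the text.
DENSITY_CAP = {level: 100_000 * level for level in range(1, 9)}


def density_stage(step):
    """S401-S403: density-control stage active at a given iteration."""
    if step < 500:
        return 0   # before density control starts
    if step < 15_000:
        return 1   # global densification
    if step < 30_000:
        return 2   # local densification of matched Gaussians
    return 3       # assumption: worst-view densification follows stage two


def scheduled_actions(step):
    """Periodic operations whose intervals are stated in the text."""
    stage = density_stage(step)
    actions = []
    if stage == 1 and step % 3000 == 0:
        actions.append("opacity_prune")
    if stage == 2 and step % 100 == 0:
        actions.append("local_densify_matched")
    return actions


scores = [6.0, 2.5, 9.0]
v_star = worst_view(scores)  # frame 1 has the lowest score
```

The stage-three frequency is not given in the text, so the sketch reports only the stage number there and schedules no periodic action for it.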

[0048] The hair VAE prior and guide-strand upsampling described in S5 are as follows:
S501, Strand Parameterization: The scalp UV texture parameterization scheme proposed by Neural Strands is adopted to encode the hairstyle as a latent geometry texture. The strand decoder $\mathcal{D}$ employs a modulated SIREN architecture, takes the normalized arc-length parameter $t$ as input, and recovers the 3D node coordinate sequence through cumulative integration:

[0049]
$$\mathbf{p}_k = \mathbf{p}_0 + \sum_{l=1}^{k} \mathbf{d}_l, \qquad k = 1, \dots, L-1$$
where $\mathbf{p}_k$ denotes the three-dimensional coordinates of the $k$-th node of the hair strand, $\mathbf{p}_0$ denotes the coordinates of the hair root node, $\mathbf{d}_l$ denotes the directional offset vector of the $l$-th strand segment, $k$ is the node index, and $L$ is the total number of nodes in each strand.
S502, Curvature-Enhanced VAE Training: Define the curvature of the $l$-th strand segment as the magnitude of the cross product of adjacent direction vectors:

[0050]
$$\kappa_l = \lVert \mathbf{b}_l \times \mathbf{b}_{l+1} \rVert_2$$
where $\kappa_l$ denotes the curvature value of the $l$-th strand segment, $\mathbf{b}_l$ denotes the unit direction vector of the $l$-th segment, $\mathbf{b}_{l+1}$ denotes the unit direction vector of the $(l+1)$-th segment, $\times$ denotes the vector cross product, and $\lVert \cdot \rVert_2$ denotes the L2 norm of a vector; the complete VAE data-term loss is:

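[0051]
$$\mathcal{L}_{\text{data}} = \sum_{k} \lVert \hat{\mathbf{p}}_k - \mathbf{p}_k \rVert_2 + \lambda_{d} \sum_{l} \left(1 - \hat{\mathbf{b}}_l \cdot \mathbf{b}_l\right) + \lambda_{\kappa} \sum_{l} \lvert \hat{\kappa}_l - \kappa_l \rvert$$
In the formula, $\mathcal{L}_{\text{data}}$ denotes the VAE data-term loss, $\hat{\mathbf{p}}_k$ denotes the $k$-th node coordinates reconstructed by the VAE, $\mathbf{p}_k$ denotes the true $k$-th node coordinates, $\hat{\mathbf{b}}_l$ denotes the reconstructed unit direction vector of the $l$-th segment, $\mathbf{b}_l$ denotes the true unit direction vector of the $l$-th segment, $\hat{\kappa}_l$ denotes the reconstructed curvature of the $l$-th segment, $\kappa_l$ denotes the true curvature of the $l$-th segment, and $\lambda_{d}$ and $\lambda_{\kappa}$ denote the weighting coefficients of the direction loss and the curvature loss, respectively.
S503, K-Nearest-Neighbor Guide-Strand Upsampling: For a target strand $s$, based on the UV coordinates $\mathbf{u}_s$ of its scalp root, its $K$ nearest neighbors are searched among the 1000 decoded guide strands using a KD-tree and weighted by the inverse distance in UV coordinates: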

[0052]
$$w_j = \frac{1/\delta_j}{\sum_{m=1}^{K} 1/\delta_m}, \qquad \delta_j = \lVert \mathbf{u}_s - \mathbf{u}_j \rVert_2$$
where $w_j$ denotes the interpolation weight of the $j$-th nearest-neighbor guide strand, $\delta_j$ denotes the distance between the target strand and the $j$-th nearest neighbor in scalp UV coordinate space, $\mathbf{u}_s$ denotes the scalp-root UV coordinates of the target strand $s$, $\mathbf{u}_j$ denotes the scalp-root UV coordinates of the $j$-th nearest-neighbor guide strand, $K$ denotes the number of nearest neighbors, and $\lVert \cdot \rVert_2$ denotes the Euclidean distance. The node coordinates of the guide strands are then blended with these weights in three-dimensional coordinate space to generate the node sequence of the target strand:

[0053]
$$\mathbf{p}^{s}_{k} = \sum_{j=1}^{K} w_j\, \mathbf{p}^{(j)}_{k}, \qquad k = 0, \dots, L-1$$
where $\mathbf{p}^{s}_{k}$ denotes the $k$-th node coordinates of the target strand $s$, $w_j$ denotes the interpolation weight of the $j$-th nearest neighbor, $\mathbf{p}^{(j)}_{k}$ denotes the $k$-th node coordinates of the $j$-th nearest-neighbor guide strand, $K$ denotes the number of nearest neighbors, and $L$ denotes the total number of nodes in each strand. This module expands 1,000 sparse guide strands into 10,000 dense rendering strands, increasing rendering coverage by approximately 10 times.
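The three geometric operations of S5 (cumulative strand decoding, segment curvature, and inverse-distance upsampling) can be sketched in plain Python as below. This is an illustration under stated assumptions rather than the patented implementation: a brute-force neighbor search stands in for the KD-tree, the weights are normalized to sum to one, `eps` is an added guard against division by zero, all segment offsets are assumed nonzero, and every function name is hypothetical.

```python
import math


def decode_strand(root, offsets):
    """S501: nodes are the running sum of segment offsets from the root."""
    nodes = [tuple(root)]
    for d in offsets:
        last = nodes[-1]
        nodes.append((last[0] + d[0], last[1] + d[1], last[2] + d[2]))
    return nodes


def _unit(v):
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def curvatures(offsets):
    """S502: norm of the cross product of adjacent unit direction vectors."""
    dirs = [_unit(d) for d in offsets]
    return [math.sqrt(sum(c * c for c in _cross(a, b)))
            for a, b in zip(dirs, dirs[1:])]


def upsample_strand(target_uv, guide_uvs, guide_strands, k=3, eps=1e-8):
    """S503: blend the k nearest guide strands with inverse-distance weights."""
    dists = [math.dist(target_uv, uv) for uv in guide_uvs]
    nearest = sorted(range(len(guide_uvs)), key=dists.__getitem__)[:k]
    inverse = [1.0 / (dists[j] + eps) for j in nearest]
    total = sum(inverse)
    weights = [v / total for v in inverse]
    strand = []
    for node in range(len(guide_strands[0])):
        strand.append(tuple(
            sum(w * guide_strands[j][node][axis]
                for w, j in zip(weights, nearest))
            for axis in range(3)))
    return strand, weights


guide_a = decode_strand((0.0, 0.0, 0.0), [(0.0, 0.0, 1.0)])
guide_b = decode_strand((1.0, 0.0, 0.0), [(0.0, 0.0, 1.0)])
strand, weights = upsample_strand((0.25, 0.0), [(0.0, 0.0), (1.0, 0.0)],
                                  [guide_a, guide_b], k=2)
```

In the example the target root lies three times closer to the first guide than to the second, so the first guide receives roughly three quarters of the weight and the interpolated strand sits a quarter of the way between the two guides.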

[0054] The two-stage geometry optimization described in S6 is as follows:
S601, Coarse Optimization Stage: This stage takes the latent hairstyle texture $Z$ as the optimization variable; 1000 guide strands are generated by the VAE decoder and then expanded to 10000 dense rendering strands by the guide-strand upsampling module. The optimization runs for 20000 steps: the first 15000 steps use a BARF strategy to jointly optimize the camera parameters and the latent hair encoding, and the camera is frozen for the last 5000 steps. The learning rate is set to [...].
S602, Fine Optimization Stage: The 3D node coordinates of the hair strands obtained from coarse optimization are converted into explicit optimization variables that receive direct gradient updates; a total of 10,000 fine optimization steps and 5,000 edge-specific refinement steps are performed. During the edge-specific refinement stage, all parameters other than the Gaussian ellipsoid shape parameters (scaling factor $\mathbf{s}$ and rotation quaternion $\mathbf{q}$) are frozen, and the optimization is constrained only by the edge loss to ensure the accuracy of the hairstyle outline.
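The step counts in S601 and S602 amount to a schedule of which parameter groups are updated at each iteration. The sketch below encodes that schedule under several assumptions: the stages run back to back on a single global step counter, the edge-specific steps come after the fine steps, and the group names are hypothetical labels. Only the groups the text names explicitly are listed, so other parameters trained in the same stage (for example color) are not shown.

```python
COARSE_STEPS = 20_000        # S601 total
CAMERA_JOINT_STEPS = 15_000  # S601: camera optimized jointly (BARF strategy)
FINE_STEPS = 10_000          # S602: explicit node refinement
EDGE_STEPS = 5_000           # S602: edge-specific refinement


def trainable_groups(step):
    """Return (stage name, parameter groups updated at this global step)."""
    if step < 0:
        raise ValueError("step must be non-negative")
    if step < CAMERA_JOINT_STEPS:
        return "coarse", ("latent_texture", "camera")
    if step < COARSE_STEPS:
        return "coarse", ("latent_texture",)       # camera frozen
    if step < COARSE_STEPS + FINE_STEPS:
        return "fine", ("strand_nodes",)
    if step < COARSE_STEPS + FINE_STEPS + EDGE_STEPS:
        return "edge", ("scaling", "rotation")     # edge loss only
    raise ValueError("step is past the end of the schedule")
```

Keeping the schedule in one function makes the freezing points (step 15000 for the camera, step 30000 for everything except scaling and rotation) easy to audit against the text.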

[0055] The decoupled geometry-texture optimization strategy described in S7 is as follows:
S701, Geometry-Color Joint Optimization Stage: Using the illumination-enhanced images as the supervision target, the geometric parameters of the Gaussian ellipsoids (position $\boldsymbol{\mu}$, scaling $\mathbf{s}$, rotation $\mathbf{q}$, opacity $\alpha$) and the spherical harmonic color coefficients $f$ are optimized simultaneously, so that accurate hair geometry is obtained under a stable lighting reference.
S702, Color Fine-Tuning Stage: Freeze all geometric parameters, switch to the original unenhanced images as the supervision target, and optimize only the spherical harmonic color coefficients $f$, eliminating the color deviation introduced during illumination enhancement and restoring color and lighting effects consistent with the real scene.
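The decoupling in S701 and S702 comes down to changing two things between stages: the supervision target and the set of parameters that receive updates. A minimal sketch, assuming plain gradient descent on scalar stand-ins for each parameter group (the names `STAGES`, `sgd_step`, and the group labels are hypothetical):

```python
GEOMETRY = ("position", "scaling", "rotation", "opacity")
COLOR = ("sh_coeffs",)

STAGES = {
    # S701: enhanced images supervise geometry and color together.
    "joint": {"target": "enhanced", "trainable": GEOMETRY + COLOR},
    # S702: original images supervise color only; geometry is frozen.
    "color_finetune": {"target": "original", "trainable": COLOR},
}


def sgd_step(params, grads, stage, lr=0.1):
    """Apply one gradient step, leaving frozen parameter groups untouched."""
    trainable = STAGES[stage]["trainable"]
    return {
        name: value - lr * grads[name] if name in trainable else value
        for name, value in params.items()
    }


params = {name: 1.0 for name in GEOMETRY + COLOR}
grads = {name: 1.0 for name in GEOMETRY + COLOR}
after_joint = sgd_step(params, grads, "joint")
after_color = sgd_step(params, grads, "color_finetune")
```

In the second call only `sh_coeffs` moves, which is the property that lets the color be re-fitted to the unenhanced images without disturbing the geometry recovered in the first stage.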

[0056] The loss functions used in each stage are as follows: S901, Color Loss: a linear combination of the pixel-wise L1 error and the SSIM structural similarity loss:

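[0057]
$$\mathcal{L}_{\text{color}} = (1 - \lambda_{\text{SSIM}})\, \lVert \hat{C} - C \rVert_1 + \lambda_{\text{SSIM}}\, \mathcal{L}_{\text{SSIM}}(\hat{C}, C)$$
where $\mathcal{L}_{\text{color}}$ denotes the color loss, $\hat{C}$ denotes the color of the rendered image, $C$ denotes the true color of the image, $\lVert \cdot \rVert_1$ denotes the L1 norm, $\lambda_{\text{SSIM}}$ denotes the SSIM loss weight, and $\mathcal{L}_{\text{SSIM}}$ denotes the structural similarity loss function.
S902, Segmentation Loss: An L1 loss is used to constrain the rendered silhouette and the rendered hair label mask simultaneously: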

[0058]
$$\mathcal{L}_{\text{seg}} = \sum_{p} \left( \lvert \hat{S}(p) - M(p) \rvert + \lvert \hat{H}(p) - M(p) \rvert \right)$$
where $\mathcal{L}_{\text{seg}}$ denotes the segmentation loss, $\hat{S}(p)$ denotes the rendered silhouette value at pixel $p$, $\hat{H}(p)$ denotes the hair label mask at pixel $p$, and $M(p)$ denotes the actual hair segmentation mask at pixel $p$.
S903, Direction Loss: An ambiguity-robust, confidence-weighted direction loss is designed:

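[0059]
$$\mathcal{L}_{\text{dir}} = \sum_{p \in \Omega} c_p\, \Delta\!\left(\theta_p, \hat{\theta}_p\right), \qquad \Delta(\theta, \hat{\theta}) = \min\!\left(\lvert \theta - \hat{\theta} \rvert,\; \pi - \lvert \theta - \hat{\theta} \rvert\right)$$
where $\mathcal{L}_{\text{dir}}$ denotes the direction loss, $p$ denotes a pixel within the hair area, $\Omega$ denotes the hair segmentation mask area, $c_p$ denotes the learnable confidence at pixel $p$, $\Delta(\cdot)$ denotes the angle difference function, $\theta_p$ denotes the actual direction angle of the hair strands, $\hat{\theta}_p$ denotes the direction angle of the rendered hair strands, and $\pi$ reflects the periodic ambiguity of direction.
S904, Perceptual Loss: Features are extracted from the intermediate layers of a pre-trained VGG network, and the L2 distance is calculated in feature space: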

[0060]
$$\mathcal{L}_{\text{perc}} = \frac{1}{M_{\phi}}\, \lVert \phi(\hat{x}) - \phi(x) \rVert_2^2$$
where $\mathcal{L}_{\text{perc}}$ denotes the perceptual loss, $\phi(\cdot)$ denotes the feature extraction function of the intermediate layer of the pre-trained VGG network, $\hat{x}$ denotes the rendered image, $x$ denotes the real image, $M_{\phi}$ denotes the total number of elements in the feature map, and $\lVert \cdot \rVert_2^2$ denotes the squared distance in feature space.
S905, Edge Loss: Canny edge maps are computed for both the rendered image and the ground-truth image, and the L1 loss between the edge maps is calculated:

[0061]
$$\mathcal{L}_{\text{edge}} = \frac{1}{N_p} \sum_{p} \lvert \hat{E}(p) - E(p) \rvert$$
In the formula, $\mathcal{L}_{\text{edge}}$ denotes the edge loss, $N_p$ denotes the total number of pixels, $\hat{E}(p)$ denotes the Canny edge response of the rendered image at position $p$, and $E(p)$ denotes the Canny edge response of the real image at the corresponding position; the total loss is the weighted sum of the above terms:

[0062]
$$\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{color}} + \lambda_{\text{seg}} \mathcal{L}_{\text{seg}} + \lambda_{\text{dir}} \mathcal{L}_{\text{dir}} + \lambda_{\text{perc}} \mathcal{L}_{\text{perc}} + \lambda_{\text{edge}} \mathcal{L}_{\text{edge}}$$
In the formula, $\mathcal{L}_{\text{total}}$ denotes the total loss function, $\mathcal{L}_{\text{color}}$ the color loss, $\mathcal{L}_{\text{seg}}$ the segmentation loss, $\mathcal{L}_{\text{dir}}$ the direction loss, $\mathcal{L}_{\text{perc}}$ the perceptual loss, and $\mathcal{L}_{\text{edge}}$ the edge loss; $\lambda_{\text{seg}}$, $\lambda_{\text{dir}}$, $\lambda_{\text{perc}}$, and $\lambda_{\text{edge}}$ denote the weighting coefficients of the respective loss terms.
Compared with ideal lighting scenes, the SDS diffusion loss is removed in non-ideal lighting scenes. The diffusion prior is trained on USC-HairSalon synthetic data, whose distribution deviates from real-world hair geometry, so the SDS gradient would pull the reconstruction toward the synthetic distribution and reduce the accuracy of the real geometry.
The method further includes a hair post-processing step: performing root penetration detection on the reconstructed hair strands to remove abnormal strand nodes that penetrate the scalp mesh; applying smoothing filtering to the strand node coordinates to remove geometric noise; and exporting the hair strands in a standard format that can be used directly in mainstream graphics engines such as Unreal Engine, supporting editing, relighting, and physical dynamics simulation.
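Three of the loss terms are simple enough to illustrate on flat lists of per-pixel values. The sketch below assumes that the angle difference is folded into $[0, \pi/2]$ by taking the smaller of $d$ and $\pi - d$, that the edge maps are already computed (the Canny step itself is not reproduced), and that the color term enters the total with unit weight. All function names and the numeric weights in the usage example are hypothetical.

```python
import math


def angle_diff(theta, theta_hat):
    """Difference between two directions, treating theta and theta + pi as equal."""
    d = abs(theta - theta_hat) % math.pi
    return min(d, math.pi - d)


def direction_loss(theta, theta_hat, confidence, mask):
    """Confidence-weighted direction loss over pixels inside the hair mask."""
    return sum(c * angle_diff(t, th)
               for t, th, c, m in zip(theta, theta_hat, confidence, mask) if m)


def edge_loss(edges_rendered, edges_real):
    """Mean absolute difference between two precomputed edge maps."""
    return sum(abs(a - b)
               for a, b in zip(edges_rendered, edges_real)) / len(edges_rendered)


def total_loss(terms, weights):
    """Color term plus the weighted remaining terms."""
    return terms["color"] + sum(weights[name] * terms[name] for name in weights)


l_dir = direction_loss([0.0, 1.0], [math.pi, 1.5], [1.0, 0.5], [True, True])
l_edge = edge_loss([0, 1, 1, 0], [0, 1, 0, 0])
l_total = total_loss(
    {"color": 1.0, "seg": 2.0, "dir": l_dir, "perc": 4.0, "edge": l_edge},
    {"seg": 0.5, "dir": 0.1, "perc": 0.01, "edge": 1.0},  # illustrative weights
)
```

The first pixel in the example contributes nothing to the direction loss because its two angles differ by exactly $\pi$, which is the ambiguity the loss is meant to tolerate.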

[0063] Example 2: Referring to Figures 2 to 6, and based on the method described in Example 1, the specific implementation process is as follows: (I) Dataset and Experiment Setup: This invention was validated on a self-built real-scene dataset and the publicly available Neural Haircut dataset. The self-built dataset contains three hairstyles (straight, curly, and short hair); multi-view videos were captured with a smartphone under various indoor and outdoor lighting conditions, including non-ideal lighting scenarios such as low light, overexposure, and backlighting. The Neural Haircut public dataset contains multi-view image sequences of six hairstyles and provides a standard evaluation protocol. Evaluation metrics include PSNR (Peak Signal-to-Noise Ratio), SSIM (Structural Similarity), and LPIPS (Learned Perceptual Image Patch Similarity). Because neither the real-scene dataset nor the Neural Haircut public dataset provides geometric ground truth, only rendering quality was evaluated quantitatively.
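For reference, PSNR follows the standard definition $10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$ and is reported in decibels. The sketch below computes it for two images given as flat lists of intensities; the assumption that values lie in $[0, 1]$ (so `max_value=1.0`) and the function name are illustrative choices, not details taken from the evaluation protocol.

```python
import math


def psnr(rendered, reference, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    if len(rendered) != len(reference):
        raise ValueError("images must have the same number of values")
    mse = sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


value = psnr([0.5, 0.5], [0.6, 0.4])  # MSE = 0.01, about 20 dB
```

Higher PSNR and SSIM indicate better rendering quality, while lower LPIPS is better, which is why the results are reported as increases in the first two and a decrease in the third.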

[0064] (II) Qualitative Results: In straight-hair scenes, this invention accurately reconstructs anisotropic highlights and self-occlusion shadows; in curly-hair scenes, by increasing the curvature loss weight and raising the VLM-guided density upper limit, it captures spiral curl geometry and significantly outperforms comparison methods dominated by fixed priors; in short-hair scenes, the edge loss effectively constrains hair-tip contours, yielding clear hairline boundaries. The comparison results are shown in Figures 2 to 6.

[0065] Table 1 is a statistical table comparing the quantitative results of existing technologies and the proposed solution;

[0066] (III) Quantitative Results: Table 1 lists the quantitative comparison of rendering quality between this invention and Gaussian Haircut. Compared to Gaussian Haircut, this invention improves PSNR by 1.02 dB, improves SSIM by 0.0080, and reduces LPIPS by 0.0058 on the real-scene dataset; on the Neural Haircut public dataset, it improves PSNR by 0.33 dB, improves SSIM by 0.0043, and reduces LPIPS by 0.0138. Compared to existing technologies (a single-view method based on Transformer priors and Gaussian splatting, and a single-view method based on a diffusion Transformer), this invention has significant advantages in the effective use of multi-view inputs and in robustness to non-ideal lighting.

[0067] (IV) Ablation Experiments: The contribution of each key module was verified through three sets of ablation experiments. After removing illumination enhancement and VLM density control, obvious geometric holes appeared in backlit and low-light scenes, and rendering quality decreased significantly; after removing the edge loss, hair outlines became jagged and blurred; after removing the geometry-texture decoupling strategy, the reconstructed hair color shifted noticeably and no longer matched the real hair color. The results of the ablation experiments are shown in Figures 7 to 8.

[0068] The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitutions or modifications made by those skilled in the art within the scope of the technology disclosed in the present invention, based on the technical solution and its improved concept, should be covered within the scope of protection of the present invention.

Claims

1. A method for three-dimensional hair reconstruction robust to adaptive Gaussian ellipsoids and complex lighting, characterized in that it includes the following steps:
S1. Preprocess the input multi-view video sequence, including frame extraction and quality screening, camera parameter initialization, hair segmentation mask generation, and hair orientation map calculation, to obtain a standardized training view set;
S2. Construct an illumination-adaptive preprocessing workflow based on low-light enhancement, histogram equalization, and image quality scoring to enhance views with non-uniform illumination and select 64-256 high-quality training frames;
S3. Introduce a vision-language model to evaluate the illumination level and hairstyle complexity of the input images, automatically locate the worst viewpoint, and dynamically determine the upper limit of Gaussian ellipsoid density;
S4. Perform three-stage adaptive density control on unstructured 3D Gaussian primitives: the first stage globally densifies space that has not yet been reconstructed, the second stage locally densifies the Gaussian ellipsoids that have been matched to hair strands, and the third stage selectively densifies the hair region at the worst-quality viewpoint;
S5. Use the hair-prior variational autoencoder model based on the diffusion model to constrain the geometric distribution of hair strands, and expand sparse guide strands into dense rendering strands through K-nearest-neighbor weighted 3D coordinate interpolation;
S6. Perform two-stage geometry optimization: the coarse optimization stage jointly optimizes hair geometry and color on the illumination-enhanced images, and the fine optimization stage refines the explicit strand node coordinates and constrains the hair outline through an edge loss;
S7. Perform decoupled geometry-texture optimization: with the geometry parameters fixed, only the spherical harmonic color coefficients are optimized on the original images, eliminating the color deviation introduced by the illumination enhancement, and the final 3D hair model is output.

2. The method for three-dimensional hair reconstruction robust to adaptive Gaussian ellipsoids and complex lighting as described in claim 1, characterized in that the preprocessing in S1 mainly includes the following steps:
S101. Extract video frames at a rate of 3 frames per second: an image quality assessment network assigns a no-reference quality score to the candidate frames within each 1/3-second interval, and the highest-scoring frame in each interval is selected.
S102. The COLMAP structure-from-motion (SfM) algorithm is used to extract features from the selected images and perform incremental reconstruction, estimating the sparse point cloud together with the intrinsic and extrinsic parameters of each camera. In the first stage, a learnable camera parameterization strategy is introduced, and residual refinement is performed with the SfM result as the initial value.
S103. A prompt-based image matting system is adopted. The prompts are of three types, "hair", "face", and "human", from which the corresponding hair, face, and human body masks are generated; low-quality frames in which the intersection area of the hair and face masks exceeds 10% of the human body mask area are filtered out.
S104. Calculate the 2D hair orientation map of each training image using a Gabor filter bank. Let $r_n(p)$ be the response of the Gabor filter bank at pixel $p$ in orientation $\theta_n$; the orientation map value at that pixel is then:
$$\theta(p) = \frac{1}{2}\arg\left(\sum_{n=1}^{N} r_n(p)\, e^{j 2\theta_n}\right)$$
In the formula, $\theta(p)$ denotes the predicted hair direction angle at pixel $p$, $\arg(\cdot)$ denotes the argument of a complex number, $n$ is the direction index ($n = 1, \dots, N$), $N$ is the total number of Gabor filter directions, $r_n(p)$ is the Gabor filter response at pixel $p$ in the $n$-th direction, and $\theta_n$ is the $n$-th sampling direction angle. The term $e^{j 2\theta_n}$ is a complex exponential; doubling the angle gives the direction estimate a period of $\pi$, matching the inherent ambiguity of hair direction.

3. The method for three-dimensional hair reconstruction robust to adaptive Gaussian ellipsoids and complex lighting as described in claim 1, characterized in that the illumination-adaptive preprocessing procedure in S2 includes:
S201, HVI Low-Light Enhancement: Convert the input RGB image to the HVI (Horizontal/Vertical-Intensity) color space, separating the luminance channel $I$ from the color channels, and adaptively enhance the luminance channel through a learnable gain network $\mathcal{G}$:
$$\hat{I} = \mathcal{G}(I;\, \phi)$$
In the formula, $\hat{I}$ denotes the luminance channel after adaptive enhancement, $\mathcal{G}$ denotes the learnable gain network, $I$ denotes the luminance channel of the original image, and $\phi$ denotes the learnable parameters of the network. The enhancement result is then converted back to RGB space, preserving color information while improving detail in dark areas.
S202, Histogram Equalization: Contrast-limited adaptive histogram equalization (CLAHE) is applied to the HVI-enhanced image to further stretch the image contrast and mitigate the loss of detail caused by local overexposure or underexposure.
S203, Q-Align Quality Score: The enhanced image is fed into the Q-Align image quality assessment model, and an overall quality score is calculated for each frame:
$$s_i = \mathcal{Q}(x_i)$$
In the formula, $s_i$ denotes the overall quality score of the $i$-th frame image, $\mathcal{Q}$ denotes the Q-Align image quality assessment model, and $x_i$ denotes the $i$-th frame image. After sorting by score from high to low, 128 high-quality views are selected using a uniform-interval strategy for the subsequent reconstruction process.

4. The method for three-dimensional hair reconstruction robust to adaptive Gaussian ellipsoids and complex lighting as described in claim 1, characterized in that the illumination evaluation and complexity evaluation based on the vision-language model in S3 include:
S301, Illumination Level Assessment: Design a prompt containing an 18-level illumination quality scale, and input the multi-view images frame by frame into Qwen2.5-VL to obtain an illumination quality score for each frame. The worst-quality viewpoint is located as the frame with the lowest score:
$$v^{*} = \arg\min_{i}\, q_i$$
In the formula, $v^{*}$ denotes the worst viewpoint, $\arg\min$ returns the index that minimizes the objective, and $q_i$ denotes the illumination quality score assigned by the vision-language model to the $i$-th frame image.
S302, Hairstyle Complexity Assessment: Design a prompt containing an 8-level hairstyle complexity scale, feed the multi-view images into Qwen2.5-VL for overall hairstyle analysis, and automatically map the complexity rating to the corresponding upper limit $N_{\max}$ of Gaussian ellipsoid density. The higher the complexity, the larger $N_{\max}$, so as to accommodate the different levels of geometric detail required by various hairstyles.

5. The method for three-dimensional hair reconstruction robust to adaptive Gaussian ellipsoids and complex lighting as described in claim 1, characterized in that the three-stage adaptive density control strategy in S4 includes:
S401, First Stage: Adopt the standard 3DGS adaptive density control strategy: clone or split Gaussian ellipsoids whose view-space gradient exceeds the threshold, prioritizing the filling of geometric holes in regions of the scene that have not yet been reconstructed, and perform opacity pruning every 3000 steps to remove redundant primitives.
S402, Second Stage: Pause global pruning, and every 100 steps perform local cloning and splitting on the Gaussian ellipsoids that have been matched to hair strands, improving geometric detail by refining the Gaussian distribution of the matched regions while avoiding the noise introduced by excessive densification.
S403, Third Stage: At the worst viewpoint $v^{*}$, targeted densification is performed on the region within the hair segmentation mask, increasing the number of Gaussian ellipsoids in the hair region at that viewpoint up to the density upper limit. In addition, the orientations of the newly added primitives are constrained by the direction loss.

6. The method for three-dimensional hair reconstruction robust to adaptive Gaussian ellipsoids and complex lighting as described in claim 1, characterized in that the hair VAE prior and guide-strand upsampling described in S5 are as follows:
S501, Strand Parameterization: The scalp UV texture parameterization scheme proposed by Neural Strands is adopted to encode the hairstyle as a latent geometry texture. The strand decoder $\mathcal{D}$ employs a modulated SIREN architecture, takes the normalized arc-length parameter $t$ as input, and recovers the 3D node coordinate sequence through cumulative integration:
$$\mathbf{p}_k = \mathbf{p}_0 + \sum_{l=1}^{k} \mathbf{d}_l, \qquad k = 1, \dots, L-1$$
In the formula, $\mathbf{p}_k$ denotes the three-dimensional coordinates of the $k$-th node of the hair strand, $\mathbf{p}_0$ denotes the coordinates of the hair root node, $\mathbf{d}_l$ denotes the directional offset vector of the $l$-th strand segment, $k$ is the node index, and $L$ is the total number of nodes in each strand.
S502, Curvature-Enhanced VAE Training: Define the curvature of the $l$-th strand segment as the magnitude of the cross product of adjacent direction vectors:
$$\kappa_l = \lVert \mathbf{b}_l \times \mathbf{b}_{l+1} \rVert_2$$
In the formula, $\kappa_l$ denotes the curvature value of the $l$-th strand segment, $\mathbf{b}_l$ denotes the unit direction vector of the $l$-th segment, $\mathbf{b}_{l+1}$ denotes the unit direction vector of the $(l+1)$-th segment, $\times$ denotes the vector cross product, and $\lVert \cdot \rVert_2$ denotes the L2 norm of a vector; the complete VAE data-term loss is:
$$\mathcal{L}_{\text{data}} = \sum_{k} \lVert \hat{\mathbf{p}}_k - \mathbf{p}_k \rVert_2 + \lambda_{d} \sum_{l} \left(1 - \hat{\mathbf{b}}_l \cdot \mathbf{b}_l\right) + \lambda_{\kappa} \sum_{l} \lvert \hat{\kappa}_l - \kappa_l \rvert$$
In the formula, $\mathcal{L}_{\text{data}}$ denotes the VAE data-term loss, $\hat{\mathbf{p}}_k$ denotes the $k$-th node coordinates reconstructed by the VAE, $\mathbf{p}_k$ denotes the true $k$-th node coordinates, $\hat{\mathbf{b}}_l$ denotes the reconstructed unit direction vector of the $l$-th segment, $\mathbf{b}_l$ denotes the true unit direction vector of the $l$-th segment, $\hat{\kappa}_l$ denotes the reconstructed curvature of the $l$-th segment, $\kappa_l$ denotes the true curvature of the $l$-th segment, and $\lambda_{d}$ and $\lambda_{\kappa}$ denote the weighting coefficients of the direction loss and the curvature loss, respectively.
S503, K-Nearest-Neighbor Guide-Strand Upsampling: For a target strand $s$, based on the UV coordinates $\mathbf{u}_s$ of its scalp root, its $K$ nearest neighbors are searched among the 1000 decoded guide strands using a KD-tree and weighted by the inverse distance in UV coordinates:
$$w_j = \frac{1/\delta_j}{\sum_{m=1}^{K} 1/\delta_m}, \qquad \delta_j = \lVert \mathbf{u}_s - \mathbf{u}_j \rVert_2$$
In the formula, $w_j$ denotes the interpolation weight of the $j$-th nearest-neighbor guide strand, $\delta_j$ denotes the distance between the target strand and the $j$-th nearest neighbor in scalp UV coordinate space, $\mathbf{u}_s$ denotes the scalp-root UV coordinates of the target strand $s$, $\mathbf{u}_j$ denotes the scalp-root UV coordinates of the $j$-th nearest-neighbor guide strand, $K$ denotes the number of nearest neighbors, and $\lVert \cdot \rVert_2$ denotes the Euclidean distance. The node coordinates of the guide strands are then blended with these weights in three-dimensional coordinate space to generate the node sequence of the target strand:
$$\mathbf{p}^{s}_{k} = \sum_{j=1}^{K} w_j\, \mathbf{p}^{(j)}_{k}, \qquad k = 0, \dots, L-1$$
In the formula, $\mathbf{p}^{s}_{k}$ denotes the $k$-th node coordinates of the target strand $s$, $w_j$ denotes the interpolation weight of the $j$-th nearest neighbor, $\mathbf{p}^{(j)}_{k}$ denotes the $k$-th node coordinates of the $j$-th nearest-neighbor guide strand, $K$ denotes the number of nearest neighbors, and $L$ denotes the total number of nodes in each strand. This module expands 1,000 sparse guide strands into 10,000 dense rendering strands, increasing rendering coverage by approximately 10 times.

7. The method for three-dimensional hair reconstruction robust to adaptive Gaussian ellipsoids and complex lighting as described in claim 1, characterized in that the two-stage geometry optimization in S6 specifically includes:
S601, Coarse Optimization Stage: This stage takes the latent hairstyle texture $Z$ as the optimization variable; 1000 guide strands are generated by the VAE decoder and then expanded to 10000 dense rendering strands by the guide-strand upsampling module. The optimization runs for 20000 steps: the first 15000 steps use a BARF strategy to jointly optimize the camera parameters and the latent hair encoding, and the camera is frozen for the last 5000 steps. The learning rate is set to [...].
S602, Fine Optimization Stage: The 3D node coordinates of the hair strands obtained from coarse optimization are converted into explicit optimization variables that receive direct gradient updates; a total of 10,000 fine optimization steps and 5,000 edge-specific refinement steps are performed. During the edge-specific refinement stage, all parameters other than the Gaussian ellipsoid shape parameters (scaling factor $\mathbf{s}$ and rotation quaternion $\mathbf{q}$) are frozen, and the optimization is constrained only by the edge loss to ensure the accuracy of the hairstyle outline.

8. The method for three-dimensional hair reconstruction robust to adaptive Gaussian ellipsoids and complex lighting as described in claim 1, characterized in that the decoupled geometry-texture optimization strategy in S7 specifically includes:
S701, Geometry-Color Joint Optimization Stage: Using the illumination-enhanced images as the supervision target, the geometric parameters of the Gaussian ellipsoids (position $\boldsymbol{\mu}$, scaling $\mathbf{s}$, rotation $\mathbf{q}$, opacity $\alpha$) and the spherical harmonic color coefficients $f$ are optimized simultaneously, so that accurate hair geometry is obtained under a stable lighting reference.
S702, Color Fine-Tuning Stage: Freeze all geometric parameters, switch to the original unenhanced images as the supervision target, and optimize only the spherical harmonic color coefficients $f$, eliminating the color deviation introduced during illumination enhancement and restoring color and lighting effects consistent with the real scene.

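9. The method for three-dimensional hair reconstruction robust to adaptive Gaussian ellipsoids and complex lighting as described in claim 1, characterized in that the loss functions used in each stage are as follows:
S901, Color Loss: a linear combination of the pixel-wise L1 error and the SSIM structural similarity loss:
$$\mathcal{L}_{\text{color}} = (1 - \lambda_{\text{SSIM}})\, \lVert \hat{C} - C \rVert_1 + \lambda_{\text{SSIM}}\, \mathcal{L}_{\text{SSIM}}(\hat{C}, C)$$
In the formula, $\mathcal{L}_{\text{color}}$ denotes the color loss, $\hat{C}$ denotes the color of the rendered image, $C$ denotes the true color of the image, $\lVert \cdot \rVert_1$ denotes the L1 norm, $\lambda_{\text{SSIM}}$ denotes the SSIM loss weight, and $\mathcal{L}_{\text{SSIM}}$ denotes the structural similarity loss function.
S902, Segmentation Loss: An L1 loss is used to constrain the rendered silhouette and the rendered hair label mask simultaneously:
$$\mathcal{L}_{\text{seg}} = \sum_{p} \left( \lvert \hat{S}(p) - M(p) \rvert + \lvert \hat{H}(p) - M(p) \rvert \right)$$
In the formula, $\mathcal{L}_{\text{seg}}$ denotes the segmentation loss, $\hat{S}(p)$ denotes the rendered silhouette value at pixel $p$, $\hat{H}(p)$ denotes the hair label mask at pixel $p$, and $M(p)$ denotes the actual hair segmentation mask at pixel $p$.
S903, Direction Loss: An ambiguity-robust, confidence-weighted direction loss is designed:
$$\mathcal{L}_{\text{dir}} = \sum_{p \in \Omega} c_p\, \Delta\!\left(\theta_p, \hat{\theta}_p\right), \qquad \Delta(\theta, \hat{\theta}) = \min\!\left(\lvert \theta - \hat{\theta} \rvert,\; \pi - \lvert \theta - \hat{\theta} \rvert\right)$$
In the formula, $\mathcal{L}_{\text{dir}}$ denotes the direction loss, $p$ denotes a pixel within the hair area, $\Omega$ denotes the hair segmentation mask area, $c_p$ denotes the learnable confidence at pixel $p$, $\Delta(\cdot)$ denotes the angle difference function, $\theta_p$ denotes the actual direction angle of the hair strands, $\hat{\theta}_p$ denotes the direction angle of the rendered hair strands, and $\pi$ reflects the periodic ambiguity of direction.
S904, Perceptual Loss: Features are extracted from the intermediate layers of a pre-trained VGG network, and the L2 distance is calculated in feature space:
$$\mathcal{L}_{\text{perc}} = \frac{1}{M_{\phi}}\, \lVert \phi(\hat{x}) - \phi(x) \rVert_2^2$$
In the formula, $\mathcal{L}_{\text{perc}}$ denotes the perceptual loss, $\phi(\cdot)$ denotes the feature extraction function of the intermediate layer of the pre-trained VGG network, $\hat{x}$ denotes the rendered image, $x$ denotes the real image, $M_{\phi}$ denotes the total number of elements in the feature map, and $\lVert \cdot \rVert_2^2$ denotes the squared distance in feature space.
S905, Edge Loss: Canny edge maps are computed for both the rendered image and the ground-truth image, and the L1 loss between the edge maps is calculated:
$$\mathcal{L}_{\text{edge}} = \frac{1}{N_p} \sum_{p} \lvert \hat{E}(p) - E(p) \rvert$$
In the formula, $\mathcal{L}_{\text{edge}}$ denotes the edge loss, $N_p$ denotes the total number of pixels, $\hat{E}(p)$ denotes the Canny edge response of the rendered image at position $p$, and $E(p)$ denotes the Canny edge response of the real image at the corresponding position; the total loss is the weighted sum of the above terms:
$$\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{color}} + \lambda_{\text{seg}} \mathcal{L}_{\text{seg}} + \lambda_{\text{dir}} \mathcal{L}_{\text{dir}} + \lambda_{\text{perc}} \mathcal{L}_{\text{perc}} + \lambda_{\text{edge}} \mathcal{L}_{\text{edge}}$$
In the formula, $\mathcal{L}_{\text{total}}$ denotes the total loss function, $\mathcal{L}_{\text{color}}$ the color loss, $\mathcal{L}_{\text{seg}}$ the segmentation loss, $\mathcal{L}_{\text{dir}}$ the direction loss, $\mathcal{L}_{\text{perc}}$ the perceptual loss, and $\mathcal{L}_{\text{edge}}$ the edge loss; $\lambda_{\text{seg}}$, $\lambda_{\text{dir}}$, $\lambda_{\text{perc}}$, and $\lambda_{\text{edge}}$ denote the weighting coefficients of the respective loss terms.
Compared with ideal lighting scenes, the SDS diffusion loss is removed in non-ideal lighting scenes. The diffusion prior is trained on USC-HairSalon synthetic data, whose distribution deviates from real-scene hair geometry, so the SDS gradient would pull the reconstruction toward the synthetic distribution and reduce the accuracy of the real geometry.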

10. The method for three-dimensional hair reconstruction robust to adaptive Gaussian ellipsoids and complex lighting as described in claim 1, characterized in that the method also includes a hair post-processing step: performing root penetration detection on the reconstructed hair strands to remove abnormal strand nodes that penetrate the scalp mesh; applying smoothing filtering to the strand node coordinates to remove geometric noise; and exporting the hair strands in a standard format that can be used directly in mainstream graphics engines such as Unreal Engine, supporting editing, relighting, and physical dynamics simulation.