A human point cloud data processing method and device

By combining a deep linked dynamic network with a probabilistic registration model, the method addresses the problems of highly articulated (hinged) joints and limb occlusion in human point cloud registration, achieving more accurate registration at the joints and better differentiation of occluded limb parts.

CN116563893B · Active · Publication Date: 2026-06-30 · ZHENGZHOU UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
ZHENGZHOU UNIV
Filing Date
2023-05-25
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Poor registration results caused by highly articulated (hinged) joints and limb occlusion in human point cloud registration.

Method used

A method combining deep linked dynamic networks and probabilistic registration models is adopted. By acquiring source point cloud data and standard point cloud data of the human body, the first deformation field and the second deformation field are obtained respectively. After fusion processing, the target point cloud data is determined, while maintaining the local adjacency structure and global topology.

Benefits of technology

It improves the registration effect of joint hinges and can accurately distinguish human body parts when limbs are obscured.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention provides a method and apparatus for processing human point cloud data. The method includes: acquiring source point cloud data and standard point cloud data of the human body; obtaining a first deformation field and a second deformation field based on the source point cloud data and the standard point cloud data; the first deformation field is obtained through a deep linked dynamic network; the second deformation field is obtained by processing the source point cloud data and the standard point cloud data using a probabilistic registration model; and determining target point cloud data of the human body based on the first deformation field and the second deformation field. This invention's solution can simultaneously maintain local adjacency structure and global topology, thereby improving joint hinge registration performance and enabling the differentiation of human body parts when limbs are occluded.

Description

Technical Field

[0001] This invention relates to the field of computer information processing technology, and in particular to a method and apparatus for processing human point cloud data.

Background Technology

[0002] Non-rigid registration of human point clouds is a key problem in computer vision and computer graphics. It aims to find the spatial transformation between two human point clouds so that they can be unified into the same coordinate system, and it has wide applications in medical image registration, 3D reconstruction, and human pose estimation. Non-rigid human registration is challenging. First, unlike rigid registration, which only requires determining rotation and translation parameters, non-rigid registration requires estimating the unknown motion of all points. Second, compared with other non-rigid registration problems, human point cloud registration presents two additional challenges: highly articulated joints and limb occlusion, which make human registration particularly complex.

[0003] Point cloud registration is a research area that determines the spatial transformation relationship between two or more point sets in space. Non-rigid registration mainly addresses the deformation problems of point clouds, such as scaling and affine transformation. Non-rigid point cloud registration can be divided into traditional optimization-based algorithms and deep learning-based methods.

[0004] Traditional optimization-based algorithms iteratively estimate the correspondences and transformation steps between point clouds. Through continuous iteration, the correspondences become more and more accurate.

[0005] The earliest iterative non-rigid registration methods were based on thin-plate spline transformation and local affine transformation. Non-rigid ICP is an improved version of the ICP algorithm that primarily uses local affine regularization to determine the optimal deformation field. Coherent Point Drift (CPD) is a point cloud registration algorithm based on Gaussian mixture models (GMM) that imposes constraints through a regularized deformation field and derives the optimal transformation using variational methods. The Global-Local Topology Preservation algorithm achieves registration mainly by merging two topological constraints into a single probability density estimation framework. GH-GMM combines convex hulls (with a denser source-origin point set) and GMM to reduce computational complexity. JRMPC recasts registration as a clustering problem, in which the deformation field is optimized through a GMM.

[0006] The main idea behind graph optimization-based registration methods is to use nonparametric models to handle point cloud registration. CSGM uses linear equations to solve graph matching problems. High-order graph matching uses an integer projection algorithm to optimize the objective function within the integer domain.

[0007] Deep learning-based registration methods employ deep neural networks to learn a robust feature correspondence search, then determine the transformation matrix through one-step estimation (e.g., RANSAC). 3DMatch trains a parallel network from RGB-D images, taking 3D volumetric data as input and outputting 512-dimensional features of local patches; 3DMatch can extract local features from 3D point clouds. 3DmatchNet introduces a preprocessing method for aligning 3D patches and computes volumetric data based on these alignments. By inputting the aligned volumetric data into a CNN, the extracted features are rotation-invariant. OctNet uses an octree to hierarchically partition the volumetric data into an unbalanced tree, where each leaf node stores a feature representation. DeepGMR defines registration as minimizing the KL divergence between the probability distributions of two Gaussian mixture models; it is a real-time, noise-resistant, and robust full registration method, but it is not suitable for non-rigid point cloud registration. GPD-Net is a probabilistic unsupervised learning method that first learns a global shape descriptor and connects it to the source point cloud, then uses PointNet to learn the offset of each source point. FlowNet3D uses 3D Siamese networks to regress the deformation field between two point clouds for real-time inference, with PointNet++ used for feature extraction, but it lacks robustness under large deformations. Lepard uses a Transformer to learn global point-to-point mappings and uses them as landmarks for global non-rigid registration. RMA-Net and NrtNet both use DGCNN and a Transformer to extract feature information. DGCNN's EdgeConv module dynamically updates the graph structure, allowing information to propagate better between similar structures and improving the learning of local information.

Summary of the Invention

[0008] The technical problem to be solved by the present invention is to provide a method and apparatus for processing human point cloud data that address the poor registration of highly articulated joints and the difficulty of distinguishing human body parts under limb occlusion.

[0009] To solve the above-mentioned technical problems, the technical solution of the present invention is as follows:

[0010] A method for processing human body point cloud data, the method comprising:

[0011] Acquire source point cloud data and standard point cloud data of the human body;

[0012] Based on the source point cloud data and the standard point cloud data, a first deformation field and a second deformation field are obtained, respectively; the first deformation field is obtained through a deep linked dynamic network; the second deformation field is obtained by processing the source point cloud data and the standard point cloud data through the probabilistic registration model.

[0013] The target point cloud data of the human body is determined based on the first deformation field and the second deformation field.

[0014] Optionally, based on the source point cloud data and the standard point cloud data, the first deformation field is obtained, including:

[0015] The first feature result is obtained by performing local feature extraction processing on the source point cloud data;

[0016] The second feature result is obtained by performing local feature extraction processing on the standard point cloud data;

[0017] Based on the first feature result and the second feature result, the correlation result between the source point cloud data and the standard point cloud data is determined;

[0018] Based on the correlation results, the first deformation field is determined.

[0019] Optionally, determining the first deformation field based on the correlation result includes:

[0020] The correlation result, the hidden state value output by the previous node, and the geometric feature values extracted from the source point cloud data are input into the gated recurrent model for processing to obtain the output value.

[0021] The first deformation field is determined based on the output value.

[0022] Optionally, the probabilistic registration model is obtained through the following process:

[0023] Determine the joint probability density function based on preset conditions;

[0024] Determine the values of the latent parameters of the joint probability density function using a target clustering algorithm;

[0025] Determine the probabilistic registration model based on the values of the latent parameters.

[0026] Optionally, determining the probabilistic registration model based on the values of the latent parameters includes:

[0027] Based on the values of the latent parameters, the posterior probability is obtained;

[0028] The probabilistic registration model is determined based on the posterior probability and the preset constraints.

[0029] Optionally, the source point cloud data and the standard point cloud data are processed using the probabilistic registration model to obtain a second deformation field, including:

[0030] The source point cloud data and the standard point cloud data are processed through the objective function of the probabilistic registration model to obtain the second deformation field;

[0031] In the objective function, the quantity obtained is the second deformation field; $p^{old}(m \mid t_n)$ is the first posterior probability; $m$ indexes the points in the source point cloud data; $t_n$ is the n-th point in the standard point cloud data; $s_m$ is the m-th point in the source point cloud data; $G(m,\cdot)$ is the m-th row of the Gaussian kernel matrix; $W$ is the weight matrix of the Gaussian kernel; $\sigma^2$ is the isotropic variance; $N_p$ is the second posterior probability; $D$ is the dimension of the points; $\ln\sigma^2$ is the logarithm of the isotropic variance; $E_{local}(W)$ is the local constraint, weighted by a trade-off parameter for the local constraint; $\lambda$ is the trade-off parameter for the global constraint; and $E_{global}(W)$ is the global constraint.

[0032] Optionally, based on the first deformation field and the second deformation field, the target point cloud data of the human body is determined, including:

[0033] The first deformation field and the second deformation field are fused to obtain the third deformation field;

[0034] Based on the third deformation field, the target point cloud data of the human body is determined.

[0035] Optionally, the method for processing the human body point cloud data further includes:

[0036] The similarity between the target point cloud data and the standard point cloud data is determined by a loss function.

[0037] Optionally, the similarity between the target point cloud data and the standard point cloud data is determined using a loss function, including:

[0038] The similarity between the target point cloud data and the standard point cloud data is determined through the loss function $L_k$;

[0039] Where $L_k$ is the loss function; $L_{depth}$ is the depth map loss function; $\beta_1$ is the trade-off parameter of the mask loss function; $L_{mask}$ is the binary mask loss function; $\beta_2$ is the trade-off parameter of the edge length loss function; $L_{arap}$ is the edge length loss function; $\beta_3$ is the trade-off parameter of the rigid transformation vector loss function; $L_{tran}$ is the rigid transformation vector loss function; $\beta_4$ is the trade-off parameter of the skin weight sparsity term; $L_{sparse}$ is the skin weight sparsity loss function; $S_k$ is the target point cloud data obtained in the k-th iteration; $T$ is the target point cloud; $t_k$ is the translation vector in the k-th iteration; and $w_k$ is the point weight of the k-th rigid transformation in the k-th iteration.
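The symbol definitions above suggest that the loss is a weighted sum of the five named terms. One plausible reading is sketched below; the formula itself is not reproduced in this text, so the arguments attached to each term (and the notation $w_k$ for the point weight) are assumptions:

$$L_k = L_{depth}(S_k, T) + \beta_1 L_{mask}(S_k, T) + \beta_2 L_{arap} + \beta_3 L_{tran}(t_k) + \beta_4 L_{sparse}(w_k)$$

Under this reading, $\beta_1$ to $\beta_4$ balance the mask, edge length, rigid transformation vector, and skin weight sparsity terms against the depth map term.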

[0040] The present invention also provides a processing device for human body point cloud data, the device comprising:

[0041] The acquisition module is used to acquire source point cloud data and standard point cloud data of the human body;

[0042] The processing module is used to obtain a first deformation field and a second deformation field based on the source point cloud data and the standard point cloud data, respectively; the first deformation field is obtained through a deep linked dynamic network; the second deformation field is obtained by processing the source point cloud data and the standard point cloud data through the probabilistic registration model; and the target point cloud data of the human body is determined based on the first deformation field and the second deformation field.

[0043] The above-described solution of the present invention has at least the following beneficial effects:

[0044] The above-described solution of the present invention acquires source point cloud data and standard point cloud data of the human body; obtains a first deformation field and a second deformation field based on the source point cloud data and the standard point cloud data, respectively; the first deformation field is obtained through a deep linked dynamic network; the second deformation field is obtained by processing the source point cloud data and the standard point cloud data through the probabilistic registration model; and the target point cloud data of the human body is determined based on the first deformation field and the second deformation field. This approach simultaneously maintains local adjacency structure and global topology, thereby improving joint hinge registration and enabling the differentiation of human body parts when limbs are occluded.

Attached Figure Description

[0045] Figure 1 is a flowchart illustrating the method for processing human point cloud data provided in an embodiment of the present invention;

[0046] Figure 2 is a schematic diagram of the architecture of the human point cloud data processing method according to an embodiment of the present invention;

[0047] Figure 3 is a schematic diagram of the cyclic update architecture according to an embodiment of the present invention;

[0048] Figure 4 is a schematic block diagram of the modules of the human body point cloud data processing device according to an embodiment of the present invention.

Detailed Implementation

[0049] Exemplary embodiments of the present disclosure will now be described in more detail with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention may be implemented in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this invention will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.

[0050] The objective of this application is to find a non-rigid deformation field based on given source point cloud data and target point cloud data, so that the source point cloud data, after transformation, is as close as possible to the target point cloud data. This invention designs a learning-based framework that uses source point cloud data and target point cloud data as input to directly predict the non-rigid deformation field.

[0051] This application uses a pointwise combination of a series of rigid transformations to represent the non-rigid transformation:

[0052] $$\Phi(S) = \sum_{r=1}^{K} w_r \cdot \Psi_r(S)$$

[0053] where $\Phi(S)$ is the total deformation field, i.e., the third deformation field; $S$ is the source point cloud data; $w_r$ is the skin weight of the r-th rigid transformation; $\Psi_r$ is the deformation field of the r-th rigid transformation; $K$ is the number of rigid transformations used to represent the non-rigid transformation; and $r$ indexes the r-th rigid transformation.

[0054] Specifically, a non-rigid transformation can be represented as a pointwise combination of K rigid transformations, where K is much smaller than the number of surface points. When K = 1, the model degenerates into a rigid transformation. When K ≥ 2, each surface point is affected by multiple rigid transformations with different skin weights, and the model can represent a non-rigid transformation. The larger K is, the stronger the algorithm's ability to represent non-rigid transformations. This allows the processing method for the human point cloud data to approximate any non-rigid transformation and provides good constraints on the solution space.
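As an illustration of the pointwise combination described in this paragraph, the following minimal Python sketch blends K rigid transformations using per-point skin weights. It assumes each rigid transformation is given as a rotation matrix plus a translation vector; the function name `blend_rigid_transforms` and the array layout are hypothetical and are not taken from the patent.

```python
import numpy as np

def blend_rigid_transforms(points, rotations, translations, skin_weights):
    """Pointwise combination of K rigid transformations (illustrative only).

    points:        (M, 3) source points
    rotations:     (K, 3, 3) one rotation matrix per rigid transformation
    translations:  (K, 3) one translation vector per rigid transformation
    skin_weights:  (M, K) weight of each rigid transformation at each point
    """
    # moved[k, m] = R_k @ p_m + t_k, shape (K, M, 3)
    moved = np.einsum("kij,mj->kmi", rotations, points) + translations[:, None, :]
    # Weighted sum over the K rigid transformations, separately for every point.
    return np.einsum("mk,kmi->mi", skin_weights, moved)

# Two pure translations blended with equal weights at every point.
points = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
rotations = np.stack([np.eye(3), np.eye(3)])
translations = np.array([[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
weights = np.full((2, 2), 0.5)
blended = blend_rigid_transforms(points, rotations, translations, weights)
```

With K = 1 and a weight of one at every point, the function reduces to a single rigid transformation, which mirrors the degenerate case mentioned in the text.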

[0055] To learn this representation, a recurrent neural network framework can be designed to iteratively estimate the combined weights and each rigid transformation. In each iteration, the network only needs to estimate a single rigid transformation and the skin weights at each point, which can represent the importance of the rigid transformation at that point.

[0056] It should be noted that the rigid transformation in the above equation is a global definition for all points on the surface. Therefore, the proposed model does not need to construct a deformation map for each specific surface. Moreover, the skin weights in the above equation are learned and can be adaptively adjusted according to different source point cloud data surfaces and standard point cloud data surfaces. This not only allows for the representation of complex non-rigid deformations, but can also be extended to rigid registration by removing the weights and replacing the addition in the above equation with multiplication.

[0057] The method and apparatus for processing human point cloud data provided in the embodiments of the present invention will be described in detail below with reference to the accompanying drawings and through specific embodiments and application scenarios.

[0058] As shown in Figure 1, an embodiment of the present invention provides a method for processing human point cloud data, the method comprising:

[0059] Step 11: Obtain source point cloud data and standard point cloud data of the human body;

[0060] Step 12: Based on the source point cloud data and the standard point cloud data, obtain the first deformation field and the second deformation field respectively; the first deformation field is obtained through a deep linked dynamic network; the second deformation field is obtained by processing the source point cloud data and the standard point cloud data through the probabilistic registration model.

[0061] Step 13: Determine the target point cloud data of the human body based on the first deformation field and the second deformation field.

[0062] In this embodiment of the invention, a first deformation field and a second deformation field are obtained based on the acquired source point cloud data and standard point cloud data of the human body, respectively. The target point cloud data of the human body is then determined based on the first deformation field and the second deformation field. The second deformation field is obtained by processing the source point cloud data and the standard point cloud data using the probabilistic registration model. This approach simultaneously preserves local adjacency structures and global topology, thereby improving joint hinge registration and enabling the differentiation of human body parts when limbs are occluded.

[0063] Wherein, the source point cloud data $S = (s_1, \dots, s_M)^T$, $s_m \in \mathbb{R}^D$, is a template with a sparse distribution, where $s_m$ denotes a point in the source point cloud data and $\mathbb{R}^D$ is the D-dimensional real space;

[0064] The standard point cloud data $T = (t_1, \dots, t_N)^T$, $t_n \in \mathbb{R}^D$, has a dense distribution, where $t_n$ denotes a point in the standard point cloud data;

[0065] The second deformation field can be expressed as a function of S with parameter θ;

[0066] In this context, T may contain outliers. A weight $\omega$, with $0 \le \omega \le 1$, is set for a uniform component to account for the outliers in T.

[0067] In an optional embodiment of the present invention, step 12 may include:

[0068] Step 121: Obtain the first feature result by performing local feature extraction processing on the source point cloud data;

[0069] Step 122: Obtain the second feature result by performing local feature extraction processing on the standard point cloud data;

[0070] Step 123: Determine the correlation result between the source point cloud data and the standard point cloud data based on the first feature result and the second feature result;

[0071] Step 124: Determine the first deformation field based on the correlation results.

[0072] Specifically, step 124 may include:

[0073] Step 1241: Input the correlation result, the hidden state value output by the previous node, and the geometric feature values extracted from the source point cloud data into the gated recurrent model for processing to obtain the output value;

[0074] Step 1242: Determine the first deformation field based on the output value.

[0075] In this embodiment, feature extraction is performed on the source point cloud data and the standard point cloud data to obtain feature results. The feature results are then recurrently updated using the gated recurrent model to obtain the first deformation field. This approach is more sensitive to the local feature information of the point cloud data and can improve the registration effect for large deformations at joints.

[0076] It should be noted that the combination weights and each rigid transformation can be solved iteratively using a GRU (Gated Recurrent Unit, a recurrent neural network architecture). In each iteration, the LDGCNN network and a Transformer can be used to extract local features, obtaining the deformation field after deep learning registration, i.e., the first deformation field. Then the probability-based deformation field, i.e., the second deformation field, guides the deep learning deformation field registration, thereby improving the registration of highly articulated human joints and enabling the differentiation of human body parts when limbs are occluded.
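To make the recurrent update more concrete, the following sketch shows one step of a standard GRU cell applied per point, written with NumPy. It is a generic GRU without bias terms, not the patent's network: the parameter names, the feature sizes, and the choice to concatenate the correlation and geometric features into a single input are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, params):
    """One GRU update for per-point inputs x (M, F) and hidden state h_prev (M, H)."""
    z = sigmoid(x @ params["Wz"] + h_prev @ params["Uz"])              # update gate
    r = sigmoid(x @ params["Wr"] + h_prev @ params["Ur"])              # reset gate
    h_cand = np.tanh(x @ params["Wh"] + (r * h_prev) @ params["Uh"])   # candidate state
    # Blend the previous hidden state with the candidate state.
    return (1.0 - z) * h_prev + z * h_cand

rng = np.random.default_rng(0)
M, F, H = 5, 8, 4  # points, input features, hidden size (arbitrary toy values)
params = {
    name: rng.normal(scale=0.1, size=(F if name[0] == "W" else H, H))
    for name in ("Wz", "Wr", "Wh", "Uz", "Ur", "Uh")
}
correlation = rng.normal(size=(M, 5))  # random stand-in for the correlation result
geometry = rng.normal(size=(M, 3))     # random stand-in for the geometric features
x = np.concatenate([correlation, geometry], axis=1)  # (M, F)
h_prev = np.zeros((M, H))
h_k = gru_step(x, h_prev, params)
```

In a full pipeline the new hidden state would be passed to prediction heads and fed back as the hidden state of the next iteration; both of those parts are omitted here.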

[0077] As shown in Figure 2, in another optional embodiment of the present invention, the process of determining the first deformation field may include:

[0078] In stage k, LDGCNN (a linked dynamic feature extraction network for point clouds) and a Transformer (a point cloud correspondence extraction network) are used to extract the first feature result, of size M×C, from the source point cloud data $S_{k-1}$, and the second feature result, of size N×C, from the standard point cloud data T, where M is the number of points in the source point cloud data, N is the number of points in the standard point cloud data, and C is the number of feature channels. A dot product operation is then performed to calculate the correlation tensor, of size M×N. To eliminate the dependence on the number of target points N, a top-K selection operation can be used. This yields the correlation $C_{k-1}$ between the source point cloud data and the standard point cloud data. The correlation $C_{k-1}$, the hidden state value $h_{k-1}$ output by the previous node, and the geometric features $f_s$ extracted from the source point cloud data are input into the GRU network together. The GRU output is the output value $h_k$, which is then used by two MLPs (multilayer perceptrons) to predict $w_k$ and $\Psi_k$, where $w_k$ is the point weight of the k-th rigid transformation in the k-th stage and $\Psi_k$ is the first deformation field of the k-th rigid deformation. From these, the first deformation field of the k-th iteration can be obtained.
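The correlation and top-K steps in this paragraph can be sketched as follows. This is a minimal NumPy illustration that assumes the features are plain M×C and N×C arrays; the function name `topk_correlation` is hypothetical, and the learned feature extractors are replaced by hand-written toy features.

```python
import numpy as np

def topk_correlation(source_features, target_features, k):
    """Keep the k largest dot-product correlations for each source point.

    source_features: (M, C), target_features: (N, C).
    Returns (values, indices), both of shape (M, k), sorted largest first.
    """
    corr = source_features @ target_features.T            # (M, N) correlation
    # argpartition moves the k largest entries of each row into the last k columns.
    idx = np.argpartition(corr, -k, axis=1)[:, -k:]
    vals = np.take_along_axis(corr, idx, axis=1)
    order = np.argsort(-vals, axis=1)                      # sort those k entries
    return np.take_along_axis(vals, order, axis=1), np.take_along_axis(idx, order, axis=1)

src = np.array([[1.0, 0.0], [0.0, 1.0]])
tgt = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 2.0], [-1.0, 0.0]])
vals, idx = topk_correlation(src, tgt, k=2)
```

The output has shape M×k regardless of N, which is the property the text relies on to remove the dependence on the number of target points.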

[0079] In this embodiment, the number of variables in $\Psi_r$ and $w_r$ is large, where $\Psi_r$ is the first deformation field of the r-th rigid deformation, $w_r$ is the skin weight of the r-th rigid deformation, and K is the number of rigid transformations used to represent the non-rigid transformation: there are M×K skin weights and 6K rigid deformation parameters. Therefore, the rigid transformations and skin weights can be regressed gradually through the above cyclic update strategy.

[0080] Furthermore, the first deformation field of the k-th stage and the probability-based deformation field of the k-th stage, i.e., the second deformation field, are fused to obtain the target point cloud data $S_k$.

[0081] It should be noted that LDGCNN (a linked dynamic feature extraction network for point clouds) and a Transformer (a point cloud correspondence extraction network) are used to extract features and correspondences of point cloud data. LDGCNN uses k-nearest neighbors (k-NN) search and an MLP with shared parameters to extract local features from each center point and its neighbors. Because the neighbor indices computed from the current features differ from those computed from the previous features, new features can be learned by extracting edges from the previous features through the current neighbor indices, which improves the performance of feature extraction.
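The neighbor search and edge extraction described here can be illustrated with a small NumPy sketch. It builds edge features in the style of DGCNN's EdgeConv, stacking each center point with the offsets to its k nearest neighbors; the shared MLP is omitted, and the function names are hypothetical.

```python
import numpy as np

def knn_indices(features, k):
    """Indices of the k nearest neighbours of every point, excluding the point itself."""
    diff = features[:, None, :] - features[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)      # (M, M) squared distances
    np.fill_diagonal(dist2, np.inf)         # a point is not its own neighbour
    return np.argsort(dist2, axis=1)[:, :k]

def edge_features(features, k):
    """Stack [x_i, x_j - x_i] for each point i and neighbour j, giving (M, k, 2C)."""
    idx = knn_indices(features, k)
    neighbours = features[idx]                              # (M, k, C)
    centres = np.repeat(features[:, None, :], k, axis=1)    # (M, k, C)
    return np.concatenate([centres, neighbours - centres], axis=-1)

pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 3.0, 0.0], [5.0, 5.0, 5.0]])
edges = edge_features(pts, k=2)
```

Calling `knn_indices` again on the features produced by a later layer gives a different neighbor graph, which is the sense in which the graph is dynamic.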

[0082] In another optional embodiment of the present invention, in step 12, the probabilistic registration model is obtained through the following process:

[0083] Step 125: Determine the joint probability density function according to preset conditions; here, the preset conditions may include: all Gaussian components are independently distributed, are isotropic, and have the same variance and weight.

[0084] Step 126: Determine the values of the latent parameters of the joint probability density function using a target clustering algorithm;

[0085] Step 127: Determine the probabilistic registration model based on the values of the latent parameters.

[0086] Specifically, step 127 may include:

[0087] Step 1271: Based on the values of the latent parameters, obtain the posterior probability;

[0088] Step 1272: Determine the probabilistic registration model based on the posterior probability and the preset constraints.

[0089] In this embodiment, by determining the probabilistic registration model through preset constraints, the overall spatial connectivity of point cloud data can be guaranteed during registration, and the local neighborhood structure in the latent space can be maintained, thereby improving the registration effect of high hinge joints and local limb occlusion parts of the human body.

[0090] In another optional embodiment of the present invention, the specific determination process of the probabilistic registration model may include:

[0091] Based on the preset conditions, namely that all Gaussian components are independently distributed and isotropic, with the same variance $\sigma^2$ and the same weight, the joint probability density function of the GMM is determined as follows:

[0092] $$p(T) = \prod_{n=1}^{N} p(t_n) = \prod_{n=1}^{N} \sum_{m=1}^{M+1} \pi_m\, p(t_n \mid m)$$

[0093] Where $p(T)$ is the joint probability density function of the GMM; $p(t_n)$ is the probability density of a point of the standard point cloud data; $\pi_m$ is the weight scalar of the m-th component; $p(t_n \mid m)$ is the density of the m-th component; $T$ is the standard point cloud data; $t_n$ is the n-th point in the standard point cloud data; $M$ is the number of points in the source point cloud data; $N$ is the number of points in the standard point cloud data; $m$ indexes the m-th point in the source point cloud data; and $n$ indexes the n-th point in the standard point cloud data.

[0094] For $m = M + 1$, $p(t_n \mid m)$ is a uniform distribution used to account for noise, with $\pi_{M+1} = \omega$;

[0095] The registration problem is transformed into minimizing the negative log-likelihood of the joint probability density function. The GMM clustering is solved with the EM (Expectation Maximization) algorithm based on the joint probability density function: the E-step calculates the values of the latent parameters, that is, the probability that each point belongs to each Gaussian component, and the M-step finds the parameters that maximize the likelihood function. The E-step can be expressed as:

[0096]

[0097] where $N_p$ and $p^{old}(m \mid t_n)$ are posterior probabilities, which can be calculated from the old GMM parameters based on Bayes' rule:

[0098]

[0099] In the M-step, the new GMM parameters are found by minimizing $Q(\theta, \sigma^2)$, and the E-step and M-step are executed alternately until convergence. Two constraints are incorporated during the iterative optimization to achieve robust and accurate non-rigid registration under high articulation and limb occlusion, while simultaneously preserving global topological and local structural features.
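As a concrete illustration of the E-step, the sketch below computes the posterior probabilities in the form used by Coherent Point Drift, with equal-weight isotropic Gaussians centered on the transformed source points plus a uniform outlier component of weight `omega`. The patent's own formula is not reproduced in this text, so the CPD form of the outlier constant is an assumption here, as is the function name `e_step`.

```python
import numpy as np

def e_step(source_moved, target, sigma2, omega):
    """Posterior p_old(m | t_n) under a GMM with a uniform outlier term (CPD form).

    source_moved: (M, D) transformed source points, used as GMM centroids
    target:       (N, D) standard point cloud
    Returns P of shape (M, N) and the scalar N_p = sum of all posteriors.
    Requires 0 <= omega < 1.
    """
    M, D = source_moved.shape
    N = target.shape[0]
    diff = source_moved[:, None, :] - target[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)                   # (M, N) squared distances
    gauss = np.exp(-dist2 / (2.0 * sigma2))
    # Constant contributed by the uniform component in the denominator.
    c = (2.0 * np.pi * sigma2) ** (D / 2.0) * omega / (1.0 - omega) * M / N
    P = gauss / (gauss.sum(axis=0, keepdims=True) + c)
    return P, P.sum()

src = np.array([[0.0, 0.0], [10.0, 10.0]])
tgt = np.array([[0.0, 0.1], [10.0, 9.9], [0.1, 0.0]])
P, N_p = e_step(src, tgt, sigma2=1.0, omega=0.1)
```

With `omega = 0` every column of `P` sums to one; a positive `omega` lowers the column sums, which is how the uniform component absorbs part of the probability mass of each target point.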

[0100] Specifically, the approach still follows the idea of CPD (Coherent Point Drift), in which the non-rigid deformation field is defined as a displacement function derived through variational methods using Gaussian radial basis functions (GRBF), where $G_{M \times M}$ is the Gaussian kernel matrix, $\beta$ is the width of the Gaussian filter used to control local regularization, and $W_{M \times D}$ is the weight matrix of the Gaussian kernel. Motion coherence is enhanced by regularizing the weight matrix, thereby preserving global topological information:

[0101] $$E_{global}(W) = \mathrm{Tr}(W^T G W)$$
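The global constraint can be evaluated numerically as in the following sketch. The kernel elements follow the usual CPD convention, $g_{ij} = \exp(-\lVert s_i - s_j \rVert^2 / (2\beta^2))$, which is an assumption here because the element formula is not reproduced in this text; the function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(source, beta):
    """Gaussian kernel matrix G (M, M) built from pairwise source point distances."""
    diff = source[:, None, :] - source[None, :, :]
    return np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * beta ** 2))

def global_energy(G, W):
    """Global constraint Tr(W^T G W) for a weight matrix W of shape (M, D)."""
    return np.trace(W.T @ G @ W)

source = np.array([[0.0, 0.0], [1.0, 0.0]])
G = gaussian_kernel(source, beta=1.0)
W = np.eye(2)                    # toy weight matrix
energy = global_energy(G, W)     # equals Tr(G) for this choice of W
```

Because G is symmetric positive semi-definite, the energy is non-negative and grows when nearby points are assigned very different weights, which is the sense in which the term promotes coherent motion.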

[0102] Global constraints can guarantee the overall spatial connectivity of the point cloud during registration, but they may be counterproductive when dealing with local deformations such as high articulation and limb occlusion. Therefore, a local constraint is needed to preserve the local neighborhood structure, which suits highly articulated joints and limb occlusion. The nonlinear dimensionality reduction method LLE (locally linear embedding) is used to preserve the local neighborhood structure in the low-dimensional latent space. LLE is applied to non-rigid registration in three steps. First, the K nearest neighbors of each point in S are found based on Euclidean distance. Second, each point in S is represented by a weighted linear combination of its neighboring points, with the weights obtained by minimizing a cost function. Third, the Gaussian kernel weight matrix W, which represents the non-rigid transformation of each point, is calculated so that after the non-rigid transformation each point can still be reconstructed from its neighborhood with the weights $L_{mi}$. $L_{mi}$ is an M×N matrix containing the neighborhood information of each point in S, and W can be estimated by minimizing the cost function:

[0103]

[0104] Where $G(m,\cdot)$ is the m-th row of G. The local constraint term based on LLE is complementary to the global constraint of CPD, and together they preserve the local and global topological feature information in complex non-rigid registration with high articulation and limb occlusion.
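The three LLE steps can be sketched in NumPy as follows. The weights are computed with the standard LLE construction (a regularized local Gram matrix, with each row of weights normalized to sum to one) and stored in a square matrix with one row per source point. The local energy shown, the squared error of reconstructing each transformed point from its transformed neighbors, is an assumed form because the cost function is not reproduced in this text; all function names are hypothetical.

```python
import numpy as np

def lle_weights(points, k, reg=1e-3):
    """Reconstruction weights L (M, M); row m is non-zero only at point m's k neighbours."""
    M = points.shape[0]
    dist2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(dist2, np.inf)                       # exclude the point itself
    neighbours = np.argsort(dist2, axis=1)[:, :k]         # step 1: k nearest neighbours
    L = np.zeros((M, M))
    for m in range(M):
        Z = points[neighbours[m]] - points[m]             # neighbours centred on point m
        C = Z @ Z.T                                       # local Gram matrix (k, k)
        C += reg * max(np.trace(C), 1e-12) * np.eye(k)    # regularise a singular C
        w = np.linalg.solve(C, np.ones(k))                # step 2: minimise reconstruction cost
        L[m, neighbours[m]] = w / w.sum()                 # weights sum to one
    return L

def local_energy(points, G, W, L):
    """Assumed local term: error of rebuilding each moved point from its moved neighbours."""
    moved = points + G @ W                                # step 3 uses the transformed points
    return np.sum((moved - L @ moved) ** 2)

line = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
L = lle_weights(line, k=2)
energy_rest = local_energy(line, np.eye(4), np.zeros((4, 2)), L)
energy_shift = local_energy(line, np.eye(4), np.ones((4, 2)), L)
```

Since each row of weights sums to one, a pure translation of all points leaves this energy unchanged, so the term penalizes only changes to the local neighborhood structure.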

[0105] In this embodiment, the non-rigid registration of point clouds is transformed into a GMM density estimation problem using a probabilistic approach. On this basis, two regularization terms are added to preserve the global and local topology. The objective function of the probabilistic registration model is:

[0106]

[0107] where the two trade-off parameters, one of which is $\lambda$, weight the two topological constraint terms added to the GMM. The global topology maintains the overall spatial connectivity of the point set during registration, but it may be counterproductive when dealing with highly articulated joints or local deformations. Therefore, the local topology is added on this basis to maintain the local adjacency structure and adapt to non-rigid articulated deformations. The LLE algorithm used for the local topology is a nonlinear dimensionality reduction algorithm that maintains the local neighborhood structure in the low-dimensional latent space, which helps with highly articulated joints and local deformations and improves the registration of locally occluded limb parts.
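For orientation, an objective of this kind in CPD-style formulations typically takes the form below. It is a sketch assembled from the terms named in this description, not the patent's verbatim formula; the symbol $\alpha$ for the local trade-off parameter and the factors of 1/2 on the two constraint terms are assumptions:

$$Q(W, \sigma^2) = \frac{1}{2\sigma^2} \sum_{n=1}^{N} \sum_{m=1}^{M} p^{old}(m \mid t_n)\, \lVert t_n - (s_m + G(m,\cdot)W) \rVert^2 + \frac{N_p D}{2} \ln \sigma^2 + \frac{\lambda}{2} E_{global}(W) + \frac{\alpha}{2} E_{local}(W)$$

The first two terms are the GMM data term, and the last two add the global and local topology constraints with their respective trade-off parameters.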

[0108] In another optional embodiment of the present invention, step 12 may include:

[0109] Step 128: the source point cloud data and the standard point cloud data are processed through the objective function of the probabilistic registration model to obtain the second deformation field;

[0110] Where the left-hand side is the second deformation field; p^old(m|t_n) is the first posterior probability; M is the number of points in the source point cloud data; t_n is the n-th point in the standard point cloud data; s_m is the m-th point in the source point cloud data; G(m,·) is the m-th row of the Gaussian kernel matrix; W is the weight matrix of the Gaussian kernel; σ² is the isotropic variance; N_p·D is the second posterior probability; ln σ² is the logarithm of the isotropic variance; the coefficient of E_global(W) is the trade-off parameter of the global constraint; E_global(W) is the global constraint; λ is the trade-off parameter of the local constraint; and E_local(W) is the local constraint.
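For orientation only, an objective of the following shape is consistent with the symbols defined above. It is a sketch under stated assumptions and not the formula of the invention: the form follows the usual CPD-style expected negative log-likelihood, the symbol α for the global trade-off parameter is introduced here because that symbol did not survive in the text, the source point is indexed by m so that it pairs with G(m,·), and any constant factors in front of the two constraint terms are unknown.

```latex
Q(W,\sigma^{2}) =
  \frac{1}{2\sigma^{2}} \sum_{n=1}^{N} \sum_{m=1}^{M}
    p^{old}(m \mid t_n)\,
    \bigl\lVert t_n - \bigl(s_m + G(m,\cdot)\,W\bigr) \bigr\rVert^{2}
  + \frac{N_p D}{2} \ln \sigma^{2}
  + \alpha\, E_{global}(W)
  + \lambda\, E_{local}(W)
```

Under this reading, the per-point displacement G(m,·)W obtained at the minimizer plays the role of the second deformation field.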

[0111] Wherein, the first posterior probability p^old(m|t_n) and the second posterior probability N_p·D are calculated according to Bayes' rule.
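By way of non-limiting illustration, a posterior computed by Bayes' rule for an isotropic GMM can be sketched as follows. This is a minimal sketch that assumes the standard CPD E-step, with the deformed source points as GMM centroids and an additional uniform outlier component; the function name `posterior`, the outlier weight `omega`, and the convention that N_p is the sum of all posteriors are assumptions introduced here, not statements of the invention.

```python
import numpy as np

def posterior(S_def, T, sigma2, omega=0.1):
    """CPD-style E-step (assumed form).
    S_def: (M, D) deformed source points, i.e. s_m + G(m, .) W.
    T:     (N, D) standard points t_n.
    Returns P with P[m, n] = p_old(m | t_n), and N_p = sum of all entries."""
    M, D = S_def.shape
    N = T.shape[0]
    # Squared distances between every centroid and every data point, (M, N).
    d2 = np.sum((S_def[:, None, :] - T[None, :, :]) ** 2, axis=-1)
    num = np.exp(-d2 / (2.0 * sigma2))
    # Contribution of the uniform outlier component (zero when omega == 0).
    c = (2.0 * np.pi * sigma2) ** (D / 2.0) * omega / (1.0 - omega) * M / N
    P = num / (num.sum(axis=0, keepdims=True) + c)
    return P, P.sum()
```

With `omega=0.0` every column of P sums to one, so each standard point is fully explained by the source points; a positive `omega` leaves part of the probability mass to outliers.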

[0112] In this embodiment, the second deformation field is obtained by processing the source point cloud data and the standard point cloud data. This calculation method is simple, efficient, accurate, and requires little computation.

[0113] In another optional embodiment of the present invention, step 13 may include:

[0114] Step 131: The first deformation field and the second deformation field are fused to obtain the third deformation field;

[0115] Step 132: Determine the target point cloud data of the human body based on the third deformation field.

[0116] In this embodiment, a total deformation field, namely the third deformation field, is determined by fusing the first and second deformation fields. The target point cloud data of the human body is then determined through the third deformation field. This can simultaneously ensure the local adjacency structure and the global topology, thereby improving the effect of high hinge joints and joint occlusion registration of the human body.

[0117] In another optional embodiment of the present invention, the specific process of fusing the first deformation field and the second deformation field to determine the target point cloud data of the human body may include:

[0118] Given the source point cloud data S and the standard point cloud data T, the goal of fusing the first deformation field and the second deformation field is to obtain a total deformation field f, namely the third deformation field, such that T(x) = S(x) + f(x). warp(S, f) is defined as warping the source point cloud data S according to f, i.e., warp(S, f)(x) = S(x) + f(x).

[0119] The above formula transforms registration into finding the f that maximizes the similarity between the standard point cloud data T and warp(S, f). Let the first deformation field be the deformation field obtained using the improved RMA-Net network, and let the second deformation field be the deformation field obtained through probabilistic registration. The specific derivation process includes:

[0120]

[0121] The third deformation field can be obtained from the above equation:

[0122]

[0123] Where the left-hand side is the third deformation field, the two fields being fused are the first deformation field and the second deformation field, warp is the warp function, S is the source point cloud data, and x is the variable.

[0124] In this embodiment, the above formula also applies to the deformation fields of point cloud registration, that is, to the fusion of the first deformation field and the second deformation field. Guiding registration with the probabilistic registration model can improve the generalization of the deep model. After deformation field fusion, the total deformation field, that is, the third deformation field, is obtained, from which the target point cloud data is calculated. The target point cloud data can then participate in the next iteration stage until the result converges.
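By way of non-limiting illustration, the warp operation defined above, warp(S, f)(x) = S(x) + f(x), and one possible way of fusing two displacement fields can be sketched as follows. The fusion shown is a hypothetical sequential composition chosen for illustration, since the fusion formula of this embodiment is not reproduced in the text; the names `fuse_sequential` and `f2_fn` are assumptions introduced here.

```python
import numpy as np

def warp(S, f):
    """warp(S, f)(x) = S(x) + f(x): move each source point by its displacement."""
    return S + f

def fuse_sequential(S, f1, f2_fn):
    """Hypothetical fusion (not the formula of the invention):
    apply the first field, evaluate the second field on the warped points,
    and return the summed displacement as the total field."""
    S1 = warp(S, f1)        # points after the first deformation
    f2 = f2_fn(S1)          # second field sampled at the warped positions
    return f1 + f2          # total displacement, so warp(S, f3) = S1 + f2
```

In this sketch `f1` is an array of per-point displacements of the same shape as `S`, and `f2_fn` is any callable that maps point positions to displacements, which stands in for the output of the probabilistic registration.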

[0125] In another optional embodiment of the present invention, the method for processing the human body point cloud data may further include:

[0126] Step 14: Determine the similarity between the target point cloud data and the standard point cloud data using a loss function.

[0127] In a specific implementation, step 14 may include:

[0128] Step 141: the similarity between the target point cloud data and the standard point cloud data is determined through the stage loss function L_k;

[0129] Where L_k is the loss function; L_depth is the depth map loss function; β1 is the trade-off parameter of the mask loss function; L_mask is the binary mask loss function; β2 is the trade-off parameter of the edge length loss function; L_arap is the edge length loss function; β3 is the trade-off parameter of the rigid transformation vector loss function; L_tran is the rigid transformation vector loss function; β4 is the trade-off parameter of the skinning weight sparsity term; L_sparse is the skinning weight sparsity loss function; S_k is the target point cloud data obtained in the k-th iteration; T is the standard point cloud data; t_k is the translation vector in the k-th iteration; and the remaining weight symbol is the point-wise weight of the k-th rigid transformation in the k-th iteration.

[0130] In this embodiment, the similarity between the target point cloud data and the standard point cloud data is determined through the loss function L_k; the calculation is simple, requires little computation, and is highly efficient.

[0131] In another optional embodiment of the present invention, the loss function may specifically include:

[0132] The loss function can be constructed from shape similarity metrics by projecting the 3D shape onto 2D planes from multiple views. The similarity between the source point cloud data and the standard point cloud data is measured using the projected mask images and the 2D depth images. Furthermore, regularization terms for the deformation field variables and the skinning weights are included. S_k denotes the source point cloud data after the k-th round of deformation.

[0133] Depth loss function: for point cloud data P and a given viewpoint v, P is transformed into the camera coordinate system to give P_v, and its depth map D(P_v) is calculated. In the same way, the depth map of the source point cloud data and the depth map D(T_v) of the standard point cloud data T are calculated. The depth loss function can be defined as:

[0134]

[0135] Where V represents a set of camera viewpoints;
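By way of non-limiting illustration, a multi-view depth comparison of two point clouds can be sketched as follows. This is a minimal sketch under several assumptions introduced here: an orthographic camera looking along the z-axis with smaller z nearer to the camera, a fixed image extent `[lo, hi)`, a hard z-buffer that is not differentiable, viewpoints given as rotation matrices, and a mean squared difference as the distance between depth maps. None of these choices is stated by the invention.

```python
import numpy as np

def rotation_y(theta):
    """Rotation about the y-axis, used here as a hypothetical camera viewpoint."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def depth_map(P_v, res=16, lo=-1.0, hi=1.0, background=0.0):
    """Orthographic z-buffer of camera-space points P_v (n, 3) on a res x res grid."""
    depth = np.full((res, res), np.inf)
    uv = np.floor((P_v[:, :2] - lo) / (hi - lo) * res).astype(int)
    inside = np.all((uv >= 0) & (uv < res), axis=1)
    for (u, v), z in zip(uv[inside], P_v[inside, 2]):
        depth[v, u] = min(depth[v, u], z)   # keep the smallest z per pixel
    depth[np.isinf(depth)] = background     # pixels that no point projects to
    return depth

def depth_loss(S_k, T, views):
    """Average over the viewpoints of the mean squared depth-map difference."""
    total = 0.0
    for R in views:
        total += np.mean((depth_map(S_k @ R.T) - depth_map(T @ R.T)) ** 2)
    return total / len(views)
```

A training pipeline would need a differentiable renderer in place of the hard z-buffer; the sketch only shows what quantity is being compared across the set V of viewpoints.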

[0136] Mask loss function: by projecting the three-dimensional surface onto a two-dimensional plane, its 2D binary mask M(P_v) can also be obtained. For each pixel of M(P_v), if the distance from the pixel to the projection of P_v is less than a given threshold, the mask value is 1; otherwise, it is 0. The masks of the source point cloud data and of the standard point cloud data T, the latter denoted M(T_v), are obtained in this way. The mask loss function can then be defined as:

[0137]
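By way of non-limiting illustration, the thresholded binary mask described above can be sketched as follows. This is a minimal sketch assuming an orthographic projection onto the xy-plane, pixel centres on a regular grid over `[lo, hi)`, viewpoints given as rotation matrices, and a mean squared difference between masks; the names `binary_mask` and `mask_loss` and all default values are assumptions introduced here.

```python
import numpy as np

def binary_mask(P_v, res=16, lo=-1.0, hi=1.0, threshold=0.1):
    """mask[v, u] = 1 if some projected point of P_v lies within `threshold`
    of the centre of pixel (v, u), otherwise 0."""
    step = (hi - lo) / res
    centres = lo + (np.arange(res) + 0.5) * step
    gx, gy = np.meshgrid(centres, centres)             # gx[v, u] is the pixel x
    pix = np.stack([gx.ravel(), gy.ravel()], axis=1)   # (res * res, 2)
    d2 = np.sum((pix[:, None, :] - P_v[None, :, :2]) ** 2, axis=-1)
    on = d2.min(axis=1) < threshold ** 2
    return on.astype(float).reshape(res, res)

def mask_loss(S_k, T, views, **kw):
    """Average over the viewpoints of the mean squared mask difference."""
    total = 0.0
    for R in views:
        total += np.mean((binary_mask(S_k @ R.T, **kw) - binary_mask(T @ R.T, **kw)) ** 2)
    return total / len(views)
```

Because the hard threshold has zero gradient almost everywhere, a soft (probabilistic) mask is normally substituted when gradients are required.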

[0138] The global mask loss function can be backpropagated in the following way. Let c_i be the value of pixel p_i on the mask; the gradient of c_i can be calculated using the following formula:

[0139]

[0140] Where one term is the gradient with respect to the z-coordinate, one parameter controls the sharpness of the mask, and the distance term is the squared distance between pixel p_i and the projection of a point onto the xOy plane. Because a soft rasterization strategy is used, the mask loss of a single pixel affects all points of the source point cloud data.

[0141] Edge length loss function: The edge length of the deformed point cloud data should be close to the edge length of the original point cloud data. Specifically, it can be defined as:

[0142]

[0143] Where ε is the edge set of the source point cloud data, constructed by KNN, and d_ij is the distance between the vertex pair (i, j) in the source point cloud data.
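By way of non-limiting illustration, an edge length loss over a KNN edge set can be sketched as follows. This is a minimal sketch: the squared penalty, the averaging over edges, and the names `knn_edges` and `edge_length_loss` are assumptions introduced here, since the exact formula is not reproduced in the text.

```python
import numpy as np

def knn_edges(S, k=4):
    """Edge set built by KNN on the source points: (M * k, 2) directed edges (i, j)."""
    d2 = np.sum((S[:, None, :] - S[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)                 # exclude self-edges
    nbrs = np.argsort(d2, axis=1)[:, :k]
    i = np.repeat(np.arange(S.shape[0]), k)
    return np.stack([i, nbrs.ravel()], axis=1)

def edge_length_loss(S_k, S, edges):
    """Penalise changes of edge length between deformed (S_k) and source (S) points."""
    d_src = np.linalg.norm(S[edges[:, 0]] - S[edges[:, 1]], axis=1)      # d_ij
    d_def = np.linalg.norm(S_k[edges[:, 0]] - S_k[edges[:, 1]], axis=1)
    return np.mean((d_def - d_src) ** 2)
```

The loss is zero for any rigid motion of the point cloud and grows when edges are stretched or compressed, which is why it acts as a regularizer on the deformation rather than as a data term.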

[0144] Regularization term: The rigid transformation Ψk includes the rotation matrix Rk and the translation vector tk, constraining the transformation vector as follows:

[0145]

[0146] Because the non-rigid joints of the human body are highly articulated, a sparsity term is added on the skinning weights:

[0147]

[0148] Where the weight symbol is the point-wise weight of the k-th rigid transformation in the k-th iteration;

[0149] At stage k, the total loss function is:

[0150]
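By way of non-limiting illustration, the way the individual terms and their trade-off parameters β1 to β4 are defined suggests a weighted sum of the following shape. This is a sketch under stated assumptions: the additive form is inferred from the symbol definitions rather than taken from the formula itself, and the function names, the dictionary keys, the default values, and the stage weighting with `gamma` are all hypothetical.

```python
def stage_loss(terms, beta1=1.0, beta2=1.0, beta3=1.0, beta4=1.0):
    """Assumed form: L_k = L_depth + b1*L_mask + b2*L_arap + b3*L_tran + b4*L_sparse.
    `terms` maps the names depth, mask, arap, tran, sparse to scalar loss values."""
    return (terms["depth"]
            + beta1 * terms["mask"]
            + beta2 * terms["arap"]
            + beta3 * terms["tran"]
            + beta4 * terms["sparse"])

def total_loss(stage_losses, gamma=0.8):
    """Hypothetical combination over stages: with gamma < 1 the weight
    gamma ** (K - k) grows exponentially toward the later stages."""
    K = len(stage_losses)
    return sum(gamma ** (K - k) * L_k for k, L_k in enumerate(stage_losses, start=1))
```

For example, with three stages and `gamma=0.5` the stage weights are 0.25, 0.5 and 1.0, so the last stage dominates the total.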

[0151] The final loss function is a combination of all stages.

[0152] Where the coefficient is a weight that grows exponentially toward the later stages. In the above embodiments of the present invention, complex non-rigid problems are solved by decomposing the non-rigid transformation into a point-wise combination of several rigid transformations. Specifically, the non-rigid transformation is represented as a point-wise combination of K rigid transformations, where K is much smaller than the number of surface vertices.

[0153] This approach not only approximates any non-rigid transformation but also provides a good constraint on the solution space. A GRU recurrent neural network framework is used to iteratively solve for the combined weights and each rigid transformation.
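By way of non-limiting illustration, a point-wise combination of K rigid transformations can be sketched as follows. This is a minimal sketch in the style of linear blend skinning: the function name `blend_rigid`, the array layout, and the assumption that the weights of each point sum to one are introduced here and are not stated by the invention.

```python
import numpy as np

def blend_rigid(S, R, t, w):
    """Point-wise combination of K rigid transformations (assumed form).
    S: (M, 3) source points;  R: (K, 3, 3) rotations;  t: (K, 3) translations;
    w: (M, K) per-point weights, each row assumed to sum to 1.
    Returns the (M, 3) points  sum_k w[m, k] * (R[k] @ S[m] + t[k])."""
    moved = np.einsum("kij,mj->mki", R, S) + t[None, :, :]   # (M, K, 3)
    return np.einsum("mk,mki->mi", w, moved)
```

The corresponding deformation field is `blend_rigid(S, R, t, w) - S`. Because only K rotations, K translations and an M×K weight matrix are estimated, the solution space is far smaller than that of a free per-point displacement.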

[0154] In each iteration, the LDGCNN network and Transformer are used for local feature extraction. Experiments have shown that the algorithm proposed in this application is more sensitive to local feature information of point cloud data than previous methods, and can better solve the problem of large deformation registration at joints.

[0155] To handle occlusion of human limbs, this application further proposes a deep learning network registration method guided by probabilistic registration.

[0156] Probabilistic registration preserves local structural features while also considering the global topological information of the human body. It can be used to guide deep learning networks that are sensitive to local features. It can take into account both global topology and local features, thus improving the registration effect of joint hinges and solving the problem of limb occlusion.

[0157] The registration algorithm in this application first inputs the source point cloud data and the standard point cloud data into a deep learning framework to obtain a deformation field. The deep learning framework can extract the boundaries and sharp features of the human body in the point cloud, thereby improving the registration effect of joint hinges. However, its registration effect is limited for occluded limb parts with indistinct boundaries. Therefore, probabilistic registration is further introduced to guide the deep learning framework. The probabilistic registration module preserves the global topological structure of the human body point cloud data, while the deep learning framework is sensitive to local information of the human body. Fusing the deformation field obtained by deep learning with the deformation field obtained by probabilistic registration maintains the local adjacency structure and the global topology at the same time, thereby improving the registration effect for high hinges and joint occlusion.

[0158] As shown in Figure 4, embodiments of the present invention also provide a human point cloud data processing device 40, the device 40 comprising:

[0159] The acquisition module 41 is used to acquire source point cloud data and standard point cloud data of the human body;

[0160] The processing module 42 is used to obtain a first deformation field and a second deformation field based on the source point cloud data and the standard point cloud data, respectively; the first deformation field is obtained through a deep linked dynamic network; the second deformation field is obtained by processing the source point cloud data and the standard point cloud data through the probabilistic registration model; and the target point cloud data of the human body is determined based on the first deformation field and the second deformation field.

[0161] Optionally, based on the source point cloud data and the standard point cloud data, the first deformation field is obtained, including:

[0162] The first feature result is obtained by performing local feature extraction processing on the source point cloud data;

[0163] The second feature result is obtained by performing local feature extraction processing on the standard point cloud data;

[0164] Based on the first feature result and the second feature result, the correlation result between the source point cloud data and the standard point cloud data is determined;

[0165] Based on the correlation results, the first deformation field is determined.

[0166] Optionally, determining the first deformation field based on the correlation result includes:

[0167] The correlation result, the hidden state value output by the previous node, and the geometric feature values extracted from the source point cloud data are input into the gated recurrent unit (GRU) model for processing to obtain the output value;

[0168] The first deformation field is determined based on the output value.
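By way of non-limiting illustration, one update of a gated recurrent unit that takes the correlation result and the geometric features as input and the previous hidden state as memory can be sketched as follows. This is a minimal sketch of a standard GRU cell without bias terms; the function name `gru_step`, the parameter layout, and the idea of concatenating the correlation result with the geometric features into a single input vector `x` are assumptions introduced here.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x, h, params):
    """One standard GRU update (biases omitted).
    x: input vector, e.g. correlation result concatenated with geometric features.
    h: hidden state output by the previous node.
    params: (Wz, Uz, Wr, Ur, Wh, Uh), input-to-hidden and hidden-to-hidden matrices."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    z = sigmoid(x @ Wz + h @ Uz)                 # update gate
    r = sigmoid(x @ Wr + h @ Ur)                 # reset gate
    h_cand = np.tanh(x @ Wh + (r * h) @ Uh)      # candidate hidden state
    return (1.0 - z) * h + z * h_cand            # new hidden state (output value)
```

In an iterative registration loop the returned hidden state would be decoded into the rigid transformations and weights of the current stage and passed on as `h` to the next stage.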

[0169] Optionally, the probability registration model is obtained through the following process:

[0170] Determine the joint probability density function based on preset conditions;

[0171] The values of the latent parameters of the joint probability density function are determined using a target clustering algorithm;

[0172] The probability registration model is determined based on the values of the latent parameters.

[0173] Optionally, determining the probability registration model based on the values of the latent parameters includes:

[0174] Based on the value of the latent parameter, the posterior probability is obtained;

[0175] The probability registration model is determined based on the posterior probability and the preset constraints.

[0176] Optionally, the source point cloud data and the standard point cloud data are processed using the probability registration model to obtain a second deformation field, including:

[0177] The source point cloud data and the standard point cloud data are processed through the objective function of the probabilistic registration model to obtain the second deformation field;

[0178] Where the left-hand side is the second deformation field; p^old(m|t_n) is the first posterior probability; M is the number of points in the source point cloud data; t_n is the n-th point in the standard point cloud data; s_m is the m-th point in the source point cloud data; G(m,·) is the m-th row of the Gaussian kernel matrix; W is the weight matrix of the Gaussian kernel; σ² is the isotropic variance; N_p·D is the second posterior probability; ln σ² is the logarithm of the isotropic variance; the coefficient of E_global(W) is the trade-off parameter of the global constraint; E_global(W) is the global constraint; λ is the trade-off parameter of the local constraint; and E_local(W) is the local constraint.

[0179] Optionally, based on the first deformation field and the second deformation field, the target point cloud data of the human body is determined, including:

[0180] The first deformation field and the second deformation field are fused to obtain the third deformation field;

[0181] Based on the third deformation field, the target point cloud data of the human body is determined.

[0182] Optionally, the processing module 42 can also be used for:

[0183] The similarity between the target point cloud data and the standard point cloud data is determined by a loss function.

[0184] Optionally, the similarity between the target point cloud data and the standard point cloud data is determined using a loss function, including:

[0185] The similarity between the target point cloud data and the standard point cloud data is determined through the stage loss function L_k;

[0186] Where L_k is the loss function; L_depth is the depth map loss function; β1 is the trade-off parameter of the mask loss function; L_mask is the binary mask loss function; β2 is the trade-off parameter of the edge length loss function; L_arap is the edge length loss function; β3 is the trade-off parameter of the rigid transformation vector loss function; L_tran is the rigid transformation vector loss function; β4 is the trade-off parameter of the skinning weight sparsity term; L_sparse is the skinning weight sparsity loss function; S_k is the target point cloud data obtained in the k-th iteration; T is the standard point cloud data; t_k is the translation vector in the k-th iteration; and the remaining weight symbol is the point-wise weight of the k-th rigid transformation in the k-th iteration.

[0187] It should be noted that this device corresponds to the method described above. All implementations in the above method embodiments are applicable to the embodiments of this device and can achieve the same technical effect.

[0188] Embodiments of the present invention also provide a computing device, including: a processor and a memory storing a computer program, wherein the computer program, when executed by the processor, performs the method described above. All implementations in the above method embodiments are applicable to this embodiment and can achieve the same technical effects.

[0189] Embodiments of the present invention also provide a computer-readable storage medium including instructions that, when executed on a computer, cause the computer to perform the method described above. All implementations in the above method embodiments are applicable to this embodiment and can achieve the same technical effects.

[0190] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this invention.

[0191] Those skilled in the art will understand that, for the sake of convenience and brevity, the specific working processes of the systems, devices, and units described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be repeated here.

[0192] In the embodiments provided by this invention, it should be understood that the disclosed apparatus and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. For instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between devices or units may be electrical, mechanical, or other forms.

[0193] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0194] In addition, the functional units in the various embodiments of the present invention can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.

[0195] If the aforementioned functions are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this invention, essentially, or the part that contributes to the prior art, or a portion of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, ROM, RAM, magnetic disks, or optical disks.

[0196] Furthermore, it should be noted that in the apparatus and method of the present invention, it is obvious that the components or steps can be decomposed and / or recombined. These decompositions and / or recombinations should be considered equivalent solutions of the present invention. Moreover, the steps performing the above-described series of processes can naturally be executed in the order described, but are not necessarily required to be executed in chronological order; some steps can be executed in parallel or independently of each other. Those skilled in the art will understand that all or any step or component of the method and apparatus of the present invention can be implemented in any computing device (including processors, storage media, etc.) or network of computing devices, in hardware, firmware, software, or a combination thereof. This is something that those skilled in the art can achieve by using their basic programming skills after reading the description of the present invention.

[0197] Therefore, the object of the present invention can also be achieved by running a program or a set of programs on any computing device. The computing device can be a known general-purpose device. Therefore, the object of the present invention can also be achieved simply by providing a program product containing program code implementing the method or apparatus. That is, such a program product also constitutes the present invention, and the storage medium storing such a program product also constitutes the present invention. Obviously, the storage medium can be any known storage medium or any storage medium developed in the future.

[0198] The above description represents the preferred embodiments of the present invention. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principles of the present invention, and these improvements and modifications should also be considered within the scope of protection of the present invention.

Claims

1. A method for processing human point cloud data, characterized in that the method includes:
acquiring source point cloud data and standard point cloud data of the human body;
obtaining a first deformation field and a second deformation field, respectively, based on the source point cloud data and the standard point cloud data, wherein the first deformation field is obtained through a deep linked dynamic network, and the second deformation field is obtained by processing the source point cloud data and the standard point cloud data through a probabilistic registration model;
determining target point cloud data of the human body based on the first deformation field and the second deformation field;
wherein obtaining the first deformation field based on the source point cloud data and the standard point cloud data includes: performing local feature extraction on the source point cloud data through the point cloud linked dynamic feature extraction network LDGCNN and the point cloud correspondence extraction network Transformer to obtain a first feature result; performing local feature extraction on the standard point cloud data through the point cloud linked dynamic feature extraction network LDGCNN and the point cloud correspondence extraction network Transformer to obtain a second feature result; determining a correlation result between the source point cloud data and the standard point cloud data based on the first feature result and the second feature result; and determining the first deformation field based on the correlation result;
wherein determining the first deformation field based on the correlation result includes: inputting the correlation result, the hidden state value output by the previous node, and the geometric feature values extracted from the source point cloud data into a gated recurrent unit (GRU) model for processing to obtain an output value; and determining the first deformation field based on the output value;
wherein processing the source point cloud data and the standard point cloud data through the probabilistic registration model to obtain the second deformation field includes: processing the source point cloud data and the standard point cloud data through the formula of the probabilistic registration model to obtain the second deformation field, where the left-hand side is the second deformation field, p^old(m|t_n) is the first posterior probability, M is the number of points in the source point cloud data, t_n is the n-th point in the standard point cloud data, s_m is the m-th point in the source point cloud data, G(m,·) is the m-th row of the Gaussian kernel matrix, W is the weight matrix of the Gaussian kernel, σ² is the isotropic variance, N_p·D is the second posterior probability, ln σ² is the logarithm of the isotropic variance, the coefficient of E_global(W) is the trade-off parameter of the global constraint, E_global(W) is the global constraint, λ is the trade-off parameter of the local constraint, and E_local(W) is the local constraint;
wherein determining the target point cloud data of the human body based on the first deformation field and the second deformation field includes: fusing the first deformation field and the second deformation field to obtain a third deformation field; and determining the target point cloud data of the human body based on the third deformation field;
wherein the third deformation field is represented by a point-wise combination of K rigid transformations and is expressed by a formula in which the left-hand side is the total deformation field, i.e., the third deformation field, S is the source point cloud data, the weighting term is the skinning weight of the k-th rigid transformation, the transformed term is the deformation field of the k-th rigid transformation, K is the number of rigid transformations into which the non-rigid transformation is decomposed, and r denotes the r-th rigid transformation;
wherein the first deformation field and the second deformation field are fused by a formula in which the left-hand side is the third deformation field, the two fields being fused are the first deformation field and the second deformation field, warp is the warp function, S is the source point cloud data, and x is the variable.

2. The method for processing human point cloud data according to claim 1, characterized in that the probability registration model is obtained through the following process: determining a joint probability density function based on preset conditions; determining the values of the latent parameters of the joint probability density function using a target clustering algorithm; and determining the probability registration model based on the values of the latent parameters.

3. The method for processing human point cloud data according to claim 2, characterized in that determining the probability registration model based on the values of the latent parameters includes: obtaining a posterior probability based on the values of the latent parameters; and determining the probability registration model based on the posterior probability and preset constraints.

4. The method for processing human point cloud data according to claim 1, characterized in that the method further includes: determining the similarity between the target point cloud data and the standard point cloud data using a loss function.

5. The method for processing human point cloud data according to claim 4, characterized in that determining the similarity between the target point cloud data and the standard point cloud data using a loss function includes: determining the similarity between the target point cloud data and the standard point cloud data through the stage loss function L_k, where L_k is the loss function, L_depth is the depth map loss function, β1 is the trade-off parameter of the mask loss function, L_mask is the binary mask loss function, β2 is the trade-off parameter of the edge length loss function, L_arap is the edge length loss function, β3 is the trade-off parameter of the rigid transformation vector loss function, L_tran is the rigid transformation vector loss function, β4 is the trade-off parameter of the skinning weight sparsity term, L_sparse is the skinning weight sparsity loss function, S_k is the target point cloud data obtained in the k-th iteration, T is the standard point cloud data, t_k is the translation vector in the k-th iteration, and the remaining weight symbol is the point-wise weight of the k-th rigid transformation in the k-th iteration.

6. A device for processing human point cloud data, characterized in that the device includes:
an acquisition module, used to acquire source point cloud data and standard point cloud data of the human body;
a processing module, used to obtain a first deformation field and a second deformation field, respectively, based on the source point cloud data and the standard point cloud data, wherein the first deformation field is obtained through a deep linked dynamic network, and the second deformation field is obtained by processing the source point cloud data and the standard point cloud data through a probabilistic registration model; and to determine target point cloud data of the human body based on the first deformation field and the second deformation field;
wherein obtaining the first deformation field based on the source point cloud data and the standard point cloud data includes: performing local feature extraction on the source point cloud data through the point cloud linked dynamic feature extraction network LDGCNN and the point cloud correspondence extraction network Transformer to obtain a first feature result; performing local feature extraction on the standard point cloud data through the point cloud linked dynamic feature extraction network LDGCNN and the point cloud correspondence extraction network Transformer to obtain a second feature result; determining a correlation result between the source point cloud data and the standard point cloud data based on the first feature result and the second feature result; and determining the first deformation field based on the correlation result;
wherein determining the first deformation field based on the correlation result includes: inputting the correlation result, the hidden state value output by the previous node, and the geometric feature values extracted from the source point cloud data into a gated recurrent unit (GRU) model for processing to obtain an output value; and determining the first deformation field based on the output value;
wherein processing the source point cloud data and the standard point cloud data through the probabilistic registration model to obtain the second deformation field includes: processing the source point cloud data and the standard point cloud data through the formula of the probabilistic registration model to obtain the second deformation field, where the left-hand side is the second deformation field, p^old(m|t_n) is the first posterior probability, M is the number of points in the source point cloud data, t_n is the n-th point in the standard point cloud data, s_m is the m-th point in the source point cloud data, G(m,·) is the m-th row of the Gaussian kernel matrix, W is the weight matrix of the Gaussian kernel, σ² is the isotropic variance, N_p·D is the second posterior probability, ln σ² is the logarithm of the isotropic variance, the coefficient of E_global(W) is the trade-off parameter of the global constraint, E_global(W) is the global constraint, λ is the trade-off parameter of the local constraint, and E_local(W) is the local constraint;
wherein determining the target point cloud data of the human body based on the first deformation field and the second deformation field includes: fusing the first deformation field and the second deformation field to obtain a third deformation field; and determining the target point cloud data of the human body based on the third deformation field;
wherein the third deformation field is represented by a point-wise combination of K rigid transformations and is expressed by a formula in which the left-hand side is the total deformation field, i.e., the third deformation field, S is the source point cloud data, the weighting term is the skinning weight of the k-th rigid transformation, the transformed term is the deformation field of the k-th rigid transformation, K is the number of rigid transformations into which the non-rigid transformation is decomposed, and r denotes the r-th rigid transformation;
wherein the first deformation field and the second deformation field are fused by a formula in which the left-hand side is the third deformation field, the two fields being fused are the first deformation field and the second deformation field, warp is the warp function, S is the source point cloud data, and x is the variable.