A point cloud segmentation transfer learning method based on kernel attention convolution

By employing a point cloud segmentation transfer learning method based on kernel attention convolution, and utilizing the KC and P-GAT modules to construct a point cloud segmentation network, the problem of poor adaptability of tetrapod point cloud data segmentation models was solved, achieving efficient point cloud data segmentation and saving manpower and time.

CN116363153B · Active · Publication Date: 2026-06-26 · SOUTH CHINA AGRICULTURAL UNIVERSITY

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: SOUTH CHINA AGRICULTURAL UNIVERSITY
Filing Date: 2023-03-28
Publication Date: 2026-06-26

AI Technical Summary

Technical Problem

In existing technologies, tetrapod point cloud segmentation models are poorly adapted to the task of segmenting parts of point cloud data of tetrapods with similar body shapes. They require extensive annotation work on point cloud data of specific animals to retrain the network model, resulting in a waste of manpower and time.

Method used

A point cloud segmentation transfer learning method based on kernel attention convolution is adopted. A point cloud data segmentation network model is constructed from a purpose-designed KC module and P-GAT module, which extract local neighborhood features of the point cloud. The network combines downsampling and upsampling to segment the point cloud. Transfer learning is then used to apply the model trained on labeled point cloud data to unlabeled point cloud data.

Benefits of technology

It improves the robustness of the point cloud data segmentation network model, reduces the annotation required before unlabeled point cloud data can be segmented, saves considerable manpower and time, and is suitable for segmentation tasks on four-limbed animals with similar body shapes.



Abstract

The application discloses a point cloud segmentation transfer learning method based on kernel attention convolution, as follows: obtain three-dimensional point cloud data of two different categories of livestock; design a KC module and a P-GAT module for extracting local neighborhood features of the point cloud, and construct a point cloud data segmentation network model based on the KC module and the P-GAT module; input the manually labeled three-dimensional point cloud data of the first category of livestock into the point cloud data segmentation network model for training; transfer the trained point cloud data segmentation network model, using a transfer learning method, to the three-dimensional point cloud data of the second category of livestock that has not been manually labeled, to achieve part segmentation of that data. The application uses the trained point cloud data segmentation network model to segment point cloud data of other categories that have not been manually labeled, which improves the robustness of the model and saves considerable manpower and time.

Description

Technical Field

[0001] This invention relates to the field of computer image processing technology, and more specifically, to a point cloud segmentation transfer learning method based on kernel attention convolution.

Background Technology

[0002] In precision cattle breeding, body size data is of significant reference value, making accurate data collection crucial. Manually measuring a cow's body size requires measuring tapes and measuring rods to measure each part individually, often taking 3-5 minutes per cow. This method suffers from low accuracy and high labor intensity, and is prone to causing stress in the cattle. Therefore, acquiring three-dimensional point cloud data of the cattle body for body size measurement using non-contact methods has become a hot topic in livestock breeding and farming research. Body size measurement from three-dimensional point clouds typically operates on the entire point cloud, which has two drawbacks: first, the overall point cloud is very large; second, the result is easily affected by the posture and position of body parts during movement, leading to errors in locating key measurement points. If the cattle body point cloud is segmented into different parts (the head, limbs, torso, and tail), this not only helps in locating key points for body size measurement but also allows the calculation of volumes and other phenotypic data that are difficult to obtain manually.

[0003] Deep learning has made groundbreaking progress in many fields, including point cloud segmentation. The overall trend in its development is from two-dimensional to three-dimensional, and from data transformation to direct processing. Deep learning-based point cloud segmentation methods can be mainly divided into projection- and multi-view-based methods, voxel-based methods, and methods that directly apply to the point cloud. Methods that directly apply to the point cloud take irregular raw point cloud data as input and directly extract features from the raw point cloud. This method has a simple network structure and good segmentation results, and is currently the mainstream direction for point cloud segmentation.

[0004] For pig point cloud segmentation networks, some literature has proposed using PointNet++. However, building deep point cloud segmentation models like PointNet++ requires training on a large amount of manually segmented datasets. Different types of livestock, such as pigs, cattle, and sheep, cannot be segmented directly using other livestock segmentation models. Even among cattle, different breeds have significant morphological differences and cannot be segmented using the same model. This makes point cloud segmentation networks trained on a single breed unable to adapt well to different types of livestock. It necessitates retraining the network on large amounts of labeled segmented data for different livestock species, easily resulting in a significant waste of manpower and time.

Summary of the Invention

[0005] To address the problem that existing point cloud segmentation models for tetrapods are poorly adapted to component segmentation tasks using point cloud data of tetrapods with similar body shapes, requiring extensive annotation work on specific animal point cloud data for retraining, this invention provides a point cloud segmentation transfer learning method based on kernel attention convolution.

[0006] To achieve the above-mentioned objectives of this invention, the technical solution adopted is as follows:

[0007] A point cloud segmentation transfer learning method based on kernel attention convolution, comprising the following steps:

[0008] Acquire 3D point cloud data of two different types of livestock;

[0009] Design KC and P-GAT modules for extracting local neighborhood features from point clouds, and construct a point cloud data segmentation network model based on the KC and P-GAT modules. The point cloud segmentation network model includes a downsampling part and an upsampling part. First, the downsampling part extracts features from and downsamples the livestock 3D point cloud data, generating a set of center points with high-dimensional features. Then, the upsampling part propagates the high-dimensional features of the center points back to the original livestock 3D point cloud, finally achieving part segmentation of the original livestock 3D point cloud.

[0010] The three-dimensional point cloud data of the first category of livestock, which has been manually labeled, is input into the point cloud data segmentation network model for training.

[0011] The trained point cloud data segmentation network model is transferred, using the transfer learning method, to the 3D point cloud data of the second category of livestock that has not yet been manually labeled, achieving part segmentation of the second category of livestock 3D point cloud data.

[0012] Preferably, the KC module comprises the following steps:

[0013] D1: Obtain the local neighborhood N′ of the point cloud centered on point x_i;

[0014] D2: Initialize a set of M learnable points K as kernel points;

[0015] D3: Select the distance kernel function K_l(k, δ) to represent the relationship between the kernel points K and the local neighborhood N′;

[0016] D4: Substitute the distance kernel function to calculate the kernel correlation KC between the kernel points K and the local neighborhood N′.

[0017] Furthermore, the specific formula for the distance kernel function K_l(k, δ) is as follows:

[0018]

[0019] In the formula, max() represents the maximum value function; ||k-δ|| represents the Euclidean distance between k and δ; k and δ are the three-dimensional coordinates of two points in any three-dimensional space; σ is the kernel width that controls the influence of the distance between the two points.
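One form consistent with the symbols defined in [0019] is the linear (triangular) kernel sketched below. Only max(), ||k-δ||, and σ are named in the text; the constant 0 inside max() and the normalization of the distance by σ are assumptions.

```latex
K_l(k,\delta) = \max\!\left(0,\; 1 - \frac{\lVert k-\delta \rVert}{\sigma}\right)
```

Under this assumed form, the kernel equals 1 when the two points coincide and falls linearly to 0 once their distance reaches σ.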

[0020] Furthermore, the specific calculation process of the kernel correlation KC is as follows:

[0021]

[0022] In the formula, k_m represents the m-th learnable point in the kernel points K; N(i) represents the neighborhood index set of x_i; x_n represents one of the neighborhood points of x_i; x_n - x_i represents the local relationship between the neighborhood point and the center point x_i.
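Using the quantities defined in [0022], the kernel correlation can be sketched as a sum of kernel responses over every pairing of a kernel point with a neighborhood offset. The normalization by |N(i)| is an assumption taken from common kernel correlation formulations; the text itself names only k_m, N(i), x_n, and x_n - x_i.

```latex
\mathrm{KC}(K, x_i) = \frac{1}{\lvert N(i) \rvert} \sum_{m=1}^{M} \sum_{n \in N(i)} K_l\!\left(k_m,\; x_n - x_i\right)
```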

[0023] Preferably, the P-GAT module comprises the following steps:

[0024] N1: Obtain the local neighborhood of the point cloud centered at x_i, whose high-dimensional features are represented as h = {h_1, h_2, ..., h_N}, h_j ∈ R^F, where h_N represents the feature of the N-th point in the neighborhood and R^F represents the F-dimensional feature space;

[0025] N2: Construct a graph structure G for the local neighborhood points h of the point cloud, where each neighborhood point is a vertex in graph G, and the relationship between each neighborhood point and the center point x_i is represented by an edge in graph G;

[0026] N3: Introduce the graph attention mechanism GAT and modify it according to the point cloud data structure to compute the attention coefficient α_ij of the neighborhood point x_j with respect to the center point x_i;

[0027] N4: Use the attention coefficient α_ij to compute the new feature h′_i of the center point x_i within the local neighborhood of the point cloud.

[0028] Furthermore, the calculation formula for the attention coefficient α_ij is as follows:

[0029]

[0030] In the formula, N_i is the set of neighborhood point indices; j is the index of a given neighborhood point; e_ij represents the attention value between the two points x_j and x_i; softmax() represents the normalization function;

[0031] Here, the expression for e_ij is as follows:

[0032]

[0033] In the formula, (h_j - h_i) represents the edge relationship between a neighboring vertex and the center vertex in the graph structure G; || represents the concatenation operation; W ∈ R^(F×F′) represents the weight matrix; a represents a weight vector; LeakyReLU() represents a nonlinear function; T indicates transpose.
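Read together, the definitions in [0030] and [0033] suggest the pair of expressions below. The softmax normalization over the neighborhood follows directly from the text; the symbol a for the weight vector and the order of the two concatenated terms are assumptions.

```latex
\alpha_{ij} = \operatorname{softmax}_j(e_{ij}) = \frac{\exp(e_{ij})}{\sum_{k \in N_i} \exp(e_{ik})},
\qquad
e_{ij} = \operatorname{LeakyReLU}\!\left(a^{T}\left[\, W h_i \,\Vert\, W (h_j - h_i) \,\right]\right)
```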

[0034] Furthermore, the calculation formula for the new feature h′_i of the center point x_i within the local neighborhood of the point cloud is as follows:

[0035]

[0036] In the formula, ρ is a nonlinear function.
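A plausible form of this update, assuming it follows the standard graph attention aggregation, is an attention-weighted sum of linearly mapped neighborhood features passed through ρ. Whether the aggregated term is W h_j or the mapped edge W (h_j - h_i) is not stated in the text, so the choice below is an assumption.

```latex
h'_i = \rho\!\left( \sum_{j \in N_i} \alpha_{ij}\, W h_j \right)
```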

[0037] Furthermore, the specific steps of the downsampling section are as follows:

[0038] E1: Preset the depth of the downsampling layer;

[0039] E2: Use the farthest point sampling algorithm to sample a set of center points from the point cloud data;

[0040] E3: Using the center point of the sampling as the center point of the local neighborhood of the point cloud, the point cloud data is divided into local neighborhoods;

[0041] E4: Extract features from each local neighborhood of the divided point cloud, aggregate the extracted features to the center point, realize downsampling of the point cloud, and output a new set of point clouds;

[0042] E5: Repeat steps E1 to E3 for the output set of new point clouds until the preset downsampling layer depth is reached, and finally output the final downsampled set of point clouds.

[0043] Furthermore, feature extraction on each divided local neighborhood of the point cloud proceeds as follows:

[0044] F1: The input is an N×F_1 matrix, where N represents the number of points in each local neighborhood and F_1 represents the feature dimension of each point after feature extraction by the previous layer;

[0045] F2: First, the features are mapped to a high-dimensional space using an MLP (Multilayer Perceptron);

[0046] F3: The KC module is used to calculate the kernel correlation between the kernel points and the local neighborhood points obtained in F2, thus completing the first feature extraction of the local neighborhood of the point cloud;

[0047] F4: The local neighborhood of the point cloud obtained in F3 is further input into three MLPs with different layers, and the outputs of the three MLPs are used to construct graphs over the point cloud, yielding three graph structures G1, G2, and G3;

[0048] F5: The P-GAT module is applied to the graph structures G1, G2, and G3 respectively, generating three attention coefficients. A multi-head attention mechanism then aggregates the three attention coefficients and outputs the F_2-dimensional feature h′_i. The final output is a 1×F_2 matrix, i.e., the high-dimensional feature of the center point.

[0049] Furthermore, the multi-head attention mechanism aggregates the three attention coefficients using the following formula:

[0050]

[0051] In the formula, K takes the value 3; α_ij^k is the k-th attention coefficient; W^k is the weight matrix of the corresponding input linear transformation; and the final output, the center point feature h′_i, is formed by averaging the K attention features learned from the neighborhood points of the center point.
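The averaging described in [0051] can be sketched as follows, assuming the usual multi-head graph attention form in which each head k contributes its own attention-weighted sum before the average is taken. The placement of the nonlinear function ρ outside the average is an assumption.

```latex
h'_i = \rho\!\left( \frac{1}{K} \sum_{k=1}^{K} \sum_{j \in N_i} \alpha_{ij}^{k}\, W^{k} h_j \right), \qquad K = 3
```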

[0052] Furthermore, the upsampling process includes the following specific steps:

[0053] First, save the results generated by each intermediate layer during the downsampling process;

[0054] Then, the high-dimensional features extracted by downsampling are interpolated to the high-dimensional features of the previous layer points, and so on, until the original livestock 3D point cloud data is returned.

[0055] Finally, the interpolated 3D point cloud of livestock is passed through a fully connected layer and a softmax layer to score each point, thereby classifying each point, i.e., point cloud segmentation.

[0056] The beneficial effects of this invention are as follows:

[0057] This invention utilizes a point cloud segmentation network model with downsampling and upsampling components to downsample massive point cloud data layer by layer, enabling the network model to handle larger-scale point cloud datasets. By employing the KC and P-GAT modules to fully extract local neighborhood features of the point cloud, it addresses the issue of uneven point cloud density in real-world data acquisition. Furthermore, through transfer learning, the network model is trained using labeled point cloud data, and then used to segment other categories of unlabeled point cloud data, improving its robustness and saving significant manpower and time.

Attached Figure Description

[0058] Figure 1 is a flowchart of the point cloud segmentation transfer learning method described in Example 1.

[0059] Figure 2 is the overall flowchart of the point cloud data segmentation network model described in Example 1.

[0060] Figure 3 shows some of the segmentation results obtained in Example 1 when unlabeled bovine body point cloud data is input into the point cloud data segmentation network model for testing.

[0061] Figure 4 shows Simmental bull point cloud data.

[0062] Figure 5 shows water buffalo point cloud data.

Detailed Implementation

[0063] The present invention will now be described in detail with reference to the accompanying drawings and specific embodiments.

[0064] Example 1

[0065] Most existing public datasets for point cloud data segmentation are synthetic and differ from real point cloud datasets, especially in point cloud density. Furthermore, the data volume of a single point cloud from livestock such as pigs and cattle is far greater than that in public datasets, and the amount of manual data annotation required in the early stages is enormous. To address these issues, this embodiment proposes a point cloud segmentation transfer learning method based on kernel attention convolution. As shown in Figure 1, the method includes the following steps:

[0066] Step S1: acquire 3D point cloud data of two different types of livestock;

[0067] Design KC and P-GAT modules for extracting local neighborhood features from point clouds, and construct a point cloud data segmentation network model based on the KC and P-GAT modules. The point cloud segmentation network model includes a downsampling part and an upsampling part. First, the downsampling part extracts features from and downsamples the livestock 3D point cloud data, generating a set of center points with high-dimensional features. Then, the upsampling part propagates the high-dimensional features of the center points back to the original livestock 3D point cloud, thus achieving part segmentation of the original livestock 3D point cloud.

[0068] The three-dimensional point cloud data of the first category of livestock, which has been manually labeled, is input into the point cloud data segmentation network model for training.

[0069] The trained point cloud data segmentation network model is transferred, using the transfer learning method, to the 3D point cloud data of the second category of livestock that has not yet been manually labeled, achieving part segmentation of the second category of livestock 3D point cloud data.

[0070] In this embodiment, manually labeled 3D point cloud data of pigs is input into the point cloud data segmentation network model for training. Transfer learning is then used to transfer the trained model to unlabeled 3D point cloud data of cattle, achieving part segmentation of the cattle 3D point cloud data.

[0071] This invention utilizes a point cloud segmentation network model with downsampling and upsampling components to downsample massive point cloud data layer by layer, enabling the network model to handle larger-scale point cloud datasets. By employing the KC and P-GAT modules to fully extract local neighborhood features of the point cloud, it addresses the issue of uneven point cloud density in real-world data acquisition. Furthermore, through transfer learning, the network model is trained using labeled point cloud data, and then used to segment other categories of unlabeled point cloud data, improving its robustness and saving significant manpower and time.

[0072] Therefore, based on the transfer learning method, this embodiment trains the point cloud data segmentation network model with a very small number of segmentation samples and applies it to the segmentation of bovine body point clouds. Furthermore, the point cloud data segmentation network model of this invention can be extended to the segmentation of other four-limbed animals with similar body shapes.

[0073] In a specific embodiment, for step S1, depth cameras are used to capture images from the left, right, and top directions. Then, the point cloud data captured from the three directions is registered using a point cloud registration method to obtain the collected three-dimensional point cloud data of the livestock.

[0074] Example 2

[0075] Based on the point cloud segmentation transfer learning method based on kernel attention convolution described in Example 1, it is necessary to design and propose a KC module and a P-GAT module for extracting local neighborhood features of point clouds, and to construct a point cloud data segmentation network model based on the KC module and the P-GAT module. The English name of the KC module is Kernel Correlation; the P-GAT module is a graph attention mechanism applied to point clouds.

[0076] The specific steps of the KC module are as follows:

[0077] D1: Obtain the local neighborhood N′ of the point cloud centered on point x_i;

[0078] D2: Initialize a set of M learnable points K as kernel points;

[0079] D3: Select the distance kernel function K_l(k, δ) to represent the relationship between the kernel points K and the local neighborhood N′;

[0080] D4: Substitute the distance kernel function to calculate the kernel correlation KC between the kernel points K and the local neighborhood N′.

[0081] In this embodiment, the local neighborhood N′ of the point cloud is selected as follows: with x_i as the center, n points are selected within a spherical neighborhood of radius R to form the local neighborhood N′ of the point cloud centered at x_i. The values of the parameters R and n are chosen according to the specific scale of the point cloud data.
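The neighborhood selection in step D1 can be sketched as a ball query. This is a minimal illustration, not the patented implementation: the function name `ball_query` is hypothetical, and keeping the n nearest points inside the sphere is an assumption, since the text does not say how n points are chosen when more than n fall within radius R.

```python
import numpy as np

def ball_query(points, center, radius, n):
    """Return indices of up to n points within `radius` of `center`, nearest first.

    points: (P, 3) array of point coordinates
    center: (3,)   coordinates of the center point x_i
    """
    dists = np.linalg.norm(points - center, axis=1)
    inside = np.flatnonzero(dists <= radius)
    # Sort the in-sphere candidates by distance and keep the nearest n
    # (assumed selection rule; the source only names R and n).
    order = inside[np.argsort(dists[inside])]
    return order[:n]
```

If fewer than n points lie inside the sphere, this sketch simply returns the points that are available.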

[0082] In one specific embodiment, the specific formula for the distance kernel function K_l(k, δ) is as follows:

[0083]

[0084] In the formula, max() represents the maximum value function; ||k-δ|| represents the Euclidean distance between k and δ; k and δ are the three-dimensional coordinates of two points in any three-dimensional space; σ is the kernel width that controls the influence of the distance between the two points. The choice of kernel width matters, because a σ that is too large or too small leads to poor performance. Here, the kernel width can be chosen empirically as 5e-3.

[0085] In a specific embodiment, the specific calculation process of the kernel correlation KC is as follows:

[0086]

[0087] In the formula, k_m represents the m-th learnable point in the kernel points K; N(i) represents the neighborhood index set of x_i; x_n represents one of the neighborhood points of x_i; x_n - x_i represents the local relationship between the neighborhood point and the center point x_i.

[0088] The more similar k_m and x_n - x_i are, the higher the correlation. KC can therefore be interpreted as a similarity measure between two point sets. Furthermore, because the distance kernel function is differentiable with respect to k_m, the kernel points are learnable: gradients can be backpropagated to the kernel points, so that they adapt automatically to different local neighborhoods of the point cloud.
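The kernel correlation of steps D1 to D4 can be sketched in NumPy as below. This is an illustration under stated assumptions: the kernel is taken to be the linear form max(0, 1 - ||k-δ||/σ), and the sum is divided by the neighborhood size; neither detail is confirmed by the text. The function names are hypothetical.

```python
import numpy as np

def linear_kernel(k, delta, sigma):
    """Assumed distance kernel: max(0, 1 - ||k - delta|| / sigma)."""
    return np.maximum(0.0, 1.0 - np.linalg.norm(k - delta, axis=-1) / sigma)

def kernel_correlation(kernel_points, neighbors, center, sigma):
    """Kernel correlation between M kernel points and one local neighborhood.

    kernel_points: (M, 3) learnable points k_m
    neighbors:     (N, 3) neighborhood points x_n
    center:        (3,)   center point x_i
    """
    offsets = neighbors - center  # x_n - x_i, shape (N, 3)
    # Pairwise kernel responses of shape (M, N), computed by broadcasting.
    responses = linear_kernel(kernel_points[:, None, :], offsets[None, :, :], sigma)
    # Normalizing by the neighborhood size is an assumption.
    return responses.sum() / len(neighbors)
```

In a training framework with automatic differentiation, `kernel_points` would be a trainable parameter, which matches the statement that the kernel points are learned from gradients.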

[0089] In a specific embodiment, the P-GAT module comprises the following steps:

[0090] N1: Obtain the local neighborhood of the point cloud centered at x_i, whose high-dimensional features are represented as h = {h_1, h_2, ..., h_N}, h_j ∈ R^F, where h_N represents the feature of the N-th point in the neighborhood and R^F represents the F-dimensional feature space;

[0091] N2: Construct a graph structure G for the local neighborhood points h of the point cloud, where each neighborhood point is a vertex in graph G, and the relationship between each neighborhood point and the center point x_i is represented by an edge in graph G;

[0092] N3: Introduce the graph attention mechanism GAT and modify it according to the point cloud data structure to compute the attention coefficient α_ij of the neighborhood point x_j with respect to the center point x_i;

[0093] N4: Use the attention coefficient α_ij to compute the new feature h′_i of the center point x_i within the local neighborhood of the point cloud.

[0094] In this embodiment, the calculation formula for the attention coefficient α_ij is as follows:

[0095]

[0096] In the formula, N_i is the set of neighborhood point indices; j is the index of a given neighborhood point; e_ij represents the attention value between the two points x_j and x_i; softmax() represents the normalization function.

[0097] In this embodiment, when aggregating neighborhood information, the attention values between all neighborhood points and the center point need to be normalized using the softmax() function. The normalized attention weight α_ij serves as the attention coefficient.

[0098] Here, the expression for e_ij is as follows:

[0099]

[0100] In the formula, (h_j - h_i) represents the edge relationship between a neighboring vertex and the center vertex in the graph structure G; || represents the concatenation operation; W ∈ R^(F×F′) represents the weight matrix; a represents a weight vector; LeakyReLU() represents a nonlinear function; T indicates transpose.

[0101] The weight matrix W ∈ R^(F×F′) acts as a mapping function: when calculating the attention value, the high-dimensional feature representations of the center point x_i and of the edge (x_j - x_i) are each mapped by the weight matrix W, and the resulting vectors are concatenated into a 2F′-dimensional feature, which is then parameterized by the weight vector a and passed through the LeakyReLU() nonlinear activation to obtain e_ij.

[0102] In one specific embodiment, the calculation formula for the new feature h′_i of the center point x_i within the local neighborhood of the point cloud is as follows:

[0103]

[0104] In the formula, ρ is a nonlinear function.
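Steps N1 to N4 can be sketched for a single neighborhood as below. This is an illustrative reading of the description, not the patented implementation. Assumptions: the function names are hypothetical, W is applied as `h @ W`, the LeakyReLU slope is 0.2, ρ is taken to be tanh, and the aggregated term is the mapped neighbor feature.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    """LeakyReLU with an assumed negative slope of 0.2."""
    return np.where(x > 0, x, slope * x)

def p_gat_center_feature(h_center, h_neighbors, W, a, rho=np.tanh):
    """Attention coefficients and new center feature for one neighborhood.

    h_center:    (F,)     feature h_i of the center point
    h_neighbors: (N, F)   features h_j of the neighborhood points
    W:           (F, F2)  weight matrix, applied as h @ W
    a:           (2*F2,)  weight vector
    """
    center_proj = h_center @ W                   # (F2,)
    edge_proj = (h_neighbors - h_center) @ W     # (N, F2), mapped edges h_j - h_i
    # Concatenate [W h_i || W (h_j - h_i)] for every neighbor: (N, 2*F2)
    concat = np.concatenate(
        [np.broadcast_to(center_proj, edge_proj.shape), edge_proj], axis=1
    )
    e = leaky_relu(concat @ a)                   # attention values e_ij, shape (N,)
    e = e - e.max()                              # shift for numerical stability
    alpha = np.exp(e) / np.exp(e).sum()          # softmax over the neighborhood
    # Assumed aggregation: attention-weighted sum of mapped neighbor features.
    h_new = rho(alpha @ (h_neighbors @ W))       # (F2,)
    return alpha, h_new
```

The coefficients returned in `alpha` sum to 1 over the neighborhood, which is the normalization that the softmax() step provides.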

[0105] Example 3

[0106] Based on the point cloud segmentation transfer learning method based on kernel attention convolution described in Embodiment 1 or Embodiment 2, it is necessary to design a KC module and a P-GAT module for extracting local neighborhood features of point clouds, and to construct a point cloud data segmentation network model based on the KC module and the P-GAT module. The point cloud segmentation network model includes a downsampling part and an upsampling part; first, the downsampling part extracts features from and downsamples the livestock 3D point cloud data, generating a set of center points with high-dimensional features; then, the upsampling part propagates the high-dimensional features of the center points back to the original livestock 3D point cloud, finally achieving part segmentation of the original livestock 3D point cloud;

[0107] As shown in Figure 2, the specific steps of the downsampling part described in this embodiment are as follows:

[0108] E1: Preset the depth of the downsampling layer;

[0109] E2: Use the farthest point sampling algorithm to sample a set of center points from the point cloud data;

[0110] E3: Using the center point of the sampling as the center point of the local neighborhood of the point cloud, the point cloud data is divided into local neighborhoods; the specific sampling process is the same as step D1.

[0111] E4: Extract features from each local neighborhood of the divided point cloud, aggregate the extracted features to the center point, realize downsampling of the point cloud, and output a new set of point clouds;

[0112] E5: Repeat steps E1 to E3 for the output set of new point clouds until the preset downsampling layer depth is reached, and finally output the final downsampled set of point clouds.
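Step E2 names the farthest point sampling algorithm. A minimal NumPy sketch of that algorithm follows; the function name and the fixed `start_index` are illustrative assumptions (implementations often pick the first point at random), and `num_samples` is assumed not to exceed the number of points.

```python
import numpy as np

def farthest_point_sampling(points, num_samples, start_index=0):
    """Return indices of `num_samples` points chosen by farthest point sampling.

    points: (P, 3) array of point coordinates
    """
    num_points = points.shape[0]
    selected = np.empty(num_samples, dtype=np.int64)
    # Distance from every point to its nearest already-selected point.
    min_dist = np.full(num_points, np.inf)
    current = start_index
    for s in range(num_samples):
        selected[s] = current
        dist = np.linalg.norm(points - points[current], axis=1)
        min_dist = np.minimum(min_dist, dist)
        # The next center is the point farthest from all selected centers.
        current = int(np.argmax(min_dist))
    return selected
```

Each selected index can then serve as the center of a local neighborhood in step E3.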

[0113] In this embodiment, feature extraction on each divided local neighborhood of the point cloud proceeds as follows:

[0114] F1: The input is an N×F_1 matrix, where N represents the number of points in each local neighborhood and F_1 represents the feature dimension of each point after feature extraction by the previous layer;

[0115] F2: First, the features are mapped to a high-dimensional space using an MLP (Multilayer Perceptron);

[0116] F3: The KC module is used to calculate the kernel correlation between the kernel points and the local neighborhood points obtained in F2, thus completing the first feature extraction of the local neighborhood of the point cloud;

[0117] F4: The local neighborhood of the point cloud obtained in F3 is further input into three MLPs with different layers, and the outputs of the three MLPs are used to construct graphs over the point cloud, yielding three graph structures G1, G2, and G3;

[0118] F5: The P-GAT module is applied to the graph structures G1, G2, and G3 respectively, generating three attention coefficients. A multi-head attention mechanism then aggregates the three attention coefficients and outputs the F_2-dimensional feature h′_i. The final output is a 1×F_2 matrix, i.e., the high-dimensional feature of the center point.

[0119] In this embodiment, the multi-head attention mechanism aggregates the three attention coefficients using the following formula:

[0120]

[0121] In the formula, K takes the value 3; α_ij^k is the k-th attention coefficient; W^k is the weight matrix of the corresponding input linear transformation; and the final output, the center point feature h′_i, is formed by averaging the K attention features learned from the neighborhood points of the center point.

[0122] In this embodiment, the point cloud data segmentation network model is divided into a downsampling part and an upsampling part. The specific steps of the upsampling part are as follows:

[0123] First, save the results generated by each intermediate layer during the downsampling process;

[0124] Then, the high-dimensional features extracted by downsampling are interpolated to the high-dimensional features of the previous layer points, and so on, until the original livestock 3D point cloud data is returned.

[0125] Finally, the interpolated 3D point cloud of livestock is passed through a fully connected layer and a softmax layer to score each point, thereby classifying each point, i.e., point cloud segmentation.
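The upsampling steps can be sketched as feature interpolation followed by per-point classification. The text does not specify the interpolation scheme, so this sketch assumes inverse-distance weighting over the k nearest center points, a scheme commonly used in PointNet++-style decoders. All function and parameter names are hypothetical.

```python
import numpy as np

def interpolate_features(dense_xyz, sparse_xyz, sparse_feats, k=3, eps=1e-8):
    """Propagate features from sparse center points to a denser point set.

    dense_xyz:    (P, 3) points of the previous (denser) layer
    sparse_xyz:   (S, 3) downsampled center points
    sparse_feats: (S, C) features of the center points
    """
    k = min(k, sparse_xyz.shape[0])
    # Pairwise distances between dense and sparse points: (P, S)
    dists = np.linalg.norm(dense_xyz[:, None, :] - sparse_xyz[None, :, :], axis=-1)
    nearest = np.argsort(dists, axis=1)[:, :k]               # (P, k)
    nearest_d = np.take_along_axis(dists, nearest, axis=1)   # (P, k)
    # Inverse-distance weights, normalized per dense point (assumed scheme).
    weights = 1.0 / (nearest_d + eps)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return (sparse_feats[nearest] * weights[:, :, None]).sum(axis=1)  # (P, C)

def per_point_labels(features, fc_weight, fc_bias):
    """Fully connected layer plus softmax, then argmax for one class per point."""
    logits = features @ fc_weight + fc_bias
    logits = logits - logits.max(axis=1, keepdims=True)      # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return probs.argmax(axis=1), probs
```

Applying `interpolate_features` once per saved intermediate layer, from the deepest layer back to the original point cloud, mirrors the "and so on" in the second step.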

[0126] Based on the above embodiments, in this embodiment, firstly, the manually annotated 3D point cloud data of pigs (340 sets of pig body point cloud data) is divided into training set, validation set and test set in a ratio of 8:1:1;

[0127] Then, the training and validation sets are input into the constructed point cloud data segmentation network model for training and parameter tuning. In this embodiment, training runs for 150 epochs, and the model is evaluated on the validation set after each epoch. In the later stages of training, the mIoU (mean intersection over union) stabilizes above 85% and the OA (overall accuracy) stabilizes above 94.5%.

[0128] Finally, the performance of the point cloud data segmentation network model was tested using a test set.
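The two reported metrics, OA and mIoU, can be computed from per-point predictions as sketched below. The averaging convention is an assumption: this sketch averages IoU over the classes present in either the prediction or the ground truth, whereas the source does not state whether mIoU is averaged per class or per sample.

```python
import numpy as np

def overall_accuracy(pred, target):
    """Fraction of points whose predicted part label matches the ground truth."""
    return float(np.mean(pred == target))

def mean_iou(pred, target, num_classes):
    """Mean intersection over union across part classes."""
    ious = []
    for c in range(num_classes):
        intersection = np.sum((pred == c) & (target == c))
        union = np.sum((pred == c) | (target == c))
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(intersection / union)
    return float(np.mean(ious))
```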

[0129] The specific test results of different algorithms in this embodiment are shown in Table 1.

[0130] Table 1. Segmentation performance of different algorithms on the dataset.

[0131]

[0132] This embodiment uses transfer learning to transfer the trained point cloud data segmentation network model to the second category of livestock 3D point cloud data that has not yet been manually labeled, thereby achieving partial segmentation of the second category of livestock 3D point cloud data, as detailed below:

[0133] First, 150 sets of unlabeled 3D point cloud data of cattle were directly input into the trained point cloud data segmentation network model. Test results showed that the mIoU could reach 78% and the OA could reach 84%. Some segmentation results are visualized as follows: Figure 3 As shown, this demonstrates that the point cloud data segmentation network model is highly robust.

[0134] Then, the 3D point cloud data of 5-6 cattle were labeled, and the previously trained point cloud data segmentation network model was trained again using these 5-6 labeled samples. The remaining unlabeled data was then used for testing. The segmentation results showed that the mIoU reached 88.9% and the OA reached 96.2%. Some segmentation results are visualized in Figure 4, showing that the expected results have been achieved.

[0135] Finally, comparing the segmentation results of the point cloud data segmentation network model in the two cases, it is shown that the point cloud data segmentation network model has good robustness and can be further extended to other four-limbed animals with similar body size, such as pigs and cattle.

[0136] Figures 4 and 5 show example part segmentation results obtained by training the network model with the 5-6 labeled samples and then testing on the remaining unlabeled cattle point cloud data. Figure 4 shows point cloud data of a Simmental bull. Figure 5 shows point cloud data of a water buffalo.

[0137] Obviously, the above embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not intended to limit the implementation of the present invention. Any modifications, equivalent substitutions, and improvements made within the spirit and principles of the present invention should be included within the protection scope of the claims of the present invention.

Claims

1. A point cloud segmentation transfer learning method based on kernel attention convolution, characterized in that the method includes the following steps: acquiring 3D point cloud data of two different categories of livestock; designing KC and P-GAT modules for extracting local neighborhood features from point clouds, and constructing a point cloud data segmentation network model based on the KC and P-GAT modules, wherein the point cloud data segmentation network model includes a downsampling part and an upsampling part: first, the downsampling part extracts features from and downsamples the livestock 3D point cloud data, generating a set of center points with high-dimensional features; then, the upsampling part propagates the high-dimensional features of the center points back to the original livestock 3D point cloud, thereby achieving part segmentation of the original livestock 3D point cloud; inputting the manually labeled 3D point cloud data of the first category of livestock into the point cloud data segmentation network model for training; and transferring the trained point cloud data segmentation network model, by means of transfer learning, to the 3D point cloud data of the second category of livestock that has not yet been manually labeled, thereby achieving part segmentation of the second category of livestock 3D point cloud data. The KC module is implemented as follows: D1: obtain the local neighborhood of the point cloud centered on a center point; D2: initialize a set of M learnable points K as kernel points; D3: select a distance kernel function to represent the relationship between the kernel points K and the local neighborhood, the two arguments of the distance kernel function being the three-dimensional coordinates of any two points in three-dimensional space; D4: substitute into the distance kernel function to calculate the kernel correlation KC between the kernel points K and the local neighborhood. The P-GAT module is implemented as follows: N1: obtain the local neighborhood points of the point cloud centered on the center point, with the high-dimensional features of the N neighborhood points denoted h, each element of h being the feature of the j-th point in the feature space; N2: construct a graph structure G for the local neighborhood points h of the point cloud, where each neighborhood point is a vertex of graph G and the relationship between a neighborhood point and the center point is represented by an edge of graph G; N3: introduce the graph attention mechanism GAT, modified according to the point cloud data structure, to find the attention coefficient of each neighborhood point with respect to the center point; N4: use the attention coefficients to find the new feature of the center point in the local neighborhood of the point cloud.

2. The point cloud segmentation transfer learning method based on kernel attention convolution according to claim 1, characterized in that, in the specific formula of the distance kernel function: max() denotes the function that returns the maximum value; the distance term denotes the Euclidean distance between the two points; the two arguments are the three-dimensional coordinates of any two points in three-dimensional space; and the kernel width controls the influence of the distance between the two points.

3. The point cloud segmentation transfer learning method based on kernel attention convolution according to claim 1, characterized in that, in the formula for calculating the kernel correlation KC, the symbols denote, respectively: the m-th learnable point among the kernel points K; the neighborhood index set of the center point; one of the neighborhood points; and the local relationship between that neighborhood point and the center point.
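For illustration only, the kernel correlation between a set of kernel points and a local neighborhood can be sketched as below. The formulas of claims 2 and 3 are not reproduced in this text, so both the kernel form (a linear decay with Euclidean distance, clipped at zero by max(), with `sigma` as the kernel width) and the normalization by neighborhood size are assumptions chosen to match the verbal description; `linear_kernel` and `kernel_correlation` are hypothetical names.

```python
import math

def linear_kernel(a, b, sigma):
    """Assumed kernel: decays linearly with Euclidean distance and is clipped at 0."""
    return max(0.0, 1.0 - math.dist(a, b) / sigma)

def kernel_correlation(kernel_points, center, neighbors, sigma=1.0):
    """Average kernel response between M kernel points and one local neighborhood.

    Each neighbor is expressed relative to the center point, so the kernel
    points describe local geometry independent of absolute position.
    """
    offsets = [tuple(nc - cc for nc, cc in zip(n, center)) for n in neighbors]
    total = sum(linear_kernel(k, off, sigma) for k in kernel_points for off in offsets)
    return total / len(neighbors)  # assumed normalization by neighborhood size
```

A kernel point that coincides with a neighbor offset contributes 1, and a kernel point farther than `sigma` from every offset contributes 0, so the response is high only when the learned kernel points resemble the local shape.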

4. The point cloud segmentation transfer learning method based on kernel attention convolution according to claim 1, characterized in that, in the formula for calculating the attention coefficient, the symbols denote, respectively: the set of neighborhood point indices; the index j of a given neighborhood point; the attention value between the two points; and the normalization function. In the expression for the attention value, the symbols denote, respectively: the edge relationship between a neighborhood point and the center point in the graph structure G; the concatenation operation; the weight matrix; the weight vector; a nonlinear function; and the transpose.

5. The point cloud segmentation transfer learning method based on kernel attention convolution according to claim 1, characterized in that the new feature of the center point within the local neighborhood of the point cloud is calculated by a formula in which a nonlinear function is applied.
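The attention computation described in claims 4 and 5 can be sketched in the style of a standard graph attention layer. Since the formulas themselves are not reproduced in this text, several details are assumptions: the edge relationship is taken as the feature difference between neighbor and center, the attention value is a LeakyReLU of a weight vector applied to the concatenated transformed center and edge features, the normalization is a softmax, and the final nonlinearity is a ReLU. All function names are hypothetical.

```python
import math

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def attention_coefficients(h_center, h_neighbors, W, a):
    """Softmax-normalized attention of each neighbor with respect to the center."""
    wh_i = matvec(W, h_center)
    scores = []
    for h_j in h_neighbors:
        edge = [hj - hi for hj, hi in zip(h_j, h_center)]  # assumed edge relationship
        concat = wh_i + matvec(W, edge)                    # list concatenation
        scores.append(leaky_relu(sum(ak * ck for ak, ck in zip(a, concat))))
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate(h_neighbors, alphas, W):
    """New center feature: attention-weighted sum of transformed neighbor features."""
    mixed = [0.0] * len(W)
    for alpha, h_j in zip(alphas, h_neighbors):
        for c, v in enumerate(matvec(W, h_j)):
            mixed[c] += alpha * v
    return [max(0.0, v) for v in mixed]  # assumed nonlinearity: ReLU
```

With a zero weight vector `a`, every neighbor receives the same coefficient and `aggregate` reduces to a plain mean of the transformed neighbor features, which is a convenient sanity check.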

6. The point cloud segmentation transfer learning method based on kernel attention convolution according to claim 1, characterized in that the specific steps of the downsampling part are as follows: E1: preset the depth of the downsampling layers; E2: use the farthest point sampling algorithm to sample a set of center points from the point cloud data; E3: taking each sampled center point as the center of a local neighborhood of the point cloud, divide the point cloud data into local neighborhoods; E4: extract features from each local neighborhood of the divided point cloud, aggregate the extracted features onto the center point, thereby downsampling the point cloud, and output a new set of points; E5: repeat steps E1 to E3 for the output set of new points until the preset downsampling layer depth is reached, and output the final downsampled set of points.
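Steps E2 and E3 can be sketched as follows. Farthest point sampling is named in the claim; the grouping rule is not, so dividing the cloud into neighborhoods by k nearest neighbors is an assumption (a fixed-radius ball query would be another option). The names `farthest_point_sampling` and `knn_group` are hypothetical.

```python
import math

def farthest_point_sampling(points, n_samples, start=0):
    """Return indices of n_samples points, each chosen farthest from those already picked."""
    selected = [start]
    # min_dist[i] is the distance from point i to its closest selected point so far.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < n_samples:
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return selected

def knn_group(points, center_indices, k):
    """For each center, return the indices of its k nearest points (center included)."""
    groups = []
    for ci in center_indices:
        order = sorted(range(len(points)), key=lambda i: math.dist(points[i], points[ci]))
        groups.append(order[:k])
    return groups

points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0), (11.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
centers = farthest_point_sampling(points, 3)
print(centers)                        # [0, 3, 4]
print(knn_group(points, centers, 2))  # [[0, 1], [3, 2], [4, 1]]
```

Feature extraction (step E4) would then run once per group, producing one feature vector per center point.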

7. The point cloud segmentation transfer learning method based on kernel attention convolution according to claim 6, characterized in that the feature extraction for each local neighborhood of the divided point cloud is as follows: F1: the input is a matrix with N rows, where N denotes the number of points within each local neighborhood and the number of columns is the feature dimension of each point after feature extraction by the previous layer; F2: first, the features are mapped to a high-dimensional space using an MLP (multilayer perceptron); F3: the KC module is used to calculate the kernel correlation between the kernel points and the local neighborhood points obtained in F2, completing the first feature extraction of the local neighborhood of the point cloud; F4: the local neighborhood of the point cloud obtained in F3 is input into three MLPs with different layers, and the outputs of the three MLPs are used to construct three graph structures G1, G2, and G3; F5: the P-GAT module is applied to graph structures G1, G2, and G3 respectively, generating three attention coefficients; a multi-head attention mechanism then aggregates the three attention coefficients, and the final output is a single-row matrix, i.e., the high-dimensional feature of the center point.

8. The point cloud segmentation transfer learning method based on kernel attention convolution according to claim 7, characterized in that the multi-head attention mechanism aggregates the three attention coefficients by a formula in which K takes the value 3, the attention term is the k-th attention coefficient, and the weight term is the weight matrix of the corresponding input linear transformation; the final output center point feature is formed by averaging the K attention features learned from the neighborhood points corresponding to the center point.
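The averaging over K = 3 attention heads can be sketched as below. The aggregation formula is not reproduced in this text, so the order of operations (weighted sum per head, mean over heads, then a ReLU) follows the common graph attention convention and is an assumption, as are the function name `multi_head_average` and the per-head tuple layout.

```python
def multi_head_average(heads):
    """Average the attention-weighted neighbor features of several heads.

    Each head is a tuple (alphas, W, h_neighbors): the attention coefficients,
    the weight matrix of the head's linear transformation, and the neighbor
    features of the graph structure that head operates on.
    """
    out_dim = len(heads[0][1])
    acc = [0.0] * out_dim
    for alphas, W, h_neighbors in heads:
        for alpha, h_j in zip(alphas, h_neighbors):
            for c, row in enumerate(W):
                acc[c] += alpha * sum(w * x for w, x in zip(row, h_j))
    k = len(heads)
    return [max(0.0, v / k) for v in acc]  # assumed nonlinearity: ReLU

identity = [[1.0, 0.0], [0.0, 1.0]]
heads = [
    ([0.5, 0.5], identity, [[2.0, 0.0], [0.0, 2.0]]),
    ([1.0, 0.0], [[2.0, 0.0], [0.0, 2.0]], [[1.0, 1.0], [9.0, 9.0]]),
    ([0.0, 1.0], identity, [[5.0, 5.0], [0.0, -6.0]]),
]
print(multi_head_average(heads))  # [1.0, 0.0]
```

Averaging rather than concatenating the heads keeps the output dimension equal to that of a single head, which matches the single-row output described in claim 7.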

9. The point cloud segmentation transfer learning method based on kernel attention convolution according to claim 1, characterized in that the upsampling part involves the following steps: first, save the results generated by each intermediate layer during the downsampling process; then, interpolate the high-dimensional features extracted by downsampling onto the points of the previous layer, and so on layer by layer, until the resolution of the original livestock 3D point cloud data is restored; finally, pass the interpolated livestock 3D point cloud through a fully connected layer and a softmax layer to score each point, thereby classifying each point, i.e., segmenting the point cloud.