A tree modeling method based on unmanned aerial vehicle image acquisition and image recognition

By using drones to collect multi-angle images and combining Mask R-CNN and COLMAP technologies, the problems of low efficiency and insufficient accuracy in traditional tree resource survey methods have been solved, and high-precision 3D tree modeling has been achieved.

CN119863719B · Active · Publication Date: 2026-06-23 · TONGJI UNIV

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: TONGJI UNIV
Filing Date: 2024-12-24
Publication Date: 2026-06-23

Patent Timeline

24 Dec 2024: Application
23 Jun 2026: Publication (CN119863719B)

IPC: G06V20/17; G06V20/10; G06V20/64; G06V10/26; G06V10/82; G06V10/764; G06N3/0464; G06N3/084; G06N3/045

AI Tagging

Application Domain: Biological models; Three-dimensional object recognition
Technology Topics: Pattern recognition; Point cloud

AI Technical Summary

Technical Problem

Traditional tree resource survey methods are inefficient and inaccurate, making it difficult to obtain high-quality 3D tree information. Existing image processing methods are insufficient in terms of recognition accuracy and robustness, making it difficult to meet the needs of high-precision tree modeling.

Method used

Multi-angle images are collected using drones; semantic segmentation is performed using the Mask R-CNN model; feature points are extracted with the SIFT algorithm and mismatched points are removed with the RANSAC algorithm; a 3D point cloud model is constructed using COLMAP; noise reduction and optimization are performed using the SOR filtering algorithm; and finally a geometric model of the trees is constructed.

Benefits of technology

It improves the accuracy of tree segmentation, enhances the robustness of the model to complex scenes, ensures the accuracy and usability of the 3D point cloud model, and generates a high-quality 3D tree mesh model.

Abstract

The present application relates to a tree modeling method based on unmanned aerial vehicle (UAV) image acquisition and image recognition, comprising: planning the UAV flight path according to the scale of the target area and the tree distribution, and collecting multi-angle images of the trees; preprocessing the images by denoising, cropping, and enhancement, and semantically segmenting them with a pre-trained neural network to obtain the semantic label of each pixel; extracting and matching image feature points with the SIFT algorithm, and constructing a preliminary three-dimensional point cloud model with the COLMAP incremental reconstruction method; transferring the semantic label of each pixel to the point cloud data according to the semantic segmentation result, thereby completing the semantic labeling of the point cloud; and traversing the point cloud data and segmenting the point cloud according to the labels to obtain a three-dimensional point cloud model of the trees. The method can be widely applied in vegetation monitoring, forest resource investigation, ecological environment protection, and other fields, and offers high precision, automation, and high efficiency.

Description

Technical Field

[0001] This invention belongs to the field of image 3D reconstruction technology, specifically relating to a tree modeling method based on images acquired by UAVs and image recognition.

Background Technology

[0002] Currently, traditional tree resource survey methods mainly rely on manual ground measurement and recording, which has many limitations. Manual ground measurement is inefficient, especially when surveying large areas of forests or green spaces, requiring significant time and manpower. Measuring each tree requires precise location and multiple data recordings, a tedious and repetitive process that severely impacts survey efficiency. Furthermore, manual measurement is labor-intensive, and its accuracy is easily affected by human factors. Improper operation, inappropriate use of measuring equipment, and differences in the experience of surveyors can all lead to inaccurate data, thus affecting the final survey results.

[0003] Traditional manual measurement methods are also insufficient for effectively acquiring spatial data on trees. For example, they cannot quickly and accurately obtain three-dimensional information such as the spatial distribution and branch morphology of trees, even though modern ecological monitoring, forest resource management, and related applications increasingly demand such information. Traditional tree resource survey methods are therefore limited in efficiency, accuracy, and spatial data acquisition capability.

[0004] With the rapid development of drone technology, drones equipped with high-precision cameras have provided new solutions for tree resource surveys. Drones can cover large areas in a short time, and the data they collect is highly accurate, enabling efficient collection of tree information in complex terrains and environments. Through different flight plans, drones can quickly capture images of trees from various angles, providing more perspectives and data support for 3D tree modeling. Because drones can easily fly over densely wooded areas and avoid areas difficult to reach by manual surveying, work efficiency can be greatly improved.

[0005] While drone technology enables large-scale and efficient tree image acquisition, extracting effective tree information from massive amounts of image data and transforming it into accurate 3D models remains a major technological challenge. Traditional image processing methods typically rely on manual image region segmentation or coarse visual algorithms for point cloud reconstruction. These methods suffer from low accuracy, poor algorithm adaptability, and high processing complexity. For example, manual image region segmentation is not only inefficient but also susceptible to human factors, leading to insufficient accuracy in tree information extraction. While automatic point cloud reconstruction methods based on visual algorithms can improve automation to some extent, they also face issues with insufficient recognition accuracy and robustness. In particular, the complex shapes of trees, occlusion phenomena, and similarity to the background make it difficult for traditional visual algorithms to accurately extract tree information, resulting in poor-quality point clouds and unsatisfactory reconstruction effects.

[0006] Existing image-based tree modeling methods, while improving the efficiency of tree information acquisition to some extent, still suffer from problems such as low recognition accuracy, poor point cloud quality, and limited reconstruction results. These issues limit the effectiveness of existing technologies in practical applications, especially in scenarios requiring high-precision tree modeling, where traditional methods often fall short of the requirements.

Summary of the Invention

[0007] The purpose of this invention is to overcome the shortcomings of the existing technology and provide a tree modeling method based on UAV image acquisition and image recognition.

[0008] The objective of this invention can be achieved through the following technical solutions:

[0009] This invention provides a tree modeling method based on images acquired by unmanned aerial vehicles (UAVs) and image recognition, comprising the following steps:

[0010] Step S1: Plan the drone's flight mode and flight path based on the size, shape, and tree distribution of the target area;

[0011] Step S2: Using the planned flight mode and flight path, the drone flies over the target area to collect multi-angle image data of trees. The multi-angle image data includes a set of tree images taken from multiple different viewpoints and heights.

[0012] Step S3: Preprocess the multi-angle image data, including denoising, cropping, image enhancement, and standardization;

[0013] Step S4: Input the preprocessed multi-angle image data into the pre-trained neural network model for tree recognition and semantic segmentation, and output the semantic segmentation map of each tree image, wherein each pixel in the semantic segmentation map is assigned a semantic label.

[0014] Step S5: Use the SIFT algorithm to extract feature points and their descriptors from each tree image, use the nearest neighbor matching algorithm to match feature points between multiple tree images based on the feature points and their descriptors of each tree image, and use the RANSAC algorithm to remove mismatched feature points.

[0015] Step S6: Based on the matched feature points, construct a preliminary 3D point cloud model using COLMAP through an incremental reconstruction method;

[0016] Step S7: Calculate the 3D coordinates of each pixel in the semantic segmentation map obtained in step S4 in the preliminary 3D point cloud model, determine the point cloud data corresponding to each pixel based on the 3D coordinates of each pixel, and pass the semantic label of each pixel to the corresponding point cloud data.

[0017] Step S8: Traverse each point cloud data in the preliminary 3D point cloud model, segment the preliminary 3D point cloud model according to the semantic tags of the point cloud data, and obtain the tree 3D point cloud model.

[0018] Step S9: Use the SOR filtering algorithm to denoise and optimize the 3D point cloud model of the tree. Obtain the 3D mesh model of the tree by using the Poisson reconstruction method after denoising and optimization. Construct the geometric model of the tree based on the 3D mesh model of the tree.

[0019] Furthermore, the semantic tags include tree trunk, leaves, and background.

[0020] Furthermore, the neural network model is a Mask R-CNN model, and the training process of the Mask R-CNN model includes the following steps:

[0021] Acquire a collection of tree images taken from multiple different perspectives and heights, and preprocess the tree image collection, including denoising, cropping, image enhancement, and normalization;

[0022] The image labeling tool LabelMe was used to manually label each region in the preprocessed tree image set. The parts of the tree trunk that were covered by leaves were labeled as leaves. Each pixel in the tree image was assigned a label and a corresponding segmentation mask was generated. The labeled tree image was then converted into a binary mask image.

[0023] The preprocessed set of tree images and the binary mask image are input into the Mask R-CNN model for training. The output is a binary mask image of the tree images. Based on the output binary mask image and the labeled binary mask image, the loss of each pixel is calculated using a loss function. The calculated total loss value is used to update the model parameters through the backpropagation algorithm.

[0024] Furthermore, the loss function of the Mask R-CNN model is:

[0025]

[0026] Where L is the loss function of the Mask R-CNN model; W is the loss weight matrix of the Mask R-CNN model, representing the loss weight between each predicted semantic label and the real semantic label; p is the p-th pixel of the tree image; y(p) is the real semantic label of the p-th pixel; the two remaining terms are the predicted semantic label of the p-th pixel output by the Mask R-CNN model and the probability that the p-th pixel belongs to that predicted semantic label; and N is the total number of pixels in the tree image. The background label is 0, the trunk label is 1, and the leaf label is 2.

[0027] Furthermore, the loss weight matrix W of the Mask R-CNN model is:

[0028]

[0029] In this context, rows represent true semantic labels, columns represent predicted semantic labels, and values represent the loss weights between these semantic labels.

[0030] Furthermore, the step of using the nearest neighbor matching algorithm to match feature points among multiple tree images based on the local feature points of each tree image includes the following steps:

[0031] Feature points and descriptors are extracted from each image. The feature points are identified by the Scale Invariant Feature Transform (SIFT) algorithm, and the descriptors are 128-dimensional vectors for each feature point.

[0032] For every two tree images, descriptors are matched, the Euclidean distance between each pair of descriptors is calculated, and each feature point is matched based on the nearest Euclidean distance to obtain matching point pairs.
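
The matching rule described above can be illustrated with a minimal sketch. It is not the patent's implementation: the function name `match_descriptors` and the toy 4-dimensional vectors are assumptions made for illustration, standing in for 128-dimensional SIFT descriptors. Each descriptor of one image is paired with the descriptor of the other image at the smallest Euclidean distance.

```python
import math


def match_descriptors(desc_a, desc_b):
    """Match each descriptor in desc_a to its nearest descriptor in desc_b.

    Returns a list of (index_a, index_b, distance) tuples.
    """
    matches = []
    for i, da in enumerate(desc_a):
        best_j, best_dist = -1, math.inf
        for j, db in enumerate(desc_b):
            dist = math.dist(da, db)  # Euclidean distance between descriptors
            if dist < best_dist:
                best_j, best_dist = j, dist
        if best_j >= 0:
            matches.append((i, best_j, best_dist))
    return matches


# Toy 4-dimensional vectors stand in for 128-dimensional SIFT descriptors.
image_a = [(0.0, 0.0, 1.0, 0.0), (1.0, 1.0, 0.0, 0.0)]
image_b = [(1.0, 0.9, 0.0, 0.1), (0.1, 0.0, 1.0, 0.0), (5.0, 5.0, 5.0, 5.0)]
pairs = match_descriptors(image_a, image_b)
```

Pure nearest-neighbor matching always returns a partner, even for a feature that has no true counterpart in the other image, which is why the method follows it with RANSAC to discard mismatched pairs.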

[0033] Furthermore, the preliminary 3D point cloud model is constructed using COLMAP through an incremental reconstruction method based on matched feature points, including the following steps:

[0034] Step A1: Calculate the relative pose between images based on the matched feature point pairs;

[0035] Step A2: Based on the relative pose and matching feature point pairs, calculate the preliminary 3D point cloud through triangulation;

[0036] Step A3: Add at least one image to the reconstruction process, and calculate the relative pose of the new image using the matching feature point pairs between the new image and the existing 3D point cloud.

[0037] Step A4: Based on the pose and matching feature points of the new image, update the 3D point cloud through triangulation and perform bundle adjustment to optimize the 3D point cloud and camera pose;

[0038] Step A5: Repeat steps A3 and A4 to gradually update and optimize the 3D point cloud and generate a preliminary 3D point cloud model.
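
To make the triangulation in steps A2 and A4 concrete, the following sketch recovers one 3D point from two viewing rays with the midpoint method. This is an illustrative stand-in, not COLMAP's own triangulation routine. The function names and the example camera centers and ray directions are hypothetical.

```python
def dot(a, b):
    """Dot product of two 3D vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def triangulate_midpoint(o1, d1, o2, d2):
    """Return the midpoint of the shortest segment between two viewing rays.

    Ray k starts at camera center o_k and points along direction d_k.
    """
    w = tuple(p - q for p, q in zip(o1, o2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the point cannot be triangulated")
    s = (b * e - c * d) / denom  # parameter of the closest point on ray 1
    t = (a * e - b * d) / denom  # parameter of the closest point on ray 2
    p1 = tuple(o + s * v for o, v in zip(o1, d1))
    p2 = tuple(o + t * v for o, v in zip(o2, d2))
    return tuple((x + y) / 2.0 for x, y in zip(p1, p2))


# Two hypothetical cameras 1 m apart, both observing a point near (0.5, 0, 5).
point = triangulate_midpoint((0.0, 0.0, 0.0), (0.5, 0.0, 5.0),
                             (1.0, 0.0, 0.0), (-0.5, 0.0, 5.0))
```

With noisy matches the two rays no longer intersect exactly, so the midpoint is only an initial estimate. Bundle adjustment in step A4 then refines the points and camera poses jointly.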

[0039] Furthermore, step S7 includes the following steps:

[0040] Based on the two-dimensional coordinates (u,v) of each pixel in the semantic segmentation map and the camera intrinsic and extrinsic parameters, back projection is performed using a projection matrix to calculate the three-dimensional coordinates of that pixel in the preliminary three-dimensional point cloud model.

[0041] $$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = Z(u,v)\, K^{-1} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}$$

[0042] Where (X,Y,Z) are the 3D coordinates of the pixel in the preliminary 3D point cloud model, Z(u,v) is the depth information of the pixel, and K is the intrinsic parameter matrix of the camera. The intrinsic parameter matrix K of the camera is:

[0043] $$K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$$

[0044] Where $f_x$ and $f_y$ are the camera's focal lengths, i.e., the scaling factors along the x and y axes, respectively, and $(c_x, c_y)$ are the coordinates of the camera's principal point;

[0045] Find the point cloud data point that is closest to the three-dimensional coordinates of the pixel in the preliminary three-dimensional point cloud model;

[0046] The semantic label of a pixel is passed to the nearest point cloud data point.
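
The label transfer of step S7 can be sketched as follows, under two simplifying assumptions that are not stated in the patent: the intrinsic matrix is an ideal pinhole matrix without skew, and the point cloud is already expressed in the camera frame, so the extrinsic transform is omitted. The function names and numeric values are hypothetical.

```python
import math


def back_project(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth Z(u, v) into camera coordinates.

    Equivalent to Z * K^{-1} * (u, v, 1)^T for a skew-free pinhole matrix K.
    """
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)


def transfer_label(pixel_xyz, pixel_label, cloud, labels):
    """Assign pixel_label to the cloud point nearest to pixel_xyz."""
    nearest = min(range(len(cloud)), key=lambda k: math.dist(cloud[k], pixel_xyz))
    labels[nearest] = pixel_label
    return nearest


# Hypothetical intrinsics and a three-point cloud in the camera frame.
fx, fy, cx, cy = 1000.0, 1000.0, 320.0, 240.0
cloud = [(0.0, 0.0, 5.0), (1.0, 0.0, 5.0), (0.0, 2.0, 10.0)]
labels = [None, None, None]

xyz = back_project(u=520.0, v=240.0, depth=5.0, fx=fx, fy=fy, cx=cx, cy=cy)
idx = transfer_label(xyz, 1, cloud, labels)  # 1 = trunk
```

For a cloud expressed in world coordinates, the back-projected point would first have to be transformed with the camera pose of the image. For large clouds, a spatial index such as a k-d tree would replace the linear nearest-point search.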

[0047] Furthermore, the step of traversing each point cloud data in the preliminary 3D point cloud model and segmenting the preliminary 3D point cloud model according to the semantic tags of the point cloud data to obtain the tree 3D point cloud model includes the following steps:

[0048] Traverse the points in the preliminary 3D point cloud model that have no semantic label; for each such point, find the nearest point and assign that point's semantic label to the unlabeled point.

[0049] Traverse each point in the preliminary 3D point cloud model, delete the points whose semantic label is background, and obtain the 3D point cloud model of the tree.

[0050] Furthermore, the noise reduction and optimization of the 3D point cloud model of trees using the SOR filtering algorithm includes the following steps:

[0051] For each point $p_i$ in the 3D point cloud model of the tree, define its neighborhood $N(p_i)$, where $N(p_i)$ is the set of other points within a certain radius of $p_i$;

[0052] Calculate the Euclidean distance $d(p_i, p_j)$ between $p_i$ and every point $p_j$ in its neighborhood:

[0053] $$d(p_i, p_j) = \sqrt{(X_i - X_j)^2 + (Y_i - Y_j)^2 + (Z_i - Z_j)^2}$$

[0054] Where $(X_i, Y_i, Z_i)$ and $(X_j, Y_j, Z_j)$ are the three-dimensional coordinates of $p_i$ and $p_j$ in the 3D point cloud model of the tree;

[0055] Calculate the mean $\mu(p_i)$ and standard deviation $\sigma(p_i)$ of the distances from each point $p_i$ to its neighborhood points:

[0056] $$\mu(p_i) = \frac{1}{|N(p_i)|} \sum_{p_j \in N(p_i)} d(p_i, p_j), \qquad \sigma(p_i) = \sqrt{\frac{1}{|N(p_i)|} \sum_{p_j \in N(p_i)} \bigl(d(p_i, p_j) - \mu(p_i)\bigr)^2}$$

[0057] Where $|N(p_i)|$ is the number of points in the neighborhood of $p_i$;

[0058] Based on the mean distance $\mu(p_i)$ and standard deviation $\sigma(p_i)$ of each point $p_i$, determine whether $p_i$ is an outlier. The point $p_i$ is an outlier if the following condition is met:

[0059] $$d(p_i, p_j) > \mu(p_i) + \theta \cdot \sigma(p_i)$$

[0060] Where θ is a preset threshold factor;

[0061] Point cloud data identified as outliers are removed from the 3D point cloud model of the tree to obtain a noise-reduced and optimized 3D point cloud model of the tree.

[0062] Compared with the prior art, the present invention has the following advantages:

[0063] (1) This invention uses a drone to collect multi-angle images of trees in a target area and uses a pre-trained Mask R-CNN model to perform semantic segmentation of the trees, so that each pixel is accurately assigned a semantic label (such as trunk, leaves and background). This method can effectively handle complex scenes, including situations where leaves cover the trunk, and improves the accuracy of tree segmentation.

[0064] (2) The loss function of the Mask R-CNN model in this invention addresses class imbalance in the images through a weighted cross-entropy loss. The loss weight matrix W in the loss function is set empirically (e.g., background class is 0, tree trunk class is 5, and leaf class is 10). By assigning different loss weights to different classes (tree trunk, leaves, and background), and in particular larger weights to the tree trunk and leaf classes, the model pays more attention to these under-represented classes, which reduces the impact of class imbalance on model performance and improves the segmentation accuracy of the smaller classes. The weighting also adjusts the contribution of each class to gradient propagation during training and improves the model's learning ability in difficult regions (e.g., overlapping or occluded regions of leaves and tree trunks), thereby enhancing the model's robustness to complex scenes and improving the final semantic segmentation accuracy.

[0065] (3) This invention uses the SIFT algorithm to extract feature points and removes mismatched points through the nearest neighbor matching algorithm and the RANSAC algorithm, ensuring the accuracy of point cloud data during the 3D reconstruction process. Combined with the COLMAP incremental reconstruction method, a preliminary 3D point cloud model is gradually generated, and the camera pose is optimized, improving the accuracy and stability of the modeling process.

[0066] (4) This invention accurately transmits semantic labels to the point cloud data by connecting the three-dimensional coordinates of each pixel in the semantic segmentation map with the preliminary three-dimensional point cloud model. Subsequently, a traversal method is used to segment the point cloud according to the semantic labels, ultimately obtaining a three-dimensional point cloud model of trees. This process ensures the integrity of the semantic information of each point cloud data in the final model, enhancing the usability and accuracy of the model.

[0067] (5) This invention uses a statistical outlier removal (SOR) filtering algorithm to denoise 3D point clouds, which can effectively remove noisy points and optimize the quality of point cloud data. By calculating the Euclidean distance of the neighborhood points of each point cloud data, outliers are identified and removed, which significantly improves the density and quality of point cloud data and ensures that the generated 3D point cloud model is smoother and more realistic.

Attached Figure Description

[0068] Figure 1 is a flowchart of the method of the present invention;

[0069] Figure 2 is a flowchart of the present invention.

Detailed Implementation

[0070] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort should fall within the scope of protection of the present invention.

[0071] Example 1:

[0072] This embodiment provides a tree modeling method based on images acquired by a drone and image recognition. As shown in Figure 1, it includes the following steps:

[0073] Step S1: Plan the drone's flight mode and flight path based on the size, shape, and tree distribution of the target area;

[0074] Step S2: Using the planned flight mode and flight path, the drone flies over the target area to collect multi-angle image data of trees. The multi-angle image data includes a collection of tree images taken from multiple different viewpoints and heights.

[0075] Step S3: Preprocess the multi-angle image data, including denoising, cropping, image enhancement, and standardization;

[0076] Step S4: Input the preprocessed multi-angle image data into the pre-trained neural network model for tree recognition and semantic segmentation, and output the semantic segmentation map of each tree image. Each pixel in the semantic segmentation map is assigned a semantic label.

[0077] Step S5: Use the SIFT algorithm to extract feature points and their descriptors from each tree image, use the nearest neighbor matching algorithm to match feature points between multiple tree images based on the feature points and their descriptors of each tree image, and use the RANSAC algorithm to remove mismatched feature points.

[0078] Step S6: Based on the matched feature points, construct a preliminary 3D point cloud model using COLMAP through an incremental reconstruction method;

[0079] Step S7: Calculate the 3D coordinates of each pixel in the semantic segmentation map obtained in step S4 in the preliminary 3D point cloud model, determine the point cloud data corresponding to each pixel based on the 3D coordinates of each pixel, and pass the semantic label of each pixel to the corresponding point cloud data.

[0080] Step S8: Traverse each point cloud data in the preliminary 3D point cloud model, segment the preliminary 3D point cloud model according to the semantic tags of the point cloud data, and obtain the tree 3D point cloud model.

[0081] Step S9: Use the SOR filtering algorithm to denoise and optimize the 3D point cloud model of the tree. Obtain the 3D mesh model of the tree by using the Poisson reconstruction method after denoising and optimization. Construct the geometric model of the tree based on the 3D mesh model of the tree.

[0082] Furthermore, semantic tags include tree trunk, leaves, and background.

[0083] Furthermore, the neural network model is the Mask R-CNN model, and the training process of the Mask R-CNN model includes the following steps:

[0084] Acquire a collection of tree images taken from multiple different perspectives and heights, and preprocess the tree image collection, including denoising, cropping, image enhancement, and normalization;

[0085] The image labeling tool LabelMe was used to manually label each region in the preprocessed tree image set. The parts of the tree trunk that were covered by leaves were labeled as leaves. Each pixel in the tree image was assigned a label and a corresponding segmentation mask was generated. The labeled tree image was then converted into a binary mask image.

[0086] The preprocessed set of tree images and the binary mask image are input into the Mask R-CNN model for training. The output is a binary mask image of the tree images. Based on the output binary mask image and the labeled binary mask image, the loss of each pixel is calculated using a loss function. The calculated total loss value is used to update the model parameters through the backpropagation algorithm.

[0087] Furthermore, the loss function of the Mask R-CNN model is:

[0088]

[0089] Where L is the loss function of the Mask R-CNN model; W is the loss weight matrix of the Mask R-CNN model, representing the loss weight between each predicted semantic label and the real semantic label; p is the p-th pixel of the tree image; y(p) is the real semantic label of the p-th pixel; the two remaining terms are the predicted semantic label of the p-th pixel output by the Mask R-CNN model and the probability that the p-th pixel belongs to that predicted semantic label; and N is the total number of pixels in the tree image. The background label is 0, the trunk label is 1, and the leaf label is 2.

[0090] Furthermore, weights are defined for the errors between different categories:

[0091] Predicted semantic label is background, true semantic label is trunk: the largest loss, set with a large weight (e.g., 10);

[0092] Predicted semantic label is background, true semantic label is leaves: a smaller loss, set with a medium weight (e.g., 5);

[0093] Predicted semantic label is trunk, true semantic label is background: the largest loss, set with a large weight (e.g., 10);

[0094] Predicted semantic label is trunk, true semantic label is leaves: a smaller loss, set with a medium weight (e.g., 5);

[0095] Predicted semantic label is leaves, true semantic label is background: a smaller loss, set with a medium weight (e.g., 5);

[0096] Predicted semantic label is leaves, true semantic label is trunk: a smaller loss, set with a medium weight (e.g., 5);

[0097] A weight matrix W is defined for the prediction error between each category as follows:

[0098]

[0099] In this context, rows represent true semantic labels, columns represent predicted semantic labels, and values represent the loss weights between these semantic labels.
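
A weighted cross-entropy of this kind can be sketched as follows. This is an illustration under assumptions, not the patent's formula: the off-diagonal weights are the example values listed above, the diagonal value 1.0 is assumed because the text does not state it, and the sketch takes the logarithm of the probability assigned to the true label, which is the standard cross-entropy form. The function name is hypothetical.

```python
import math

# Rows: true label, columns: predicted label (0 = background, 1 = trunk, 2 = leaf).
# Off-diagonal values follow the example weights in the text; the diagonal
# value 1.0 is an assumption, since the text does not state it.
W = [
    [1.0, 10.0, 5.0],
    [10.0, 1.0, 5.0],
    [5.0, 5.0, 1.0],
]


def weighted_cross_entropy(probs, true_labels, weights=W):
    """Mean per-pixel cross-entropy, scaled by weights[true][predicted].

    probs: one probability vector per pixel (each sums to 1).
    true_labels: one integer label per pixel.
    """
    total = 0.0
    for p, y in zip(probs, true_labels):
        y_hat = max(range(len(p)), key=p.__getitem__)  # predicted label (argmax)
        total += -weights[y][y_hat] * math.log(p[y])
    return total / len(true_labels)


probs = [
    [0.8, 0.1, 0.1],  # true background, predicted background (weight 1)
    [0.6, 0.3, 0.1],  # true trunk, predicted background (weight 10)
]
loss = weighted_cross_entropy(probs, [0, 1])
```

In this toy example the second pixel, a trunk pixel predicted as background, dominates the loss because its error term is multiplied by 10.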

[0100] Furthermore, the nearest neighbor matching algorithm is used to match feature points among multiple tree images based on the local feature points of each tree image, including the following steps:

[0101] Feature points and descriptors are extracted from each image. Feature points are identified by the Scale Invariant Feature Transform (SIFT) algorithm, and descriptors are 128-dimensional vectors for each feature point.

[0102] For every two tree images, descriptors are matched, the Euclidean distance between each pair of descriptors is calculated, and each feature point is matched based on the nearest Euclidean distance to obtain matching point pairs.

[0103] Furthermore, based on the matched feature points, a preliminary 3D point cloud model is constructed using COLMAP through an incremental reconstruction method, including the following steps:

[0104] Step A1: Calculate the relative pose between images based on the matched feature point pairs;

[0105] Step A2: Based on relative pose and matching feature point pairs, calculate the preliminary 3D point cloud through triangulation;

[0106] Step A3: Add at least one image to the reconstruction process, and calculate the relative pose of the new image using the matching feature point pairs between the new image and the existing 3D point cloud.

[0107] Step A4: Based on the pose and matching feature points of the new image, update the 3D point cloud through triangulation and perform bundle adjustment to optimize the 3D point cloud and camera pose;

[0108] Step A5: Repeat steps A3 and A4 to gradually update and optimize the 3D point cloud and generate a preliminary 3D point cloud model.

[0109] Furthermore, step S7 includes the following steps:

[0110] Based on the two-dimensional coordinates (u,v) of each pixel in the semantic segmentation map and the camera intrinsic and extrinsic parameters, back projection is performed using a projection matrix to calculate the three-dimensional coordinates of that pixel in the preliminary three-dimensional point cloud model.

[0111] $$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = Z(u,v)\, K^{-1} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}$$

[0112] Where (X,Y,Z) are the 3D coordinates of the pixel in the preliminary 3D point cloud model, Z(u,v) is the depth information of the pixel, and K is the intrinsic parameter matrix of the camera. The intrinsic parameter matrix K of the camera is:

[0113] $$K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$$

[0114] Where $f_x$ and $f_y$ are the camera's focal lengths, i.e., the scaling factors along the x and y axes, respectively, and $(c_x, c_y)$ are the coordinates of the camera's principal point;

[0115] Find the point cloud data point that is closest to the three-dimensional coordinates of the pixel in the preliminary three-dimensional point cloud model;

[0116] The semantic label of a pixel is passed to the nearest point cloud data point.

[0117] Furthermore, by traversing each point cloud data point in the preliminary 3D point cloud model and segmenting the preliminary 3D point cloud model based on the semantic labels of the point cloud data, a 3D point cloud model of the tree is obtained, including the following steps:

[0118] Traverse the points in the preliminary 3D point cloud model that have no semantic label; for each such point, find the nearest point and assign that point's semantic label to the unlabeled point.

[0119] Traverse each point in the preliminary 3D point cloud model, delete the points whose semantic label is background, and obtain the 3D point cloud model of the tree.
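
The two traversals of step S8 can be sketched as below. The function names and the four-point cloud are hypothetical. The sketch also assumes that an unlabeled point inherits its label from the nearest point that already carries a label, and that at least one labeled point exists.

```python
import math

BACKGROUND, TRUNK, LEAF = 0, 1, 2


def fill_missing_labels(points, labels):
    """Give each unlabeled point the label of its nearest labeled point."""
    labeled = [k for k, lab in enumerate(labels) if lab is not None]
    filled = list(labels)
    for k, lab in enumerate(labels):
        if lab is None:
            nearest = min(labeled, key=lambda m: math.dist(points[m], points[k]))
            filled[k] = labels[nearest]
    return filled


def remove_background(points, labels):
    """Keep only the points whose label is not background."""
    kept = [(pt, lab) for pt, lab in zip(points, labels) if lab != BACKGROUND]
    return [pt for pt, _ in kept], [lab for _, lab in kept]


points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 5.0, 0.0), (5.1, 5.0, 0.0)]
labels = [TRUNK, None, BACKGROUND, None]
full = fill_missing_labels(points, labels)
tree_points, tree_labels = remove_background(points, full)
```

Here the second point inherits the trunk label and the fourth inherits the background label, so only the first two points remain in the tree point cloud.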

[0120] Furthermore, the SOR filtering algorithm is used to optimize the noise reduction of the 3D point cloud model of trees, including the following steps:

[0121] For each point cloud data p in the 3D point cloud model of the tree i Define its neighborhood N(p) i ), neighborhood N(p i ) represents point cloud data p i Other point cloud data sets within a certain radius;

[0122] Calculate the Euclidean distance $d(p_i, p_j)$ between each point cloud data point $p_i$ and every point $p_j$ in its neighborhood. The formula is:

[0123] $$d(p_i, p_j) = \sqrt{(X_i - X_j)^2 + (Y_i - Y_j)^2 + (Z_i - Z_j)^2}$$

[0124] Where $(X_i, Y_i, Z_i)$ and $(X_j, Y_j, Z_j)$ are the three-dimensional coordinates of $p_i$ and $p_j$ in the 3D point cloud model of the tree;

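[0125] Calculate the mean distance $\mu(p_i)$ and standard deviation $\sigma(p_i)$ of the neighborhood points of each point $p_i$. The formulas are:

[0126] $$\mu(p_i) = \frac{1}{|N(p_i)|} \sum_{p_j \in N(p_i)} d(p_i, p_j), \qquad \sigma(p_i) = \sqrt{\frac{1}{|N(p_i)|} \sum_{p_j \in N(p_i)} \left( d(p_i, p_j) - \mu(p_i) \right)^2}$$

[0127] Where $|N(p_i)|$ is the number of points in the neighborhood of $p_i$;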

[0128] Based on the mean distance $\mu(p_i)$ and standard deviation $\sigma(p_i)$ of each point cloud data point $p_i$, determine whether $p_i$ is an outlier. $p_i$ is an outlier if the following condition holds:

[0129] $$d(p_i, p_j) > \mu(p_i) + \theta \cdot \sigma(p_i)$$

[0130] Where θ is a preset threshold factor;

[0131] Point cloud data identified as outliers are removed from the 3D point cloud model of the tree to obtain a noise-reduced and optimized 3D point cloud model of the tree.

[0132] Example 2:

[0133] The parts not mentioned in this embodiment are the same as in Embodiment 1.

[0134] This embodiment provides a tree point cloud modeling method based on images acquired and identified by a drone. As shown in Figure 2, it includes the following steps:

[0135] I. Images acquired by drones

[0136] In the process of tree point cloud modeling, it is necessary to use drones to collect high-quality images to ensure that the images cover the entire target area and have sufficient detail to support subsequent 3D reconstruction.

[0137] First, flight planning is crucial. The drone's flight path should be designed based on the size, shape, and tree distribution of the target area. Common flight modes include grid, straight-line, and circular modes. For example, in a large area, grid mode ensures that trees are photographed from multiple angles, avoiding missed shots. In a narrower area, straight-line mode is more efficient. To ensure sufficient overlap between images, the flight path design should aim for an overlap rate of 60%-80% between the camera's shooting areas. Too low an overlap may lead to insufficient feature point matching, affecting reconstruction results; while too high an overlap may increase the computational burden. Therefore, adjustments should be made reasonably based on specific circumstances.

[0138] Next, set up the camera. The camera on the drone should have sufficient resolution and a wide field of view; a camera with 12MP or higher is generally recommended. To ensure image detail, choose a high resolution and set the camera's exposure parameters appropriately. Automatic exposure mode is suitable for scenes with dynamic lighting conditions, but manual exposure settings will provide more stability if the light changes significantly. Furthermore, maintaining consistent white balance and avoiding color casts will improve the accuracy of subsequent feature extraction.

[0139] During the shooting process, the overlap and angle of the images need to be carefully controlled. It is recommended that the front-to-back overlap of each image be set between 70% and 80%, and the side overlap between 50% and 60%. This can effectively ensure the matching of feature points between images, thereby improving the accuracy of 3D reconstruction. In addition, the diversity of shooting angles is also very important. Avoid shooting only from a vertical perspective; appropriately tilting the camera angle can ensure that key parts such as the tree trunk and crown are captured well.

[0140] During flight, image quality should be checked in real time. If possible, the captured images should be checked periodically to ensure they are free from problems such as blurriness or uneven exposure. The shooting frequency is usually set to 1-2 images per second, which ensures sufficient image coverage while avoiding storage overload.
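The relationship between overlap, shot spacing, and shooting frequency can be made concrete with a little geometry. The sketch below assumes a nadir-pointing camera over flat ground and the same field of view along both image axes. Altitude, field of view, and speed are illustrative values, not figures from this embodiment.

```python
import math

def ground_footprint(altitude_m, fov_deg):
    """Ground length covered along one image axis for a nadir-pointing camera."""
    return 2.0 * altitude_m * math.tan(math.radians(fov_deg) / 2.0)

def spacing_for_overlap(footprint_m, overlap):
    """Distance between consecutive exposures (or flight lines) for an overlap ratio."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    return footprint_m * (1.0 - overlap)

def trigger_interval(spacing_m, speed_m_s):
    """Seconds between shots so that exposures are `spacing_m` apart."""
    return spacing_m / speed_m_s

along = ground_footprint(altitude_m=10.0, fov_deg=90.0)   # about 20 m
shot_spacing = spacing_for_overlap(along, 0.75)           # 75 % front overlap
line_spacing = spacing_for_overlap(along, 0.55)           # 55 % side overlap
interval = trigger_interval(shot_spacing, speed_m_s=5.0)  # seconds per image
```

With these assumed values, the interval comes out at roughly one image per second, which sits inside the 1-2 images per second range mentioned above. Higher altitudes or slower flight speeds lengthen the interval.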

[0141] II. Semantic Extraction Using Neural Networks

[0142] In tree point cloud modeling, semantic extraction is a crucial step in classifying and segmenting image content, aiming to identify trees and their related parts (such as trunks and leaves) from the original image. Neural networks, especially models for image segmentation, such as Mask R-CNN, play a vital role in this task. By training these models, efficient semantic segmentation can be achieved, extracting different parts of the tree and providing accurate region information for subsequent point cloud reconstruction and model processing.

[0143] The following is a detailed procedure for this step:

[0144] 1. Data Preprocessing

[0145] The effectiveness of neural network training depends on high-quality training data; therefore, image preprocessing is a crucial step. The main preprocessing steps include denoising, cropping, image augmentation, and standardization.

[0146] ① Denoising: The original image may contain noise, which can affect the training effect of the neural network. Common denoising methods include median filtering and Gaussian filtering, which reduce the impact of noise by smoothing the image.

[0147] ② Cropping: Based on the size and location of the trees, crop the region of the image that contains all parts of the trees. Cropping the image can reduce the computational load during model training and improve training efficiency.

[0148] ③ Image Augmentation: Data augmentation techniques are very useful when training data is insufficient. The training dataset can be expanded through rotation, flipping, cropping, and color adjustments, making the model more robust and better able to adapt to images in different environments.

[0149] ④ Standardization: Standardize the image to ensure that the pixel values are distributed within a certain range (e.g., normalize the RGB values to [0,1] or [-1,1]), which helps to accelerate the convergence of the network.
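Three of the preprocessing steps above (median-filter denoising, flip augmentation, and normalization) are simple enough to show on a nested-list image. This is a teaching sketch on a single-channel image; a production pipeline would typically use an image library instead. All function names are illustrative.

```python
from statistics import median

def normalize(pixels, signed=False):
    """Map 8-bit values to [0, 1], or to [-1, 1] when signed=True."""
    scaled = [[v / 255.0 for v in row] for row in pixels]
    if signed:
        scaled = [[2.0 * v - 1.0 for v in row] for row in scaled]
    return scaled

def median_filter_3x3(pixels):
    """3x3 median filter; border pixels use whichever neighbours exist."""
    h, w = len(pixels), len(pixels[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            window = [pixels[rr][cc]
                      for rr in range(max(0, r - 1), min(h, r + 2))
                      for cc in range(max(0, c - 1), min(w, c + 2))]
            out[r][c] = median(window)
    return out

def horizontal_flip(pixels):
    """Mirror the image left to right (a basic augmentation)."""
    return [row[::-1] for row in pixels]

image = [[10, 10, 10], [10, 255, 10], [10, 10, 10]]  # one salt-noise pixel
clean = median_filter_3x3(image)
```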

[0150] 2. Labeling data

[0151] Before training, labeled data needs to be prepared. The labeling process involves collecting an image dataset containing objects such as tree trunks and leaves, and manually labeling each region in the images using the image labeling tool LabelMe. Leaves occluding tree trunks are still labeled as leaves. A label is assigned to each object, and a corresponding segmentation mask is generated. The labeled information is then converted into the format required by the neural network, typically a binary mask image (label for each pixel) or a polygonal region (representing the outline of the object).
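Converting polygon annotations into per-pixel labels can be sketched as follows. The example assumes the usual LabelMe JSON layout (`imageHeight`, `imageWidth`, and a `shapes` list with `label`, `points`, and `shape_type`). The class-id mapping and the rule that later shapes overwrite earlier ones are illustrative choices, the latter being one simple way to keep leaves on top of the trunk they occlude.

```python
import json

def point_in_polygon(x, y, polygon):
    """Even-odd ray casting test."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def labelme_to_label_map(annotation, class_ids):
    """Rasterize LabelMe polygons into a per-pixel class-id map (0 = background)."""
    h, w = annotation["imageHeight"], annotation["imageWidth"]
    label_map = [[0] * w for _ in range(h)]
    for shape in annotation["shapes"]:
        if shape.get("shape_type", "polygon") != "polygon":
            continue
        cid = class_ids[shape["label"]]
        for r in range(h):
            for c in range(w):
                # Test the pixel centre against the polygon outline.
                if point_in_polygon(c + 0.5, r + 0.5, shape["points"]):
                    label_map[r][c] = cid
    return label_map

annotation = json.loads("""{
  "imageHeight": 4, "imageWidth": 4,
  "shapes": [
    {"label": "trunk", "shape_type": "polygon",
     "points": [[1, 0], [3, 0], [3, 4], [1, 4]]},
    {"label": "leaf", "shape_type": "polygon",
     "points": [[0, 0], [4, 0], [4, 2], [0, 2]]}
  ]
}""")
label_map = labelme_to_label_map(annotation, {"trunk": 1, "leaf": 2})
```

In this toy annotation the leaf polygon covers the top half of the trunk polygon, so the top two rows end up labeled as leaf.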

[0152] 3. Training the neural network

[0153] The goal of training a neural network is to learn how to classify pixels in an image into different semantic regions, such as trees and background, using existing labeled data. A commonly used neural network model is Mask R-CNN.

[0154] Mask R-CNN is an instance segmentation model that extends Faster R-CNN with a mask prediction branch, so it detects objects and predicts a pixel-level segmentation mask for each detected object at the same time. This makes it particularly suitable for problems requiring the detection and segmentation of different object instances (such as tree trunks and leaves). To train Mask R-CNN, tree trunks, leaves, etc. must be labeled in each image so that the corresponding segmentation masks are available. During training, the model optimizes the object detection and segmentation tasks jointly, resulting in more refined segmentation results.

[0155] 4. Results Output

[0156] The trained model will infer from the input image and output a category label for each pixel, i.e., a semantic segmentation map of the image. In the application of tree point cloud modeling, each pixel in the semantic segmentation map will be classified into the following three categories:

[0157] Tree trunk: The tree trunk includes the main trunk and main branches. The main branches are the large branches that grow from the trunk and support more secondary branches.

[0158] Leaves: The leafy part of a tree. Leaves are usually located at the top of the tree or at the ends of branches, forming the tree's crown.

[0159] Background: This includes open areas, other objects, etc., and is usually labeled as background.

[0160] The output segmentation map can help clarify the structure of each region, providing necessary semantic information for subsequent segmentation processing after point cloud generation.

[0161] III. Image-based point cloud generation model:

[0162] Using COLMAP for 3D reconstruction based on UAV-acquired images is a key step in point cloud modeling. COLMAP is a powerful open-source image reconstruction software widely used to generate sparse and dense 3D point cloud models from images. COLMAP combines information from multiple images through methods such as feature matching, relative pose estimation, and incremental reconstruction to generate high-quality 3D reconstruction results.

[0163] 1. Feature matching

[0164] The first step in 3D reconstruction is to extract feature points from the image that can be used for matching. The SIFT (Scale Invariant Feature Transform) algorithm is widely used in this step because it can stably identify key points in the image at different scales and rotations. The specific process is as follows:

[0165] ① Feature Extraction: First, local feature points are extracted from each image using the SIFT algorithm. These feature points have unique descriptors in different images and can be used for matching between different images.

[0166] ② Feature matching: In COLMAP, nearest neighbor matching algorithms (such as brute-force matching or fast matching methods based on KD trees) are typically used to find identical or similar feature points between images. The brute-force matching method calculates the distance between each pair of feature points and selects the smallest distance as the matching result; while KD tree matching quickly finds the nearest neighbor feature points by constructing a spatial index of feature points.

[0167] ③ Matching optimization: In order to improve the accuracy of matching, COLMAP can use algorithms such as RANSAC (random sample consensus) to eliminate mismatches and ensure that the matching of feature points is more accurate.
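The brute-force matching step can be illustrated without any vision library. The sketch below adds a ratio test (a filter commonly paired with SIFT descriptors), which is an assumption on top of the text above, and uses toy 4-D vectors in place of 128-D SIFT descriptors. It is not COLMAP's internal matcher.

```python
import math

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Brute-force nearest-neighbour matching with a ratio test.

    Returns (index_in_a, index_in_b) pairs. A match is kept only when the best
    distance is clearly smaller than the second best, which rejects ambiguous
    matches before geometric verification (e.g. RANSAC).
    """
    matches = []
    for i, da in enumerate(desc_a):
        ranked = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if not ranked:
            continue
        best = ranked[0]
        if len(ranked) == 1 or best[0] < ratio * ranked[1][0]:
            matches.append((i, best[1]))
    return matches

# Toy 4-D descriptors standing in for 128-D SIFT vectors.
desc_a = [(1.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0), (0.5, 0.5, 0.0, 0.0)]
desc_b = [(0.0, 0.9, 0.1, 0.0), (0.9, 0.0, 0.0, 0.1)]
matches = match_descriptors(desc_a, desc_b)
```

The third descriptor in `desc_a` is equally far from both candidates, so the ratio test discards it instead of forcing an arbitrary match.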

[0168] 2. Relative pose calculation

[0169] COLMAP can estimate the relative pose between cameras by matching feature points, i.e., calculating the camera's rotation matrix and translation vector. This process is typically achieved through two main methods:

[0170] Five-point algorithm: The five-point algorithm is a commonly used algorithm for estimating the relative pose between cameras. It calculates the relative rotation and translation between cameras based on five pairs of matching points. By solving this five-point problem, the relative pose of the cameras can be obtained. Since there are multiple solutions in the calculation process, RANSAC is usually used to eliminate incorrect matching points to ensure the accuracy of the estimated pose.

[0171] Nonlinear optimization: After the initial pose estimation, COLMAP further refines the relative poses between cameras through nonlinear optimization (such as minimizing reprojection error). The nonlinear optimization algorithm iteratively minimizes the reprojection error of each feature point, making the projected positions of all matching points as close as possible to their true spatial positions.

[0172] The accuracy of relative pose calculation directly affects the accuracy of subsequent 3D reconstruction. Therefore, optimization steps are crucial for improving reconstruction results.

[0173] 3. Incremental Reconstruction

[0174] Incremental reconstruction is one of the core technologies of COLMAP for generating point cloud models. This method typically starts with an initial relative pose estimation, and then gradually adds more images and continuously optimizes the model.

[0175] ① Selection of initial image pairs: First, COLMAP selects a pair of images with many matching feature points from the image set as the initial image pair for 3D reconstruction. The relative pose of this pair of images is usually estimated using the five-point algorithm and optimization steps described above.

[0176] ② Incremental Image Addition: After obtaining the initial 3D model, COLMAP gradually adds new images, attempting to match the feature points of the new images with those in existing images. As more images are added, the 3D reconstruction gradually expands, and relative pose optimization is performed each time a new image is added.

[0177] ③ Optimization and Adjustment: With each new image, COLMAP continuously optimizes the camera pose and point positions. This process includes refining camera parameters and 3D point cloud data through bundle adjustment. Bundle adjustment effectively reduces deviations caused by measurement errors, thereby improving the overall quality of the point cloud.

[0178] Incremental reconstruction can progressively expand the 3D model and ensure that the results at each step are optimized, thereby generating high-quality point clouds.

[0179] 4. Point Cloud Generation

[0180] Through the steps described above, COLMAP will output a preliminary sparse point cloud, which serves as the foundational data for 3D reconstruction. A point cloud is a collection of coordinates of matching feature points in an image within a 3D space. While the initially generated point cloud may contain some noise and inaccuracies, it provides the basis for further 3D reconstruction and optimization.

[0181] ① Sparse point cloud: In the initial stage, the generated point cloud is a sparse point cloud, that is, it contains a small number of high-quality feature points. These points are mostly unique feature points identified from the image, and they form a preliminary model framework in three-dimensional space.

[0182] ② Point cloud quality optimization: As more images are added and optimized, the point cloud gradually increases, and the connections between points become more refined, eventually forming a relatively complete 3D point cloud structure. Through continuous incremental reconstruction and optimization, COLMAP gradually improves the accuracy and detail of the generated point cloud.

[0183] ③ Sparse to Dense Transformation: The initial sparse point cloud can serve as a foundation, which can be further refined using the dense reconstruction algorithm in COLMAP. Dense reconstruction will perform a more refined 3D reconstruction of each pixel, generating a denser and more accurate point cloud.
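The sparse-to-dense workflow described in this section corresponds roughly to the following sequence of COLMAP command-line steps. The helper below only builds the argument lists and does not execute anything. The directory layout (`database.db`, `sparse/0`, `dense/fused.ply`) is an assumption for illustration, and the dense stereo step generally requires a CUDA-capable GPU.

```python
from pathlib import Path

def colmap_commands(image_dir, workspace):
    """Build argument lists for a sparse + dense COLMAP run (nothing is executed)."""
    ws = Path(workspace)
    db = str(ws / "database.db")
    sparse = str(ws / "sparse")   # must exist before the mapper runs
    dense = str(ws / "dense")
    return [
        ["colmap", "feature_extractor", "--database_path", db,
         "--image_path", image_dir],
        ["colmap", "exhaustive_matcher", "--database_path", db],
        ["colmap", "mapper", "--database_path", db, "--image_path", image_dir,
         "--output_path", sparse],
        ["colmap", "image_undistorter", "--image_path", image_dir,
         "--input_path", str(Path(sparse) / "0"), "--output_path", dense,
         "--output_type", "COLMAP"],
        ["colmap", "patch_match_stereo", "--workspace_path", dense,
         "--workspace_format", "COLMAP"],
        ["colmap", "stereo_fusion", "--workspace_path", dense,
         "--workspace_format", "COLMAP", "--input_type", "geometric",
         "--output_path", str(Path(dense) / "fused.ply")],
    ]

commands = colmap_commands("images", "work")
# To run: import subprocess, then subprocess.run(cmd, check=True) for each cmd.
```

The first three commands produce the sparse point cloud and camera poses; the last three perform dense reconstruction.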

[0184] IV. Point cloud model processing:

[0185] After generating a point cloud model, further processing steps are usually required to ensure the quality of the point cloud data and remove unwanted noise. This is especially crucial in tree modeling, where extracting leaves and trunks from the raw point cloud and removing environmental debris and noise is essential. The following are two key steps in point cloud data processing: extracting tree parts from the point cloud and denoising using the SOR filtering algorithm.

[0186] 1. Extracting tree parts from point clouds

[0187] (1) Image segmentation results and point cloud data registration

[0188] Point cloud data was generated using COLMAP, and this point cloud data is related to the image data through the camera's viewpoint. Therefore, it is necessary to register the image segmentation results with the point cloud data. The projection matrix for each image can be calculated using the camera's intrinsic and extrinsic parameters. The projection matrix projects points in three-dimensional space onto a two-dimensional image plane. The projection matrix is typically: P = K[R|t], where K is the camera's intrinsic parameter matrix, and [R|t] is the rotation matrix and translation vector provided by the camera's extrinsic parameters.

[0189] The pixel's position in 3D space is calculated using back projection. By combining the pixel's 2D coordinates (u, v) with the camera's intrinsic parameter matrix K, the corresponding ray direction can be obtained. This represents the direction of the ray containing the image pixel (u,v) in three-dimensional space. By combining this with the depth information in the point cloud (generated from COLMAP), if there is a point in the point cloud that projects to the same location as this pixel, the three-dimensional coordinates of that pixel can be determined.

[0190] (2) Map the image segmentation labels to the point cloud

[0191] In image segmentation, each pixel is assigned a label (e.g., "leaf," "trunk," "background"). These labels are then back-projected into the point cloud using the previously obtained projection matrix, ensuring that pixels belonging to the same category in the image also have their corresponding labels in the point cloud. For each image pixel, its 3D coordinates are calculated using the projection matrix to find its corresponding point in the 3D point cloud, and then the pixel's label is passed to that point. Ultimately, each point in the point cloud will contain a label indicating whether it belongs to the category of trunk, leaf, or background, etc.
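One way to realize this mapping is to go from the point cloud to the images: project each 3D point into every view with P = K[R|t], read the label at the pixel it lands on, and take a majority vote. The sketch below assumes zero skew and ignores occlusion (a point hidden behind foliage still receives a vote), so it is an illustration of the projection step only. All names and values are hypothetical.

```python
import math
from collections import Counter

def project(point, K, R, t):
    """Apply P = K[R|t] to a world point; returns (u, v, depth) or None if behind the camera."""
    cam = [sum(R[r][c] * point[c] for c in range(3)) + t[r] for r in range(3)]
    if cam[2] <= 0:
        return None
    u = (K[0][0] * cam[0] + K[0][2] * cam[2]) / cam[2]
    v = (K[1][1] * cam[1] + K[1][2] * cam[2]) / cam[2]
    return u, v, cam[2]

def label_points(points, views):
    """Majority-vote label per point over all views in which it is visible."""
    labels = []
    for p in points:
        votes = Counter()
        for K, R, t, label_map in views:
            hit = project(p, K, R, t)
            if hit is None:
                continue
            col, row = math.floor(hit[0]), math.floor(hit[1])
            if 0 <= row < len(label_map) and 0 <= col < len(label_map[0]):
                votes[label_map[row][col]] += 1
        labels.append(votes.most_common(1)[0][0] if votes else None)
    return labels

K = [[100.0, 0.0, 2.0], [0.0, 100.0, 2.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]
label_map = [["background", "leaf", "leaf", "leaf"],
             ["background", "leaf", "leaf", "leaf"],
             ["background", "trunk", "trunk", "trunk"],
             ["background", "trunk", "trunk", "trunk"]]
points = [(0.0, 0.0, 10.0), (-0.015, 0.0, 1.0), (1.0, 0.0, 1.0)]
point_labels = label_points(points, [(K, R, t, label_map)])
```

The third point projects outside the 4x4 image and therefore stays unlabeled, which is the case that the nearest-neighbour label propagation is meant to handle.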

[0192] (3) Semantic segmentation of point cloud data

[0193] Iterate through each point in the point cloud data, checking the label of each point. If the label of a point is "trunk", add the point to the trunk region. If the label of a point is "leaf", add the point to the leaf region.

[0194] 2. Point cloud denoising and optimization using the SOR filtering algorithm.

[0195] During the point cloud data generation process, noise is often generated due to various factors (such as shooting angle, object occlusion, etc.). In order to improve the quality of point clouds, the SOR (Statistical Outlier Removal) filtering algorithm is needed to filter out outliers and remove unreliable data points.

[0196] The SOR filtering algorithm is a statistical point cloud denoising method. It determines whether a point is an outlier by calculating the distribution of points in its neighborhood. The specific steps are as follows:

[0197] ① Calculate the neighborhood of each point: For each point in the point cloud, first define its neighborhood, which is the other points within a certain radius around the point.

[0198] ② Calculate statistics: For each point, calculate the distances between all points in its neighborhood. These distances are typically measured using Euclidean distance. Then, calculate the mean and standard deviation of these distances.

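[0199] $$\mu(p_i) = \frac{1}{|N(p_i)|} \sum_{p_j \in N(p_i)} d(p_i, p_j), \qquad \sigma(p_i) = \sqrt{\frac{1}{|N(p_i)|} \sum_{p_j \in N(p_i)} \left( d(p_i, p_j) - \mu(p_i) \right)^2}$$

[0200] Where $d(p_i, p_j)$ is the Euclidean distance between point $p_i$ and point $p_j$, $N(p_i)$ is the neighborhood of $p_i$, $\mu(p_i)$ is the average distance between neighboring points, and $\sigma(p_i)$ is the standard deviation of the distance between neighboring points.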

[0201] ③ Outlier Identification: Then, based on the distance statistics of each point's neighborhood, it is determined whether the point is an outlier. Typically, a threshold factor θ is set (for example, θ = 2, so that the threshold is the mean plus twice the standard deviation). If the distance between a point and its neighboring points is far above the mean, the point is considered an outlier:

[0202] $$d(p_i, p_j) > \mu(p_i) + \theta \cdot \sigma(p_i)$$

[0203] That is, if the distance of a point exceeds the preset threshold, the point is considered an outlier.

[0204] ④ Remove outliers: Points identified as outliers are removed from the point cloud data, leaving only valid and true point cloud data.

[0205] ⑤ Update point cloud: Remove outliers to obtain denoised point cloud data.
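For illustration, the sketch below implements the widely used k-nearest-neighbour form of statistical outlier removal. It differs from the per-point radius formulation above in two ways: the neighborhood is the k closest points instead of a fixed radius, and the mean and standard deviation are taken over the whole cloud's average neighbour distances. The parameters `k` and `theta` and the toy data are assumptions.

```python
import math
from statistics import mean, pstdev

def sor_filter(points, k=3, theta=1.0):
    """Statistical outlier removal, k-nearest-neighbour variant.

    1. For each point, average the distances to its k nearest neighbours.
    2. Compute the mean and standard deviation of those averages over the cloud.
    3. Drop points whose average exceeds mean + theta * std.
    """
    if len(points) <= k:
        return list(points)
    avg_dist = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        avg_dist.append(mean(dists[:k]))
    mu, sigma = mean(avg_dist), pstdev(avg_dist)
    limit = mu + theta * sigma
    return [p for p, d in zip(points, avg_dist) if d <= limit]

# A 3x3 grid of points on a plane plus one stray point far away.
cloud = [(float(x), float(y), 0.0) for x in range(3) for y in range(3)]
stray = (10.0, 10.0, 10.0)
cloud.append(stray)
filtered = sor_filter(cloud, k=3, theta=1.0)
```

The brute-force distance computation is quadratic in the number of points; real point clouds would need a spatial index for the neighbour search.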

[0206] V. Generating a Mesh Model from a Point Cloud Model

[0207] The purpose of generating a mesh model is to transform sparse point cloud data into a continuous 3D surface model, enabling the model to be used in 3D visualization and subsequent analysis. The Poisson reconstruction algorithm is a commonly used method for this process, which generates a smooth, watertight, and continuous mesh by calculating the divergence of the normal vector field.

[0208] 1. Poisson Reconstruction Method

[0209] The Poisson reconstruction algorithm is based on the mathematical theory of the Poisson equation. Its goal is to derive a smooth 3D surface from known point cloud data (including point positions and normal vectors). The basic steps are as follows:

[0210] ①Estimation of normal vector

[0211] If the point cloud data does not directly provide normal vectors, the normal vector for each point needs to be estimated first. A common approach is to use the K-Nearest Neighbors (KNN) algorithm to find the set of neighboring points for each point, and then use these neighboring points to fit the normal vector through Principal Component Analysis (PCA).

[0212] K-Nearest Neighbors (KNN): For each point, find its k nearest neighbors and use these points for local fitting.

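[0213] PCA method: Perform PCA on the neighborhood points to obtain the principal component vectors. The normal vector is taken as the direction of least variance, i.e., the eigenvector of the neighborhood covariance matrix with the smallest eigenvalue; the two directions of largest variance span the local tangent plane.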
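A dependency-free sketch of PCA-based normal estimation is given below. It assumes the k nearest neighbours have already been gathered, and it finds the smallest-eigenvalue direction with a simple power iteration on a shifted matrix, where a numerical library would normally be used. The sign of the returned normal is arbitrary, so consistent orientation needs a separate step.

```python
import math

def covariance(points):
    """3x3 covariance matrix of a list of 3D points."""
    n = len(points)
    c = [sum(p[a] for p in points) / n for a in range(3)]
    return [[sum((p[a] - c[a]) * (p[b] - c[b]) for p in points) / n
             for b in range(3)] for a in range(3)]

def smallest_eigenvector(C, iterations=200):
    """Eigenvector of the smallest eigenvalue of a symmetric PSD 3x3 matrix.

    Power iteration on M = trace(C) * I - C: the largest eigenvalue of M
    corresponds to the smallest eigenvalue of C.
    """
    tr = C[0][0] + C[1][1] + C[2][2]
    M = [[(tr if a == b else 0.0) - C[a][b] for b in range(3)] for a in range(3)]
    v = [1.0, 1.0, 1.0]
    for _ in range(iterations):
        w = [sum(M[a][b] * v[b] for b in range(3)) for a in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def estimate_normal(neighbours):
    """Surface normal (up to sign) of a local patch given its nearest neighbours."""
    return smallest_eigenvector(covariance(neighbours))

# Points lying in the plane z = 0, so the normal should be along the z axis.
patch = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
         (1.0, 1.0, 0.0), (0.5, 0.2, 0.0)]
normal = estimate_normal(patch)
```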

[0214] ②Poisson Equation

[0215] The core of the Poisson reconstruction algorithm is to transform the normal vector field into a three-dimensional surface using the Poisson equation. The discrete form of the Poisson equation is:

[0216] $$\rho(x) = \nabla \cdot N(x)$$

[0217] Where N is the normal vector field, and ρ(x) is the density distribution of the point cloud. The goal of the algorithm is to compute a smooth surface function φ(x) from the divergence field ρ(x), i.e.:

[0218] Δφ=ρ(x)

[0219] Where Δ is the Laplacian operator, φ(x) represents the surface to be calculated, and the equation is finally solved numerically to obtain the three-dimensional surface of the point cloud.
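The link between the normal vector field and the Poisson equation can be made explicit. The short derivation below uses the symbols above and assumes the usual least-squares formulation of Poisson reconstruction, in which the gradient of the surface function is fitted to the normal field; the text itself does not spell this step out.

```latex
\begin{aligned}
E(\varphi) &= \int \lVert \nabla\varphi(x) - N(x) \rVert^{2}\,dx
  && \text{(fit the gradient of } \varphi \text{ to the normal field)} \\
\delta E = 0 \;&\Longrightarrow\; \nabla\cdot\nabla\varphi(x) = \nabla\cdot N(x)
  && \text{(Euler--Lagrange equation)} \\
\Delta\varphi(x) &= \rho(x), \qquad \rho(x) = \nabla\cdot N(x)
\end{aligned}
```

The surface is then extracted as a level set of the solved function φ(x).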

[0220] ③ Numerical solution and mesh generation

[0221] The Poisson equation is typically solved using numerical methods, such as the finite difference method or the conjugate gradient method. These methods allow us to estimate the surface location from point cloud data. Then, a triangulation algorithm (such as Delaunay triangulation) is used to convert the estimated surface into a triangular mesh, which can be used for subsequent 3D rendering and analysis.

[0222] ④ Mesh optimization

[0223] The initial mesh obtained through Poisson reconstruction may contain small isolated areas or lack smooth local details, thus requiring optimization. Common optimization methods include:

[0224] Removing small isolated faces: Faces with excessively small areas can be removed through connectivity analysis.

[0225] Smoothing: The mesh is further smoothed using a mesh smoothing algorithm (such as Laplacian smoothing).

[0226] 2. Octree Depth Control

[0227] In the Poisson reconstruction algorithm, an octree structure is used to manage the spatial distribution of point cloud data. The key role of the octree is to effectively organize the point cloud data by recursively dividing the space into multiple subspaces, thereby improving reconstruction efficiency. An octree is a hierarchical tree-like data structure primarily used to represent regions in three-dimensional space. Each octree node represents a cubic region, and the node is recursively divided into eight child nodes. Each child node contains the point data within that spatial region. By adjusting the depth of the octree, the fineness of the mesh can be controlled. A greater depth results in a more detailed division of space, with each region of the point cloud having higher resolution, thus generating a finer mesh; conversely, a smaller depth produces a coarser mesh, but reduces computational cost.

[0228] Choosing the depth of an octree

[0229] ① Larger depth: For dense and complex point clouds, increasing the depth of the octree can ensure richer mesh details and adapt to complex geometries. However, excessive depth will result in a large amount of computation and memory consumption.

[0230] ② Smaller depth: For sparse or relatively simple point cloud data, a smaller depth is sufficient to generate a coarse but sufficiently accurate mesh. A smaller depth reduces the consumption of computational resources and is suitable for simpler scenarios.

[0231] A reasonable depth selection needs to take into account the density and quality of the point cloud, as well as the accuracy requirements of the final application.
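The trade-off can be quantified: each additional octree level halves the edge length of the finest cell. The sketch below picks the smallest depth whose finest cell is no larger than a target resolution. The bounding-box size, target resolution, and the cap of 12 levels are illustrative assumptions.

```python
import math

def cell_size(bbox_extent_m, depth):
    """Edge length of the finest octree cell: the bounding cube is halved `depth` times."""
    return bbox_extent_m / (2 ** depth)

def depth_for_spacing(bbox_extent_m, target_cell_m, max_depth=12):
    """Smallest depth whose finest cell is no larger than `target_cell_m`, capped at max_depth."""
    depth = math.ceil(math.log2(bbox_extent_m / target_cell_m))
    return max(1, min(depth, max_depth))

# A 16 m bounding cube around a tree, aiming for cells of at most 5 cm.
depth = depth_for_spacing(bbox_extent_m=16.0, target_cell_m=0.05)
finest = cell_size(16.0, depth)
```

Because the number of cells at the finest level can grow by up to a factor of eight per level, the cap keeps memory use bounded for very fine targets.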

[0232] VI. Remove isolated small segments from the model

[0233] In point cloud models, there may be isolated small fragments or noise that need to be removed through connectivity analysis.

[0234] 1. Use PyMeshLab filters to remove isolated small fragments.

[0235] In point cloud models, there may be isolated small fragments or noise that typically do not contribute to the final 3D modeling and may even interfere with subsequent processing. PyMeshLab provides a filter that allows these small fragments to be removed by calculating the diameter of connected components.

[0236] How to do it: Use PyMeshLab

[0237] The `ms.meshing_remove_connected_component_by_diameter()` function determines whether to remove a small component based on its diameter. The algorithm first identifies all connected components in the point cloud and calculates the maximum diameter of each component. If the diameter of a component is smaller than a set threshold (e.g., 0.5 meters), it is considered an isolated small fragment or noise and is removed. This ensures that only parts closely related to the target structure are ultimately retained.

[0238] 2. Use graph algorithms (Depth-First Search / Breadth-First Search) to identify connected components.

[0239] Besides using the functions provided by PyMeshLab directly, graph algorithms can also be used to analyze the connectivity of point cloud data. Treating the point cloud as nodes in a graph, an edge can be considered to exist between any two points when the distance between them is less than a certain threshold. Connected components in the graph can be identified using depth-first search (DFS) or breadth-first search (BFS) algorithms.

[0240] Operation method: First, construct the distance matrix of the point cloud and calculate the distance between every two points. Next, use a depth-first search (DFS) or breadth-first search (BFS) algorithm to traverse these points and identify all connected point sets. Then, for each connected component, calculate its maximum diameter (i.e., the distance between the farthest points within the component). If the diameter of a component is less than a set threshold, the component is considered an isolated small segment and is removed from the point cloud.
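The operation described above can be sketched with a breadth-first search. Instead of storing a full distance matrix, this version computes distances on the fly, which gives the same components with less memory. The linking radius, the diameter threshold, and the toy data are assumptions.

```python
import math
from collections import deque

def connected_components(points, radius):
    """Group point indices into components; two points are linked when closer than `radius`."""
    unvisited = set(range(len(points)))
    components = []
    while unvisited:
        seed = unvisited.pop()
        queue, component = deque([seed]), [seed]
        while queue:
            i = queue.popleft()
            near = [j for j in unvisited if math.dist(points[i], points[j]) < radius]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                component.append(j)
        components.append(sorted(component))
    return components

def diameter(points, component):
    """Largest pairwise distance inside a component."""
    return max((math.dist(points[a], points[b]) for a in component for b in component),
               default=0.0)

def remove_small_components(points, radius, min_diameter):
    """Keep only the points of components whose diameter reaches `min_diameter`."""
    keep = [i for comp in connected_components(points, radius)
            if diameter(points, comp) >= min_diameter for i in comp]
    return [points[i] for i in sorted(keep)]

trunk = [(0.0, 0.0, 0.1 * k) for k in range(11)]   # 1 m tall column of points
speck = [(5.0, 5.0, 0.0), (5.0, 5.0, 0.1)]         # isolated fragment
cleaned = remove_small_components(trunk + speck, radius=0.15, min_diameter=0.5)
```

Both the neighbour search and the diameter computation are quadratic in the number of points, so a spatial index is advisable for large clouds.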

[0241] 3. Further simplify the model based on distance and diameter constraints.

[0242] To further improve processing efficiency and remove redundant data, diameter constraints can be added when calculating connectivity and component diameters. This effectively simplifies the model, retaining only those sufficiently large connected components.

[0243] Operation method: First, by calculating the distance between points, determine which points belong to the same connected component. Then, based on the component's diameter constraint, remove connected components with a diameter smaller than a threshold. In this way, only the truly meaningful parts within the target region are retained in the point cloud, reducing the number of invalid points.

[0244] These three methods help remove noise and isolated small fragments from point cloud models. PyMeshLab's filters are the most direct tool, removing small components by using a diameter threshold; graph algorithms further enhance connectivity analysis, enabling more precise identification and removal of isolated parts; finally, by combining diameter constraints and distance calculations, the model is further optimized to remove redundant data.

[0245] VII. Check the watertightness of the model

[0246] Ensuring the generated mesh model is watertight means that the model surface is continuous, without holes or cracks, making it suitable for subsequent physical simulations, analysis, and virtual modeling. Watertightness is an important criterion for generating effective 3D models, especially in engineering simulations such as finite element analysis (FEA), where watertightness ensures that the model has no faulty surfaces, boundaries, or fractures.

[0247] Here are the detailed steps:

[0248] 1. Use the compute_volume function in PyMeshLab to check the watertightness of the model.

[0249] PyMeshLab provides a function for checking the watertightness of a mesh: `compute_volume()`. The core idea of this function is to determine the presence of holes or cracks by calculating the volume of the model. If the mesh has defects, the volume calculation will result in errors, typically returning illogical negative values or zero. Conversely, if the returned volume is positive and reasonable, it indicates that the mesh is closed and watertight.

[0250] Volume is calculated by integrating the geometry of the model's surface. In three-dimensional space, a closed mesh model represents a closed space. When calculating volume, if the mesh surface is continuous and without holes, the volume calculated by integration should be positive and reasonable. If the mesh has cracks or openings, the volume calculation will be incorrect due to the missed voids.

[0251] Operating steps:

[0252] ① Call the compute_volume function: First, use the compute_volume() function in PyMeshLab to calculate the volume of the model. PyMeshLab will automatically check the boundary conditions of the model.

[0253] If the returned volume is valid and positive, it indicates that the model is watertight.

[0254] If the return value is negative or zero, it indicates that the model may have cracks or holes.

[0255] ② Volume verification: The returned volume value should match the expectation. Generally, larger models should have larger volumes, and negative or zero values should not appear. At this point, further check whether there are any discontinuous parts in the model.
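To show what such a check involves, the sketch below works directly on a triangle list and does not call PyMeshLab. It combines a topological test (every edge shared by exactly two triangles) with a signed volume computed by summing tetrahedra against the origin. The edge test is added here as a complementary assumption, because the signed volume of an open mesh depends on where the origin lies and is therefore not conclusive by itself.

```python
from collections import Counter

def is_watertight(faces):
    """True when every undirected edge is shared by exactly two triangles."""
    edges = Counter()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            edges[(min(u, v), max(u, v))] += 1
    return bool(edges) and all(n == 2 for n in edges.values())

def signed_volume(vertices, faces):
    """Sum of signed tetrahedron volumes (origin, a, b, c); positive for outward-facing triangles."""
    total = 0.0
    for ia, ib, ic in faces:
        (ax, ay, az), (bx, by, bz), (cx, cy, cz) = vertices[ia], vertices[ib], vertices[ic]
        total += (ax * (by * cz - bz * cy)
                  - ay * (bx * cz - bz * cx)
                  + az * (bx * cy - by * cx)) / 6.0
    return total

# Unit right tetrahedron with outward-facing triangles; its volume is 1/6.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
faces = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
closed = is_watertight(faces)
volume = signed_volume(vertices, faces)
open_faces = faces[:-1]  # drop one triangle to create a hole
```

A negative volume on a closed mesh usually indicates inward-facing normals, not a hole, so the two tests are best read together.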

[0256] 2. Check the model boundaries

[0257] If the volume value returned by compute_volume() is not as expected (e.g., it returns a negative value or zero), further inspection of the model's surfaces is required, especially the mesh boundaries. The following are the inspection and repair steps:

[0258] Operating steps:

[0259] ① Visualize the model: Use 3D visualization tools (such as MeshLab, ParaView, etc.) to load the mesh model and examine it in detail, focusing on the boundary areas of the model. Common inspection methods include: zooming in to examine areas where cracks or voids may exist; observing whether the model surface is smooth and free of sharp edges or protrusions.

[0260] ② Inspect for cracks and holes: Check for invisible micro-cracks or holes by comparing the distribution of the mesh on its outer surface and inside. These cracks may be caused by noise during the reconstruction process or incomplete data acquisition.

[0261] ③ Manually repair defects: If cracks or holes are found, they can be repaired using mesh repair tools (such as the "Fill Holes" tool in MeshLab or the patch function in Blender). When manually repairing, the boundaries of the model can be reinforced to ensure that each facet is correctly stitched together.

[0262] ④ Refine the mesh: If there are large gaps or discontinuities at the model boundary, use a mesh refinement or smoothing algorithm to further optimize the model surface and ensure surface continuity.

[0263] ⑤ Recalculate the volume: After repairing the model, call the `compute_volume()` function again to ensure that the modified model volume is valid and positive. If the volume calculation is normal, it means that the model repair was successful and the watertightness is guaranteed.

[0264] 3. Further repair and optimization of the model.

[0265] After volumetric checks and manual adjustments, the model's watertightness may have been initially guaranteed, but further optimization may be needed to ensure high-quality model output. The following methods can be used as a reference:

[0266] ① Use the closing operation: Some mesh repair tools have an automatic mesh closing operation. These tools can fill gaps on the mesh surface and repair gaps in the model.

[0267] ② Mesh simplification and smoothing: By performing mesh simplification and smoothing operations, excessive details are removed and the edges of the mesh are smoothed to eliminate potential discontinuities.

[0268] ③ Reconstructing normal vectors and surface smoothing: For the repaired model, the normal vectors need to be recalculated to ensure that the surface orientation is consistent. Correctly repairing the normal vectors avoids simulation problems caused by reversed normals.
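As a non-limiting illustration of the normal-consistency requirement, the following Python sketch detects faces whose winding disagrees with that of their neighbors. It relies on the fact that, in a consistently oriented manifold triangle mesh, two faces sharing an edge traverse that edge in opposite directions, so no directed edge occurs more than once. The function names are hypothetical, and the sketch only detects the problem; it does not repair it.

```python
from collections import Counter


def misoriented_edges(faces):
    """Return directed edges that are traversed more than once.

    A non-empty result means at least one pair of adjacent faces has
    inconsistent winding, i.e. one of their normals is reversed
    (or the mesh is non-manifold along that edge).
    """
    directed = Counter()
    for i, j, k in faces:
        for edge in ((i, j), (j, k), (k, i)):
            directed[edge] += 1
    return sorted(edge for edge, n in directed.items() if n > 1)


def flip_all(faces):
    """Reverse the winding (and hence the normal) of every face."""
    return [(i, k, j) for i, j, k in faces]


# Example: a tetrahedron in which the last face has been reversed.
tris = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 3, 2)]
print(misoriented_edges(tris))
```

If all faces are mutually consistent but the signed volume of the mesh is negative, reversing every face (as in `flip_all`) makes the normals point outward.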

[0269] 8. Mesh to Topology Conversion

[0270] Mesh-to-topology conversion transforms a generated 3D mesh model into a topology suitable for use by finite element analysis (FEA) software, especially for importing into engineering simulation tools such as ABAQUS for physical simulations. The key to this process is ensuring a smooth transition from a meshed geometric model to a mathematically accurate and physically realistic topological model for subsequent analysis. Common conversion methods include using the OpenCASCADE kernel for geometric processing and exporting the model as a STEP file for reading by software such as ABAQUS.

[0271] 1. OpenCASCADE kernel conversion

[0272] OpenCASCADE is an open-source CAD (Computer-Aided Design) kernel widely used in geometric modeling, topology analysis, and transformation tasks. Its main function is to extract accurate geometric surfaces and solid models from mesh models and convert these models into mathematical representations more suitable for finite element analysis. OpenCASCADE can effectively handle non-manifold structures (such as discontinuous surfaces or irregular geometry) in mesh models, transforming them into clear solid models.

[0273] Conversion steps:

[0274] ① Mesh to Geometric Model Conversion: Mesh models typically consist of discrete triangular mesh patches that represent approximate surfaces of an object. However, such surfaces may not perfectly conform to strict geometric definitions. Using OpenCASCADE, the discrete patches in the mesh model are first converted into continuous geometric surfaces using the BRep (boundary representation) function library. For example, OpenCASCADE can fit meshed triangular patches to accurate NURBS (non-uniform rational B-spline) surfaces or other suitable geometric representations.

[0275] ② Handling Non-manifold Structures: Non-manifold structures in a mesh model refer to complex boundary conditions, such as multiple faces sharing an edge or a point, which may lead to geometric discontinuities or physically unresolvable structures. OpenCASCADE provides various algorithms for automatically identifying and repairing these non-manifold parts. Through topology repair, OpenCASCADE can convert these irregular mesh structures into more standard geometric and topological representations.

[0276] ③ Solid and Surface Extraction: OpenCASCADE uses solid extraction (such as basic geometric shapes like cubes, cylinders, and spheres) and surface extraction (such as the plane of a polyhedron) functions to transform meshed surface and solid structures into real mathematical geometric objects, which are more suitable for engineering analysis.

[0277] ④ Conversion effect: Through the processing of OpenCASCADE, the final generated geometric model will meet the strict requirements of mathematics and physics for geometric models, and can perform higher-precision finite element analysis, no longer limited by discrete patches in the mesh model.

[0278] 2. Generate STEP file

[0279] STEP (Standard for the Exchange of Product Model Data) is an international standard data exchange format used to transfer geometric and topological information between different CAD and CAE software. The STEP file format is widely used in engineering applications because it can accurately describe the geometry and topology of objects and supports various geometric representations such as NURBS and BRep.

[0280] By exporting the processed geometric model as a STEP file, the compatibility of the model across different software platforms can be ensured, especially the ability to import the model into simulation tools such as ABAQUS.

[0281] Steps to generate a STEP file:

[0282] ① Convert to standardized geometric format: In OpenCASCADE, the geometric and topological structures have already been converted into standard mathematical representations. At this point, the STEP writer class provided by OpenCASCADE (STEPControl_Writer, referred to here as STEPWriter) can be used to save the converted geometric and topological data as a STEP file.

[0283] ② Export Topology: Using the STEPWriter class, specify the output file path to export a STEP file containing geometric information and topology. This file will contain all the geometric data of the object (such as surfaces and solids) and the topological relationships between them (such as the connection of patches and the combination of solids).

[0284] ③ Ensure file compatibility: During the export process, the structure of the STEP file should conform to the standard STEP file specification (such as ISO 10303-21) to ensure that the file can be correctly read in simulation software such as ABAQUS. The generated STEP file will contain all the necessary geometric information, such as surfaces, boundaries, and solids, and will be in a format that can be recognized and processed by ABAQUS.

[0285] ④ Check the output STEP file: Load the generated STEP file in CAD software or a CAE tool and confirm that the geometric and topological information is correct, that is, that the geometric objects are as expected and that the file contains no redundant or erroneous geometry.

[0286] ⑤ Advantages of the STEP file format: The STEP file format has excellent cross-platform compatibility and is widely used in various engineering and simulation software. Because the STEP format can contain complex geometric information and topological structures, the exported model can be directly used in finite element analysis software such as ABAQUS without further conversion.
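As a non-limiting illustration of the compatibility and output checks in the steps above, the following Python sketch performs a purely structural check of the text of a STEP file in the ISO 10303-21 clear-text encoding, which opens with the keyword `ISO-10303-21;`, contains a `HEADER;` section and a `DATA;` section each closed by `ENDSEC;`, and ends with `END-ISO-10303-21;`. The function name `check_step_text` is hypothetical, and the check is deliberately simplified: it does not validate the geometry and does not replace loading the file in CAD or CAE software.

```python
def check_step_text(text):
    """Return a list of structural problems found in STEP (Part 21) text.

    An empty list means the basic file skeleton is present; it says
    nothing about whether the geometry inside is valid.
    """
    problems = []
    body = text.strip()
    if not body.startswith("ISO-10303-21;"):
        problems.append("missing opening keyword 'ISO-10303-21;'")
    if not body.endswith("END-ISO-10303-21;"):
        problems.append("missing closing keyword 'END-ISO-10303-21;'")
    for section in ("HEADER;", "DATA;"):
        if section not in body:
            problems.append("missing section " + section)
    if body.count("ENDSEC;") < 2:
        problems.append("fewer than two 'ENDSEC;' terminators")
    return problems


# Example with a minimal hand-written file body.
sample = (
    "ISO-10303-21;\n"
    "HEADER;\n"
    "FILE_DESCRIPTION((''),'2;1');\n"
    "ENDSEC;\n"
    "DATA;\n"
    "#1=CARTESIAN_POINT('',(0.,0.,0.));\n"
    "ENDSEC;\n"
    "END-ISO-10303-21;\n"
)
print(check_step_text(sample))
```

In practice the text would be read from the exported `.stp` or `.step` file before being passed to such a check.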

[0287] 3. Import into ABAQUS for finite element analysis.

[0288] Once a standard STEP file is generated and its geometry and topology meet the requirements, it can be imported into ABAQUS for subsequent finite element analysis. ABAQUS recognizes the geometry in the STEP file as a part, on which mesh generation and material property assignment are then carried out based on the model's topological relationships.

[0289] After import, the geometric information from the STEP file is discretized into a finite element mesh suitable for simulation. Users can then further refine the mesh, adjust material properties, and define boundary conditions as needed.

[0290] 9. Import the STEP model into ABAQUS

[0291] Importing the STEP model into ABAQUS for further modeling and simulation analysis is a crucial step in the entire engineering simulation process. By importing the converted STEP file into ABAQUS, the geometric model and topology can be seamlessly used for simulation analysis. The following are detailed steps.

[0292] 1. Import STEP files into ABAQUS

[0293] Importing STEP files into ABAQUS is the first step in ensuring that the geometric model and topology are correctly loaded and recognized. ABAQUS supports direct import of STEP format files, which can parse the geometric information of the model and construct a topology suitable for further analysis.

[0294] Step-by-step instructions:

[0295] ① Start ABAQUS: Open ABAQUS/CAE and create a new model database (File > New Model Database).

[0296] ② Import the STEP file: In ABAQUS / CAE, select File > Import > Part. In the pop-up file selection window, select the previously saved STEP file (.stp or .step format). ABAQUS will automatically read the file and parse its geometric data, including the model's surfaces, solids, and their topological relationships.

[0297] ③ Check the model: After importing, ABAQUS will display the geometry of the model. Check if the model is loaded correctly and ensure that all geometric elements (such as faces, edges, points, etc.) have been correctly identified. If any errors or warnings occur during the import process, you may need to manually repair them or check whether the STEP file conforms to the standard or contains complex non-manifold structures.

[0298] 2. Assign material properties and loading conditions to the tree model.

[0299] After importing the geometric model into ABAQUS, the next step is to assign appropriate material properties and loading conditions to the model. Correct material properties and loading conditions are prerequisites for ensuring accurate simulation results.

[0300] Step-by-step instructions:

[0301] ① Define material properties: In ABAQUS, select the Property module and create a new material.

[0302] Assign appropriate material properties to trees based on their physical characteristics (such as density, elastic modulus, and Poisson's ratio). For the tree material model, a simple elastic material model can be chosen, or a more complex nonlinear model can be defined as needed. The main parameters are: elastic modulus (E), which is related to the stiffness of the tree material; Poisson's ratio (ν), which describes the lateral deformation of the tree under tension or compression; and density (ρ), which determines the tree's mass and is required for problems involving gravity loading.

[0303] ② Apply loading conditions: In ABAQUS, select the Load module and apply appropriate loading conditions to the model, using static or dynamic loads according to the simulation requirements. Examples include force loads (such as wind or gravity) acting on the top of the tree or on a specific area, and pressure loads (such as soil pressure) acting on the roots or on certain parts of the tree.

[0304] ③ Select appropriate boundary conditions, such as fixing the bottom of the tree and restricting certain degrees of freedom of the trunk, to ensure that the model can correctly simulate the actual stress conditions.

[0305] ④ Define Contact and Constraints: If there is contact between multiple components (e.g., the contact between a tree and the ground), contact pairs need to be defined in ABAQUS. Select the appropriate contact pair in the Interaction module and define parameters such as friction coefficient and contact mechanics to ensure that the interaction between the tree and other components is correctly simulated.

[0306] 3. Mesh Generation and Analysis

[0307] Meshing is the process of discretizing a geometric model, with the aim of transforming a continuous physical problem into a finite element model. Appropriate mesh generation can ensure analytical accuracy and influence computational efficiency.

[0308] Step-by-step instructions:

[0309] ① Create a mesh: In the Mesh module, select the model parts that need to be meshed (such as different parts of a tree).

[0310] ② Choose the mesh type: For tree models, tetrahedral (Tet) or hexahedral (Hex) elements are typically used, depending on the complexity and accuracy requirements of the model. Tetrahedral meshes are suitable for complex geometries but are generally less computationally efficient. Hexahedral meshes are more accurate for regular shapes (such as cylinders) but may require further refinement for complex models.

[0311] ③ Set mesh density: Adjust the mesh size according to the simulation accuracy requirements. A denser mesh improves computational accuracy but significantly increases computational cost; a sparser mesh may lead to insufficient accuracy.

[0312] ④ Check mesh quality: After meshing, use the mesh quality check tool provided by ABAQUS to check the mesh quality and ensure that the mesh is not severely distorted or excessively deformed. Adjust the size and shape of the mesh appropriately to ensure that the mesh is sufficiently fine in critical areas (such as stress concentration regions).

[0313] ⑤ Select the analysis type: Based on the stress conditions of the tree model, select the appropriate analysis type. For example, static analysis is suitable for studying the static deformation or failure of trees under external forces; while dynamic analysis is suitable for simulating the vibration, swaying, or response of trees to dynamic loads (such as wind). Select the appropriate time step and solution method to set the analysis type.

[0314] ⑥ Run the simulation analysis: Create a new simulation task in the Job module, set the solution parameters, and select a suitable solver (such as a linear statics solver, explicit dynamics solver, etc.). Submit the job to begin the calculation and analysis. ABAQUS will perform finite element analysis based on the defined mesh and material model and generate the results.

[0315] ⑦ Analysis Results: After completing the analysis, view the results file. Common results include deformation diagrams, stress fields, strain fields, and displacements. The results can be visualized in the Visualization module, and the behavior of trees under different loads can be analyzed, such as tree displacement and stress concentration locations.
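As a non-limiting illustration of the mesh quality check in step ④ above, the following Python sketch flags tetrahedral elements that are inverted, degenerate, or strongly elongated. The two criteria used here (non-positive signed volume, and the ratio of longest to shortest edge exceeding a hypothetical limit `max_ratio`) are assumptions chosen for simplicity; they are not the quality measures applied by ABAQUS itself.

```python
import math
from itertools import combinations


def tet_signed_volume(a, b, c, d):
    """Signed volume of tetrahedron (a, b, c, d): ((b-a) x (c-a)) . (d-a) / 6."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    return sum(cross[i] * w[i] for i in range(3)) / 6.0


def edge_length_ratio(a, b, c, d):
    """Ratio of the longest to the shortest of the six edges."""
    lengths = [math.dist(p, q) for p, q in combinations((a, b, c, d), 2)]
    return max(lengths) / min(lengths)


def flag_poor_elements(nodes, tets, max_ratio=5.0):
    """Return indices of elements that are inverted, flat, or elongated."""
    bad = []
    for idx, tet in enumerate(tets):
        pts = [nodes[n] for n in tet]
        # The volume test runs first, so zero-length edges (zero volume)
        # never reach the division inside edge_length_ratio.
        if tet_signed_volume(*pts) <= 0.0 or edge_length_ratio(*pts) > max_ratio:
            bad.append(idx)
    return bad


nodes = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 0, 0.01)]
tets = [(0, 1, 2, 3), (0, 1, 2, 4), (0, 2, 1, 3)]
print(flag_poor_elements(nodes, tets))
```

Elements flagged in this way are candidates for local remeshing or for a smaller seed size in the affected region.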

[0316] If the aforementioned functions are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this invention, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0317] The above description is merely a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any person skilled in the art can easily conceive of various equivalent modifications or substitutions within the technical scope disclosed in the present invention, and these modifications or substitutions should all be covered within the scope of protection of the present invention. Therefore, the scope of protection of the present invention should be determined by the scope of the claims.

Claims

1. A tree modeling method based on UAV-acquired images and image recognition, characterized in that it includes the following steps:
Step S1: Plan the drone's flight mode and flight path based on the size, shape, and tree distribution of the target area;
Step S2: Using the planned flight mode and flight path, fly the drone over the target area to collect multi-angle image data of trees, the multi-angle image data including a set of tree images taken from multiple different viewpoints and heights;
Step S3: Preprocess the multi-angle image data, including denoising, cropping, image enhancement, and standardization;
Step S4: Input the preprocessed multi-angle image data into the pre-trained neural network model for tree recognition and semantic segmentation, and output the semantic segmentation map of each tree image, each pixel in the semantic segmentation map being assigned a semantic label, and the neural network model being the Mask R-CNN model;
Step S5: Use the SIFT algorithm to extract feature points and their descriptors from each tree image, use the nearest neighbor matching algorithm to match feature points between multiple tree images based on the feature points and descriptors of each tree image, and use the RANSAC algorithm to remove mismatched feature points;
Step S6: Based on the matched feature points, construct a preliminary 3D point cloud model using COLMAP through an incremental reconstruction method;
Step S7: Calculate the 3D coordinates, in the preliminary 3D point cloud model, of each pixel in the semantic segmentation map obtained in step S4, determine the point cloud data corresponding to each pixel based on the 3D coordinates of that pixel, and pass the semantic label of each pixel to the corresponding point cloud data;
Step S8: Traverse each point cloud data point in the preliminary 3D point cloud model, segment the preliminary 3D point cloud model according to the semantic labels of the point cloud data, and obtain the tree 3D point cloud model;
Step S9: Use the SOR filtering algorithm to denoise and optimize the tree 3D point cloud model, obtain the 3D mesh model of the tree from the denoised and optimized point cloud by the Poisson reconstruction method, and construct the geometric model of the tree based on the 3D mesh model of the tree;
The loss function of the Mask R-CNN model is given by a formula [not reproduced in the source text] whose terms are: the loss function of the Mask R-CNN model; the loss weight matrix of the Mask R-CNN model, representing the loss weights between each predicted semantic label and the ground truth semantic label; the index of a pixel in the tree image; the true semantic label of that pixel; the predicted semantic label output by the Mask R-CNN model for that pixel; the probability, according to the Mask R-CNN model, that the pixel belongs to the predicted semantic label; and the total number of pixels in the tree image; the background label is 0, the trunk label is 1, and the leaf label is 2;
The loss weight matrix of the Mask R-CNN model is a matrix [values not reproduced in the source text] in which rows represent true semantic labels, columns represent predicted semantic labels, and values represent the loss weights between these semantic labels;
Step S7 includes the following steps:
Based on the two-dimensional coordinates $(u, v)$ of each pixel in the semantic segmentation map, and using the camera's intrinsic and extrinsic parameters, back-projection is performed through the projection matrix to calculate the 3D coordinates of the pixel in the preliminary 3D point cloud model:
$$P = d \cdot K^{-1} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}$$
where $P$ is the 3D coordinates of the pixel in the preliminary 3D point cloud model, $d$ is the depth information of the pixel, and $K$ is the intrinsic parameter matrix of the camera:
$$K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$$
where $f_x$ and $f_y$ are the camera's focal lengths, that is, the scaling factors along the x and y axes, respectively, and $(c_x, c_y)$ are the coordinates of the camera's principal point;
Find the point cloud data point in the preliminary 3D point cloud model that is closest to the 3D coordinates of the pixel;
Pass the semantic label of the pixel to that nearest point cloud data point.
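As a non-limiting illustration of the label transfer in step S7, the following Python sketch back-projects a pixel with known depth through a pinhole intrinsic model, optionally moves the result into the world frame, and passes the pixel's semantic label to the nearest point of a point cloud. All function names are hypothetical. The sketch assumes extrinsic parameters in the world-to-camera form X_cam = R·X_world + t, which is an assumption of this example rather than a statement of the claimed method, and it uses a brute-force nearest-point search for brevity.

```python
import math


def back_project(u, v, depth, fx, fy, cx, cy):
    """Pixel (u, v) with depth -> 3D point in the camera frame.

    Equivalent to depth * K^-1 * [u, v, 1]^T for a pinhole matrix K
    with focal lengths (fx, fy) and principal point (cx, cy).
    """
    return ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)


def camera_to_world(p_cam, R, t):
    """Invert X_cam = R X_world + t, i.e. X_world = R^T (X_cam - t)."""
    d = [p_cam[i] - t[i] for i in range(3)]
    return tuple(sum(R[r][c] * d[r] for r in range(3)) for c in range(3))


def transfer_label(point, cloud, labels, label):
    """Assign `label` to the cloud point nearest to `point`; return its index."""
    idx = min(range(len(cloud)), key=lambda i: math.dist(point, cloud[i]))
    labels[idx] = label
    return idx


# Example: identity pose, one pixel labelled 1 (trunk).
R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
t = (0.0, 0.0, 0.0)
cloud = [(0.0, 0.0, 0.0), (2.0, 0.0, 2.1), (5.0, 5.0, 5.0)]
labels = [None] * len(cloud)
p = camera_to_world(back_project(150, 40, 2.0, 100.0, 100.0, 50.0, 40.0), R, t)
transfer_label(p, cloud, labels, 1)
print(p, labels)
```

For point clouds of realistic size, the linear search would normally be replaced by a spatial index such as a k-d tree.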

2. The tree modeling method based on UAV image acquisition and image recognition according to claim 1, characterized in that the semantic labels include tree trunk, leaves, and background.

3. The tree modeling method based on UAV image acquisition and image recognition according to claim 1, characterized in that the training process of the Mask R-CNN model includes the following steps: Acquire a set of tree images taken from multiple different viewpoints and heights, and preprocess the tree image set, including denoising, cropping, image enhancement, and normalization; Use the image labeling tool LabelMe to manually label each region in the preprocessed tree image set, labeling the parts of the tree trunk that are covered by leaves as leaves, assigning a label to each pixel in the tree image, generating the corresponding segmentation mask, and converting the labeled tree image into a binary mask image; Input the preprocessed tree image set and the binary mask images into the Mask R-CNN model for training, the output being a binary mask image of each tree image; Based on the output binary mask image and the labeled binary mask image, calculate the loss of each pixel using the loss function, and use the calculated total loss value to update the model parameters through the backpropagation algorithm.

4. The tree modeling method based on UAV image acquisition and image recognition according to claim 1, characterized in that, The process of using the nearest neighbor matching algorithm to match feature points among multiple tree images based on local feature points of each tree image includes the following steps: Feature points and descriptors are extracted from each image. The feature points are identified by the Scale Invariant Feature Transform (SIFT) algorithm, and the descriptors are 128-dimensional vectors for each feature point. For every two tree images, descriptors are matched, the Euclidean distance between each pair of descriptors is calculated, and each feature point is matched based on the nearest Euclidean distance to obtain matching point pairs.
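As a non-limiting illustration of the descriptor matching described in this claim, the following Python sketch pairs each descriptor of one image with the descriptor of another image at the smallest Euclidean distance. The function name `match_descriptors` is hypothetical, the toy descriptors are three-dimensional rather than the 128-dimensional SIFT vectors, and the brute-force search is chosen only for clarity.

```python
import math


def match_descriptors(desc_a, desc_b):
    """Nearest-neighbour matching by Euclidean distance.

    Returns a list of (index_in_a, index_in_b) pairs, one per
    descriptor in desc_a. Mismatches are NOT filtered here; that is
    left to a later step such as RANSAC.
    """
    if not desc_b:
        return []
    matches = []
    for i, da in enumerate(desc_a):
        j = min(range(len(desc_b)), key=lambda n: math.dist(da, desc_b[n]))
        matches.append((i, j))
    return matches


# Toy example with 3-dimensional descriptors.
image_a = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
image_b = [(1.0, 1.0, 0.9), (0.0, 0.1, 0.0)]
print(match_descriptors(image_a, image_b))
```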

5. The tree modeling method based on UAV-acquired images and image recognition according to claim 1, characterized in that constructing the preliminary 3D point cloud model using COLMAP through an incremental reconstruction method based on the matched feature points includes the following steps: Step A1: Calculate the relative pose between images based on the matched feature point pairs; Step A2: Based on the relative pose and the matched feature point pairs, calculate the preliminary 3D point cloud through triangulation; Step A3: Add at least one new image to the reconstruction process, and calculate the pose of the new image using the matched feature point pairs between the new image and the existing 3D point cloud; Step A4: Based on the pose and matched feature points of the new image, update the 3D point cloud through triangulation and perform bundle adjustment to optimize the 3D point cloud and camera poses; Step A5: Repeat steps A3 and A4 to gradually update and optimize the 3D point cloud and generate the preliminary 3D point cloud model.

6. The tree modeling method based on UAV image acquisition and image recognition according to claim 1, characterized in that traversing each point cloud data point in the preliminary 3D point cloud model and segmenting the preliminary 3D point cloud model according to the semantic labels of the point cloud data to obtain the tree 3D point cloud model includes the following steps: Traverse the point cloud data points without semantic labels in the preliminary 3D point cloud model; for each such point, find the point cloud data point closest to it and set that point's semantic label as the semantic label of the unlabeled point currently being traversed; Traverse each point cloud data point in the preliminary 3D point cloud model and delete the points whose semantic label is background, to obtain the tree 3D point cloud model.

7. The tree modeling method based on UAV image acquisition and image recognition according to claim 1, characterized in that using the SOR filtering algorithm to denoise and optimize the 3D point cloud model of the tree includes the following steps:
For each point cloud data point $p_i$ in the 3D point cloud model of the tree, define its neighborhood $N(p_i)$, the neighborhood $N(p_i)$ representing the set of other point cloud data points within a certain radius of $p_i$;
Calculate the Euclidean distance $d_{ij}$ between each point cloud data point $p_i$ and every point $p_j$ in its neighborhood:
$$d_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2}$$
where $(x_i, y_i, z_i)$ and $(x_j, y_j, z_j)$ are the three-dimensional coordinates of $p_i$ and $p_j$ in the 3D point cloud model of the tree;
Calculate the mean distance $\mu_i$ and the standard deviation $\sigma_i$ of the neighboring points of each point:
$$\mu_i = \frac{1}{k_i} \sum_{p_j \in N(p_i)} d_{ij}, \qquad \sigma_i = \sqrt{\frac{1}{k_i} \sum_{p_j \in N(p_i)} \left(d_{ij} - \mu_i\right)^2}$$
where $k_i$ is the number of points in the neighborhood of $p_i$;
Based on the mean distance $\mu_i$ and standard deviation $\sigma_i$ of each point cloud data point, determine whether $p_i$ is an outlier according to an outlier condition [not reproduced in the source text] that involves a preset threshold factor;
Point cloud data points identified as outliers are removed from the 3D point cloud model of the tree to obtain a noise-reduced and optimized 3D point cloud model of the tree.
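As a non-limiting illustration of statistical outlier removal, the following Python sketch implements the widely used k-nearest-neighbor variant of the filter: the mean distance from each point to its `k` nearest neighbors is computed, and a point is discarded when that mean exceeds the global mean plus `alpha` times the global standard deviation. This variant is substituted here because the exact outlier condition of the claim is not legible in the present text; it differs from the claim in using a fixed number of neighbors rather than a radius. The function name `sor_filter` and the parameters `k` and `alpha` are hypothetical, and at least two input points are assumed.

```python
import math
import statistics


def sor_filter(points, k=3, alpha=1.0):
    """Statistical outlier removal, k-nearest-neighbour variant.

    Keeps a point when its mean distance to its k nearest neighbours
    is at most (global mean + alpha * global standard deviation).
    Brute-force O(n^2) search; requires at least two points.
    """
    mean_dists = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        nearest = dists[:k]
        mean_dists.append(sum(nearest) / len(nearest))
    mu = statistics.fmean(mean_dists)
    sigma = statistics.pstdev(mean_dists)
    threshold = mu + alpha * sigma
    return [p for p, d in zip(points, mean_dists) if d <= threshold]


# Example: the eight corners of a unit cube plus one distant stray point.
cube = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
cloud = cube + [(10.0, 10.0, 10.0)]
print(len(sor_filter(cloud)))
```

A smaller `alpha` removes points more aggressively; a larger `k` makes the estimate of local density less sensitive to small clusters of noise.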