A CUDA acceleration-based point cloud image fast density clustering method

CN122244479A · Pending · Publication Date: 2026-06-19 · NORTHEASTERN UNIV AT QINHUANGDAO

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
NORTHEASTERN UNIV AT QINHUANGDAO
Filing Date
2025-06-06
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Traditional density clustering algorithms have high computational complexity and slow processing speed when processing high-resolution, large-scale point cloud images, making it difficult to meet real-time requirements. Furthermore, they consume too many resources when executed on the CPU, resulting in response latency issues.

Method used

A fast density clustering method for point cloud images based on CUDA acceleration is adopted. Compute-intensive tasks run in parallel on the GPU: local neighborhoods are constructed, core points are identified, and initial clusters are built. Cluster merging is optimized with a disjoint-set data structure, normalized spatial distance is used to measure similarity, and cluster relationships are managed by combining neighborhood partitioning with a conflict matrix.

Benefits of technology

It significantly reduces computational complexity, improves parallelism and execution speed, and enhances clustering accuracy, making it suitable for large-scale, real-time 3D vision processing scenarios.

✦ Generated by Eureka AI based on patent content.

Abstract

This invention discloses a CUDA-accelerated fast density clustering method for point cloud images, applicable to dense point cloud image data acquired by binocular cameras. The method first acquires the point cloud image and divides it into local neighborhoods centered on each pixel. Next, based on the normalized spatial distance between pixels within a neighborhood, core points are identified and clusters are initialized, grouping pixels whose distance to the core point is less than a threshold into the same cluster. Subsequently, by constructing a conflict matrix between neighborhoods, overlapping relationships between different clusters are identified. A disjoint-set data structure is used to merge conflicting clusters, and neighborhoods are merged iteratively round by round, halving the number of neighborhoods in each round, ultimately completing global clustering. This method introduces a local neighborhood partitioning and disjoint-set cluster merging mechanism, effectively reducing global computational complexity. Combined with CUDA parallel acceleration, it significantly improves clustering speed, making it suitable for real-time processing applications of large-scale point cloud images.

Description

Technical Field

[0001] This invention relates to the fields of image processing and computer vision technology, and in particular to a fast density clustering method for point cloud images based on CUDA acceleration.

Background Technology

[0002] Point cloud images, as an important data form for describing three-dimensional spatial scenes, are widely used in various vision-based perception systems such as intelligent robots. In practical applications, in order to extract useful target information from raw point cloud images, it is often necessary to perform cluster analysis on the pixels in the point cloud image to identify objects or regions in space. Traditional density clustering methods, such as DBSCAN, perform well when processing low-dimensional data, but when dealing with high-resolution, large-scale point cloud images, they suffer from high computational complexity, slow processing speed, and difficulty in meeting real-time requirements.

[0003] Meanwhile, point cloud images typically have dense three-dimensional structural characteristics. Clustering requires calculating the neighborhood relationships and distance metrics between a large number of pixels in the spatial dimension. This results in a significant computational bottleneck when executing traditional clustering algorithms on a CPU. In particular, the process of constructing a neighborhood for each pixel and performing distance judgments and updating cluster numbers can easily lead to problems such as excessive resource consumption and long response delays.

[0004] In recent years, GPU parallel computing technology, especially NVIDIA's CUDA (Compute Unified Device Architecture), has brought new acceleration methods to the field of image processing. By executing intensive computing tasks in parallel on GPUs, the processing efficiency of clustering algorithms can be significantly improved. However, how to efficiently map density clustering algorithms to the CUDA architecture and solve the data dependency and synchronization problems in neighborhood partitioning, cluster labeling, and conflict merging remain challenges in current research and engineering practice.

[0005] Therefore, there is an urgent need for a fast density clustering method that fully utilizes the parallel computing capabilities of GPUs and suits the characteristics of point cloud image processing, so as to meet the requirements of efficient and stable real-time clustering.

Summary of the Invention

[0006] This invention provides a fast density clustering method for point cloud images based on CUDA acceleration, for point cloud data captured by a stereo camera.

[0007] This invention adopts the following technical solution: a fast density clustering method for point cloud images based on CUDA acceleration, comprising the following steps:

[0008] S1: Point cloud image acquisition: Use a binocular camera to acquire point cloud images and store the 3D coordinate data obtained after preprocessing the point cloud images into the GPU memory;

[0009] S2: Neighborhood division: Construct a neighborhood region by expanding a preset pixel range outward from each pixel in the image as the center;

[0010] S3: Marking the core point and constructing the initial cluster: In each neighborhood, calculate the normalized spatial distance between the center pixel and the neighboring pixels. If the number of pixels in the neighborhood whose distance from the center pixel is less than a set threshold is greater than or equal to the density threshold, then mark the center pixel as the core point, and classify the pixels in the neighborhood whose distance from the center pixel is less than the set threshold into the same initial cluster.

[0011] S4: Constructing the conflict matrix: For each row in the image, if there are multiple neighborhood regions, adjacent neighborhoods are processed in pairs (e.g., the 1st and 2nd neighborhoods, the 3rd and 4th neighborhoods), and their overlapping areas are extracted as merging judgment areas; by analyzing the cluster numbers of the pixels in the overlapping area, the cases where pixels at the same position belong to different clusters are identified, and such conflict pairs are recorded in the conflict matrix to represent the association between each cluster; if a row contains only one neighborhood, then adjacent rows in the image are processed in the same way (e.g., the 1st and 2nd rows, the 3rd and 4th rows), and the conflict matrix between vertical neighborhoods is constructed in the same way;

[0012] S5: Cluster Expansion: Based on the conflict matrix constructed in step S4, a disjoint-set data structure is constructed to efficiently manage the connection and merging relationships between clusters; the disjoint-set algorithm is used to uniformly number the clusters in two neighborhoods, thereby realizing the merging and expansion of associated clusters in different neighborhoods;

[0013] S6: Neighborhood Iterative Merging: Repeat steps S4 and S5. In each iteration, all current neighborhoods are merged in pairs in the order of row priority and column priority. During the merging process, a conflict matrix is constructed and a cluster expansion operation is performed. After each iteration, the number of neighborhoods is halved, and finally they are merged into a global neighborhood, thus completing the clustering of the entire image.

[0014] Preferably, in step S1, the point cloud image preprocessing process is as follows: extract the three-dimensional spatial coordinate information corresponding to each pixel; organize the point cloud image into an H×W×3 data structure according to its height (H), width (W) and corresponding three-dimensional coordinates (x, y, z).

[0015] Preferably, the neighborhood partitioning method described in step S2 is specifically implemented as follows:

[0016] S2.1. Create a one-dimensional array of size H×W×N to store the cluster number information corresponding to each pixel in the neighborhood of each pixel; where H and W are the height and width of the image, respectively, and N is the number of pixels contained in each neighborhood; the array accesses the neighborhood information of each pixel in a one-dimensional linearized manner, and the specific index calculation formula is: (h×W+ w)×N + n.

[0017] S2.2 Create a one-dimensional array of size H×W to mark whether each center pixel is a core point. A value of 1 indicates a core point and 0 indicates a non-core point.
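The two arrays of S2.1 and S2.2 and the linear index (h×W + w)×N + n can be sketched on the CPU as follows. This is a minimal Python illustration, not the patent's CUDA code; the image size, the radius `R`, the square (2R+1)×(2R+1) window, the row-major ordering of neighbor slots, and the helper names `label_index` and `slot_to_offset` are all assumptions made for the example.

```python
# Hypothetical sizes for illustration; none of these values come from the patent.
H, W = 4, 5                # image height and width
R = 1                      # neighborhood radius in pixels (assumed square window)
SIDE = 2 * R + 1
N = SIDE * SIDE            # number of pixels in each neighborhood


def label_index(h, w, n):
    """Flat index of neighbor slot n in the neighborhood centered at (h, w)."""
    return (h * W + w) * N + n


def slot_to_offset(n):
    """Map neighbor slot n to a (dh, dw) offset, row-major inside the window."""
    return n // SIDE - R, n % SIDE - R


# S2.1: cluster-number array, one entry per (center pixel, neighbor slot).
cluster_ids = [-1] * (H * W * N)
# S2.2: core-point flags, one entry per center pixel (1 = core, 0 = non-core).
is_core = [0] * (H * W)

# The center pixel of each window sits in the middle slot.
center_slot = N // 2
print(slot_to_offset(center_slot))       # (0, 0)
print(label_index(H - 1, W - 1, N - 1))  # last valid index, H*W*N - 1
```

Keeping each neighborhood's N slots contiguous means one thread can walk its own window with unit stride; whether that layout is the best choice for coalesced access on a given GPU is a separate tuning question.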

[0018] Preferably, the method for constructing the marked core points and initial clusters in step S3 is specifically implemented as follows:

[0019] S3.1 Calculate the normalized spatial distance between each pixel and its neighboring pixels in parallel under the CUDA environment. The normalized spatial distance is defined as the three-dimensional spatial distance between two pixels divided by the average distance from the two pixels to the camera.

[0020] S3.2 Count the number of pixels in the neighborhood whose normalized spatial distance from the center pixel is less than a set distance threshold. If the number is greater than or equal to a preset density threshold, mark the center pixel as a core point and set its value to 1 in the core point marking array created in step S2.2; otherwise, set it to 0.

[0021] S3.3 All pixels in the neighborhood whose normalized spatial distance from the center pixel is less than the set distance threshold are assigned to the initial cluster, and the corresponding cluster number is recorded in the one-dimensional cluster number array created in step S2.1. Pixels belonging to the cluster are marked as 0 in the array, indicating that they have been assigned to the initial cluster; the remaining pixels are marked as -1, indicating that they have not yet been assigned to a cluster.

[0022] All of the above steps are executed in a CUDA parallel environment, with each neighborhood assigned to an independent CUDA thread for parallel processing.
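Steps S3.1 to S3.3 can be sketched sequentially as below. This is a CPU reference in Python under stated assumptions, not the patented kernel: the camera is taken to sit at the coordinate origin, "distance to the camera" is read as Euclidean range, the center pixel counts toward its own density, neighbors are labeled only when the center is a core point, and the function names and parameters (`R`, `dist_thr`, `density_thr`) are hypothetical.

```python
import math


def normalized_distance(p, q):
    """3D distance between p and q divided by their mean range to the camera.

    Assumes the camera is at the origin of the coordinate frame.
    """
    mean_range = 0.5 * (math.hypot(*p) + math.hypot(*q))
    return math.dist(p, q) / mean_range if mean_range > 0 else float("inf")


def mark_core_and_init(points, H, W, R, dist_thr, density_thr):
    """points[h][w] is an (x, y, z) tuple. Returns (is_core, labels).

    labels uses the (h*W + w)*N + n layout: 0 = in the initial cluster,
    -1 = not assigned.
    """
    side = 2 * R + 1
    N = side * side
    is_core = [0] * (H * W)
    labels = [-1] * (H * W * N)
    for h in range(H):                 # on a GPU: one thread per neighborhood
        for w in range(W):
            close = []
            for n in range(N):
                hh, ww = h + n // side - R, w + n % side - R
                if 0 <= hh < H and 0 <= ww < W:
                    d = normalized_distance(points[h][w], points[hh][ww])
                    if d < dist_thr:
                        close.append(n)
            if len(close) >= density_thr:      # S3.2: density test
                is_core[h * W + w] = 1
                base = (h * W + w) * N
                for n in close:                # S3.3: initial cluster
                    labels[base + n] = 0
    return is_core, labels
```

Because each neighborhood writes only to its own N slots and its own flag, the two outer loops carry no dependencies, which is what makes the one-thread-per-neighborhood mapping described above straightforward.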

[0023] Preferably, in step S4, the specific implementation of constructing the conflict matrix is as follows:

[0024] S4.1 For two adjacent neighborhoods, find the pixels in their overlapping area. If a core point is detected in the area that belongs to different clusters in the two neighborhoods, then these conflicting clusters are recorded in the conflict matrix. During this process, each pixel in the overlapping part of the neighborhood is assigned to an independent CUDA thread for parallel processing.

[0025] S4.2 For each overlapping neighborhood region, there is a conflict matrix. The dimension of each conflict matrix is N×M, where N is the number of clusters in the first neighborhood and M is the number of clusters in the second neighborhood. If the core pixel in a certain overlapping region is marked as cluster n in the first neighborhood and cluster m in the second neighborhood, then the (n×M+m)-th element of the conflict matrix will be set to 1, indicating that the two clusters have conflicted.
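A minimal sketch of S4.1 and S4.2 follows, in Python rather than CUDA. It assumes that cluster numbers are local to each neighborhood (0..N-1 in the first, 0..M-1 in the second), that -1 means unassigned, and that the overlap has already been gathered into a list of `(is_core, label_a, label_b)` triples; that input format and the function name `build_conflict_matrix` are hypothetical.

```python
def build_conflict_matrix(overlap, n_clusters_a, n_clusters_b):
    """Build the flat N x M conflict matrix for two adjacent neighborhoods.

    overlap: iterable of (is_core, label_a, label_b), one triple per pixel
             in the overlapping area.
    Returns a row-major list of length N*M; entry n*M + m is 1 when cluster n
    of the first neighborhood and cluster m of the second share a core pixel.
    """
    N, M = n_clusters_a, n_clusters_b
    conflict = [0] * (N * M)
    for is_core, n, m in overlap:       # on a GPU: one thread per overlap pixel
        if is_core and n >= 0 and m >= 0:
            conflict[n * M + m] = 1     # every writer stores the same value
    return conflict


overlap = [(1, 0, 1), (0, 1, 0), (1, 2, 1), (1, -1, 0)]
print(build_conflict_matrix(overlap, 3, 2))  # [0, 1, 0, 0, 0, 1]
```

Since every thread that touches an entry writes the same value 1, concurrent writes to the same element do not change the result, so this step plausibly needs no atomic operations.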

[0026] Preferably, in step S5, the specific implementation process of cluster expansion is as follows:

[0027] S5.1. Traverse the conflict matrix constructed in step S4, and regard the positions (n, m) marked as 1 in the matrix as cluster pairs (n, m) that need to be merged. Process them as connection edges in the disjoint set to establish the merging relationship between each cluster.

[0028] S5.2 Based on the above merging relationship, construct a disjoint-set data structure and adopt optimization strategies such as path compression to accelerate the search and merging speed of cluster numbers;

[0029] S5.3. Use the disjoint-set data structure algorithm to find the updated cluster number of each pixel and replace it with the representative cluster number to achieve unified merging and numbering between clusters.
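Steps S5.1 to S5.3 correspond to a standard union-find pass over the conflict matrix, sketched here in Python. Path compression is named in the text; the other choices are assumptions for illustration: clusters of the second neighborhood are offset by N so both sets share one id space, the smaller root becomes the representative, and the names `DisjointSet` and `merge_from_conflicts` are hypothetical.

```python
class DisjointSet:
    """Union-find with path compression."""

    def __init__(self, size):
        self.parent = list(range(size))

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path directly at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            if ra > rb:                 # keep the smaller id as representative
                ra, rb = rb, ra
            self.parent[rb] = ra


def merge_from_conflicts(conflict, N, M):
    """Return the representative id for each of the N + M clusters.

    Ids 0..N-1 are clusters of the first neighborhood, N..N+M-1 the second.
    """
    ds = DisjointSet(N + M)
    for n in range(N):                  # S5.1: each 1 at (n, m) is an edge
        for m in range(M):
            if conflict[n * M + m]:
                ds.union(n, N + m)
    return [ds.find(i) for i in range(N + M)]   # S5.3: representatives


print(merge_from_conflicts([0, 1, 0, 0, 0, 1], 3, 2))  # [0, 1, 0, 3, 0]
```

In the printed example, clusters 0 and 2 of the first neighborhood both conflict with cluster 1 of the second (id 4), so all three collapse onto representative 0.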

[0030] Preferably, in step S5, to improve the parallel efficiency and overall processing performance of cluster number updates, a mapping relationship between the original cluster numbers and the merged new cluster numbers is first constructed based on a disjoint-set data structure, and this mapping relationship is cached in the GPU global memory. Subsequently, in a parallel environment, an independent CUDA thread is allocated to each pixel in the merged neighborhood, and its corresponding new cluster number is updated uniformly according to the mapping relationship. Except for the final cluster number update step, all other processes in step S5 involve allocating an independent CUDA thread to each neighborhood for parallel processing.
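The mapping-table update described above can be illustrated as follows. This Python sketch assumes the representatives come from a disjoint-set pass and that new cluster numbers are made consecutive; the text only requires a mapping from old to new numbers, so the compaction and the names `compact_mapping` and `relabel` are assumptions.

```python
def compact_mapping(representatives):
    """Map each old cluster id to a consecutive new id, one per representative."""
    new_id = {}
    mapping = []
    for rep in representatives:
        if rep not in new_id:
            new_id[rep] = len(new_id)
        mapping.append(new_id[rep])
    return mapping


def relabel(pixel_labels, mapping):
    """Apply the mapping to every pixel label; -1 (unassigned) is kept.

    On a GPU this is one thread per pixel, each doing a single read of the
    mapping table held in global memory.
    """
    return [mapping[label] if label >= 0 else -1 for label in pixel_labels]


mapping = compact_mapping([0, 1, 0, 3, 0])
print(mapping)                                  # [0, 1, 0, 2, 0]
print(relabel([0, 4, -1, 3, 2, 1], mapping))    # [0, 0, -1, 2, 0, 1]
```

Building the table once and then doing independent lookups avoids having every pixel thread chase parent pointers in the disjoint set, which is the efficiency argument the paragraph makes.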

[0031] Preferably, through the iterative operation in step S6, the number of neighborhoods in the current image is halved each time a merging operation is performed, which improves the overall parallel processing efficiency and increases the running speed of the clustering process.

[0032] Preferably, through steps S1 to S6, this method utilizes the CUDA architecture to achieve efficient density clustering of point cloud images in the GPU, significantly improving parallelism and execution speed, and is suitable for large-scale scenarios with high real-time requirements.

[0033] Compared with the prior art, the present invention, employing the above technical solution, has the following technical effects:

[0034] 1. Significantly reduced computational complexity: This invention avoids comprehensive calculation of the distance between global pixels by constructing a local neighborhood and performing core point identification and initial clustering within the neighborhood, thereby significantly reducing the amount of computation.

[0035] 2. High parallelism: Steps such as neighborhood partitioning, core point labeling, and initial cluster construction can be executed simultaneously on multiple pixels, which facilitates deployment on parallel computing platforms such as CUDA and achieves efficient acceleration of the clustering process;

[0036] 3. Cluster merging mechanism with halving strategy: This invention performs cluster conflict detection and merging based on the overlap between neighborhoods, and reduces the number of neighborhoods by half in each round of processing, which effectively improves merging efficiency and accelerates the convergence of the final clustering results.

[0037] 4. Integrating Normalized Spatial Distance to Improve Clustering Accuracy: Due to perspective distortion in point cloud images acquired by binocular cameras, adjacent pixels farther from the camera have relatively larger actual distances in 3D space, making traditional Euclidean distance insufficient to accurately measure spatial similarity. Therefore, this invention introduces a normalized spatial distance index, adjusting the spatial similarity calculation method based on the depth information of points, thereby improving the accuracy and robustness of clustering.

Attached Figure Description

[0038] Figure 1 is a flowchart illustrating the steps of the CUDA-accelerated fast density clustering method for point cloud images of the present invention.

[0039] Figure 2 is a flowchart illustrating the cluster expansion of the present invention.

[0040] Figure 3 is a schematic diagram of the neighborhood iterative merging process of the present invention.

Detailed Implementation

[0041] The specific embodiments of the present invention will be described in further detail below with reference to the accompanying drawings and specific embodiments. The following embodiments or drawings are used to illustrate the present invention, but are not intended to limit the scope of the invention.

[0042] This invention provides a CUDA-accelerated fast density clustering method for point cloud images. This method is designed for point cloud image data acquired by binocular cameras. By employing a CUDA parallel computing architecture on a GPU platform, it effectively improves the processing efficiency of point cloud clustering. The method combines the principles of density clustering algorithms, utilizes a disjoint-set data structure to optimize the cluster merging process, and implements efficient parallel processing for key steps such as conflict detection and cluster number updating, significantly improving the overall computational speed. It is suitable for 3D vision processing scenarios with high real-time requirements.

[0043] In one embodiment of the present invention, a fast density clustering method for point cloud images based on CUDA acceleration is described. As shown in Figure 1, the steps are as follows:

[0044] S1: Point cloud image acquisition: Use a binocular camera to acquire point cloud images and store the 3D coordinate data obtained after preprocessing the point cloud images into the GPU memory;

[0045] S2: Neighborhood division: Construct a neighborhood region by expanding a preset pixel range outward from each pixel in the image as the center;

[0046] S3: Marking the core point and constructing the initial cluster: In each neighborhood, calculate the normalized spatial distance between the center pixel and the neighboring pixels. If the number of pixels in the neighborhood whose distance from the center pixel is less than a set threshold is greater than or equal to the density threshold, then mark the center pixel as the core point, and classify the pixels in the neighborhood whose distance from the center pixel is less than the set threshold into the same initial cluster.

[0047] S4: Constructing the conflict matrix: For each row in the image, if there are multiple neighborhood regions, adjacent neighborhoods are processed in pairs (e.g., the 1st and 2nd neighborhoods, the 3rd and 4th neighborhoods), and their overlapping areas are extracted as merging judgment areas; by analyzing the cluster numbers of the pixels in the overlapping area, the cases where pixels at the same position belong to different clusters are identified, and such conflict pairs are recorded in the conflict matrix to represent the association between each cluster; if a row contains only one neighborhood, then adjacent rows in the image are processed in the same way (e.g., the 1st and 2nd rows, the 3rd and 4th rows), and the conflict matrix between vertical neighborhoods is constructed in the same way;

[0048] S5: Cluster Expansion: Based on the conflict matrix constructed in step S4, a disjoint-set data structure is constructed to efficiently manage the connection and merging relationships between clusters; the disjoint-set algorithm is used to uniformly number the clusters in two neighborhoods, thereby realizing the merging and expansion of associated clusters in different neighborhoods;

[0049] S6: Neighborhood Iterative Merging: Repeat steps S4 and S5. In each iteration, all current neighborhoods are merged in pairs in the order of row priority and column priority. During the merging process, a conflict matrix is constructed and a cluster expansion operation is performed. After each iteration, the number of neighborhoods is halved, and finally they are merged into a global neighborhood, thus completing the clustering of the entire image.

[0050] Preferably, in step S1, the point cloud image preprocessing process is as follows: extract the three-dimensional spatial coordinate information corresponding to each pixel; organize the point cloud image into an H×W×3 data structure according to its height (H), width (W) and corresponding three-dimensional coordinates (x, y, z).

[0051] Preferably, the neighborhood partitioning method described in step S2 is specifically implemented as follows:

[0052] S2.1. Create a one-dimensional array of size H×W×N to store the cluster number information corresponding to each pixel in the neighborhood of each pixel; where H and W are the height and width of the image, respectively, and N is the number of pixels contained in each neighborhood; the array accesses the neighborhood information of each pixel in a one-dimensional linearized manner, and the specific index calculation formula is: (h×W+ w)×N + n.

[0053] S2.2 Create a one-dimensional array of size H×W to mark whether each center pixel is a core point. A value of 1 indicates a core point and 0 indicates a non-core point.

[0054] Preferably, the method for constructing the marked core points and initial clusters in step S3 is specifically implemented as follows:

[0055] S3.1 Calculate the normalized spatial distance between each pixel and its neighboring pixels in parallel under the CUDA environment. The normalized spatial distance is defined as the three-dimensional spatial distance between two pixels divided by the average distance from the two pixels to the camera.

[0056] S3.2 Count the number of pixels in the neighborhood whose normalized spatial distance from the center pixel is less than a set distance threshold. If the number is greater than or equal to a preset density threshold, mark the center pixel as a core point and set its value to 1 in the core point marking array created in step S2.2; otherwise, set it to 0.

[0057] S3.3 All pixels in the neighborhood whose normalized spatial distance from the center pixel is less than the set distance threshold are assigned to the initial cluster, and the corresponding cluster number is recorded in the one-dimensional cluster number array created in step S2.1. Pixels belonging to the cluster are marked as 0 in the array, indicating that they have been assigned to the initial cluster; the remaining pixels are marked as -1, indicating that they have not yet been assigned to a cluster.

[0058] All of the above steps are executed in a CUDA parallel environment, with each neighborhood assigned to an independent CUDA thread for parallel processing.
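Written out, the normalized spatial distance of S3.1 takes the form below. The symbols p_i and p_j (3D coordinates of two pixels in the camera frame, with the camera at the origin) are introduced here for illustration, and reading "distance to the camera" as Euclidean range rather than depth z is an assumption, since the text does not specify which is meant.

```latex
d_{\mathrm{norm}}(p_i, p_j)
  = \frac{\lVert p_i - p_j \rVert_2}
         {\tfrac{1}{2}\left(\lVert p_i \rVert_2 + \lVert p_j \rVert_2\right)}
```

As a rough check under an ideal pinhole model with focal length f in pixels: two adjacent pixels on a fronto-parallel surface at depth z are about z/f apart, so near the optical axis d_norm ≈ (z/f)/z = 1/f regardless of depth. This suggests why a single distance threshold can serve both near and far regions of the image.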

[0059] Preferably, in step S4, the specific implementation of constructing the conflict matrix is as follows:

[0060] S4.1 For two adjacent neighborhoods, find the pixels in their overlapping area. If a core point is detected in the area that belongs to different clusters in the two neighborhoods, then these conflicting clusters are recorded in the conflict matrix. During this process, each pixel in the overlapping part of the neighborhood is assigned to an independent CUDA thread for parallel processing.

[0061] S4.2 For each overlapping neighborhood region, there is a conflict matrix. The dimension of each conflict matrix is N×M, where N is the number of clusters in the first neighborhood and M is the number of clusters in the second neighborhood. If the core pixel in a certain overlapping region is marked as cluster n in the first neighborhood and cluster m in the second neighborhood, then the (n×M+m)-th element of the conflict matrix will be set to 1, indicating that the two clusters have conflicted.

[0062] Preferably, as shown in Figure 2, the specific implementation process of cluster expansion in step S5 is as follows:

[0063] S5.1. Traverse the conflict matrix constructed in step S4, and regard the positions (n, m) marked as 1 in the matrix as cluster pairs (n, m) that need to be merged. Process them as connection edges in the disjoint set to establish the merging relationship between each cluster.

[0064] S5.2 Based on the above merging relationship, construct a disjoint-set data structure and adopt optimization strategies such as path compression to accelerate the search and merging speed of cluster numbers;

[0065] S5.3. Use the disjoint-set data structure algorithm to find the updated cluster number of each pixel and replace it with the representative cluster number to achieve unified merging and numbering between clusters.

[0066] Preferably, in step S5, to improve the parallel efficiency and overall processing performance of cluster number updates, a mapping relationship between the original cluster numbers and the merged new cluster numbers is first constructed based on a disjoint-set data structure, and this mapping relationship is cached in the GPU global memory. Subsequently, in a parallel environment, an independent CUDA thread is allocated to each pixel in the merged neighborhood, and its corresponding new cluster number is updated uniformly according to the mapping relationship. Except for the final cluster number update step, all other processes in step S5 involve allocating an independent CUDA thread to each neighborhood for parallel processing.

[0067] Preferably, as shown in Figure 3, in a 4×4 point cloud image, each dashed box represents a neighborhood region. The horizontal merging of neighborhood regions is illustrated using only the first row; the remaining rows are processed synchronously in the same way. In the iterative loop of step S6, each merging operation halves the number of neighborhoods in the current image. This strategy significantly improves the parallel efficiency of clustering processing, thereby accelerating the execution speed of the entire density clustering process.
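The halving schedule of step S6 can be traced with a short Python sketch. It assumes the neighborhoods form a rows × cols grid, that horizontal pairs are merged until one neighborhood per row remains and rows are then paired vertically (as S4 describes), and that an odd count leaves one neighborhood unpaired for that round; the odd-count rule and the function name `merge_schedule` are assumptions.

```python
def merge_schedule(rows, cols):
    """Return the neighborhood grid (rows, cols) after each merge round."""
    rounds = []
    while cols > 1:                 # pair adjacent neighborhoods within rows
        cols = (cols + 1) // 2
        rounds.append((rows, cols))
    while rows > 1:                 # then pair adjacent rows
        rows = (rows + 1) // 2
        rounds.append((rows, cols))
    return rounds


# Hypothetical 4 x 4 grid of neighborhoods: 16 -> 8 -> 4 -> 2 -> 1.
print(merge_schedule(4, 4))         # [(4, 2), (4, 1), (2, 1), (1, 1)]
```

For a grid whose side lengths are powers of two, the number of rounds is log2(rows × cols), so the count of sequential merge rounds grows only logarithmically with the number of neighborhoods while the pairs within a round remain independent.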

[0068] Preferably, through steps S1 to S6, this method utilizes the CUDA architecture to achieve efficient density clustering of point cloud images in the GPU, significantly improving parallelism and execution speed, and is suitable for large-scale scenarios with high real-time requirements.

Claims

1. A CUDA acceleration-based point cloud image fast density clustering method, characterized in that it comprises the following steps:
S1. Point cloud image acquisition: Use a binocular camera to acquire point cloud images and store the three-dimensional coordinate data obtained after preprocessing the point cloud images into the GPU memory;
S2. Neighborhood division: Construct a neighborhood region by expanding a preset pixel range outward from each pixel in the image as the center;
S3. Construction of core points and initial clusters: In each neighborhood, calculate the normalized spatial distance between the center pixel and the neighboring pixels. If the number of pixels in the neighborhood whose distance from the center pixel is less than a set threshold is greater than or equal to the density threshold, then mark the center pixel as the core point, and classify the pixels in the neighborhood whose distance from the center pixel is less than the set threshold into the same initial cluster;
S4. Constructing the conflict matrix: For each row in the image, if there are multiple neighborhood regions, adjacent neighborhoods are processed in pairs (e.g., the 1st and 2nd neighborhoods, the 3rd and 4th neighborhoods), and their overlapping areas are extracted as merging judgment areas. By analyzing the cluster numbers of the pixels in the overlapping area, the cases where pixels at the same position belong to different clusters are identified, and such conflict pairs are recorded in the conflict matrix to represent the association between each cluster. If a row contains only one neighborhood, then adjacent rows in the image are processed in the same way (e.g., the 1st and 2nd rows, the 3rd and 4th rows), and the conflict matrix between vertical neighborhoods is constructed in the same manner;
S5. Cluster expansion: Based on the conflict matrix constructed in step S4, construct a disjoint-set data structure to efficiently manage the connection and merging relationships between clusters. By using the disjoint-set data structure algorithm to assign unified numbers to clusters in two neighborhoods, the merging and expansion of associated clusters in different neighborhoods can be achieved;
S6. Neighborhood iterative merging: Repeat steps S4 and S5. In each iteration, all current neighborhoods are merged in pairs in the order of row priority and column priority. During the merging process, a conflict matrix is constructed and a cluster expansion operation is performed. After each iteration, the number of neighborhoods is halved, and finally they are merged into a global neighborhood, thus completing the clustering of the entire image.

2. The CUDA-accelerated point cloud image fast density clustering method according to claim 1, characterized in that: In step S1, the point cloud image preprocessing process is as follows: extract the three-dimensional spatial coordinate information corresponding to each pixel; organize the point cloud image into an H×W×3 data structure according to its height (H), width (W) and corresponding three-dimensional coordinates (x, y, z).

3. The CUDA-accelerated point cloud image fast density clustering method according to claim 1, characterized in that: The neighborhood partitioning method described in step S2 is specifically implemented as follows:
S2.1 Create a one-dimensional array of size H×W×N to store the cluster number information corresponding to each pixel in the neighborhood of each pixel; where H and W are the height and width of the image, respectively, and N is the number of pixels contained in each neighborhood; the array accesses the neighborhood information of each pixel in a one-dimensional linearized manner, and the specific index calculation formula is: (h×W + w)×N + n.
S2.2 Create a one-dimensional array of size H×W to mark whether each center pixel is a core point. A value of 1 indicates a core point and 0 indicates a non-core point.

4. The fast density clustering method for point cloud images based on CUDA acceleration according to claim 1, characterized in that: The method for constructing the core points and initial clusters in step S3 is specifically implemented as follows:
S3.1 Calculate the normalized spatial distance between each pixel and its neighboring pixels in parallel under the CUDA environment. The normalized spatial distance is defined as the three-dimensional spatial distance between two pixels divided by the average distance from the two pixels to the camera.
S3.2 Count the number of pixels in the neighborhood whose normalized spatial distance from the center pixel is less than a set distance threshold. If the number is greater than or equal to a preset density threshold, mark the center pixel as a core point and set its value to 1 in the core point marking array described in claim 3. Otherwise, set it to 0.
S3.3 All pixels in the neighborhood that satisfy the normalized spatial distance being less than a set distance threshold are assigned to the initial cluster, and the corresponding cluster number is recorded in the one-dimensional cluster number array described in claim 3. For pixels belonging to the cluster, they are marked as 0 in the array, indicating that they have been assigned to the initial cluster, and the remaining pixels are marked as -1, indicating that they have not yet been assigned to the cluster.
All of the above steps are executed in a CUDA parallel environment, with each neighborhood assigned to an independent CUDA thread for parallel processing.

5. The fast density clustering method for point cloud images based on CUDA acceleration according to claim 1, characterized in that: In step S4, the specific implementation of constructing the conflict matrix is as follows:
S4.1 For two adjacent neighborhoods, find the pixels in their overlapping area. If a core point is detected in the area that belongs to different clusters in the two neighborhoods, then these conflicting clusters are recorded in the conflict matrix. During this process, each pixel in the overlapping part of the neighborhood is assigned to an independent CUDA thread for parallel processing.
S4.2 For each overlapping neighborhood region, there is a conflict matrix. The dimension of each conflict matrix is N×M, where N is the number of clusters in the first neighborhood and M is the number of clusters in the second neighborhood. If the core pixel in a certain overlapping region is marked as cluster n in the first neighborhood and cluster m in the second neighborhood, then the (n×M+m)-th element of the conflict matrix will be set to 1, indicating that the two clusters have conflicted.

6. The fast density clustering method for point cloud images based on CUDA acceleration according to claim 1, characterized in that: In step S5, the specific implementation process of cluster expansion is as follows:
S5.1 Traverse the conflict matrix constructed in step S4, and regard the positions (n, m) marked as 1 in the matrix as cluster pairs (n, m) that need to be merged. Process them as connection edges in the disjoint set to establish the merging relationship between each cluster.
S5.2 Based on the above merging relationship, construct a disjoint-set data structure and adopt optimization strategies such as path compression to accelerate the search and merging speed of cluster numbers;
S5.3 Use the disjoint-set data structure algorithm to find the updated cluster number of each pixel and replace it with the representative cluster number to achieve unified merging and numbering between clusters.

7. The fast density clustering method for point cloud images based on CUDA acceleration according to claim 1, characterized in that: In step S5, to improve the parallel efficiency and overall processing performance of cluster number updates, a mapping relationship between the original cluster numbers and the merged new cluster numbers is first constructed based on a disjoint-set data structure, and this mapping relationship is cached in the GPU global memory. Subsequently, in a parallel environment, an independent CUDA thread is allocated to each pixel in the merged neighborhood, and its corresponding new cluster number is updated uniformly according to the mapping relationship. Except for the final cluster number update step, all other processes in step S5 involve allocating an independent CUDA thread to each neighborhood for parallel processing.

8. The fast density clustering method for point cloud images based on CUDA acceleration according to claim 1, characterized in that: Through the iterative operation in step S6, the number of neighborhoods in the current image is halved each time the merging operation is performed, which improves the overall parallel processing efficiency and increases the running speed of the clustering process.

9. The fast density clustering method for point cloud images based on CUDA acceleration according to claim 1, characterized in that: Through steps S1 to S6, this method utilizes the CUDA architecture to achieve efficient density clustering of point cloud images in the GPU, significantly improving parallelism and execution speed, and is suitable for large-scale scenarios with high real-time requirements.