An intelligent tunnel defect detection method and system based on UAV swarm collaboration
By using swarm collaboration and multi-source data processing, the invention addresses the incomplete data coverage and poor robustness of UAV tunnel inspection, enabling efficient and accurate tunnel defect detection and management.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- NANJING KENTOP CIVIL ENG TECH CO LTD
- Filing Date
- 2026-06-01
- Publication Date
- 2026-06-30
AI Technical Summary
Existing UAV tunnel detection technologies suffer from incomplete data coverage, limited sensor load, and poor system robustness. A single UAV cannot achieve high-quality data acquisition and continuous detection tasks.
By employing a swarm of drones working collaboratively, the system divides the tunnel into detection sub-regions, generates cooperative flight paths, and synchronously collects 2D images and 3D point cloud data bound to spatiotemporal markers. Combined with edge computing and cloud processing, it achieves registration and fusion of multi-source data and uses a physical mechanism deep learning model for defect identification and report generation.
It achieves full-coverage inspection of long tunnels without blind spots, improves detection efficiency, data accuracy and reliability, reduces equipment energy consumption and operating costs, provides fault tolerance and dynamic adaptability, and supports the management of tunnel defect data throughout the entire life cycle.
Figure CN122306702A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of tunnel engineering inspection technology and UAV application technology, and in particular to an intelligent detection method and system for tunnel defects based on UAV swarm collaboration.
Background Technology
[0002] As a critical node in transportation networks, the structural health of tunnels is directly related to operational safety. Traditional tunnel defect detection mainly relies on manual inspections or inspection vehicles, which has drawbacks such as low efficiency, high risk, strong subjectivity, and easy omissions.
[0003] In existing technologies, drones have been introduced into tunnel inspection because of their flexibility, but they are mostly limited to single-unit applications or simple aerial photography. A single drone faces the following technical problems in application: 1. Incomplete data coverage: a single drone cannot acquire high-quality data on both the tunnel arch and the sidewalls in a single flight, which easily creates blind spots in the inspection; 2. Limited sensor load: a single drone cannot carry multiple large, high-performance sensors at the same time, such as a high-resolution camera and a lidar, so only a single data source is available, which makes refined quantitative analysis of defects difficult to support; 3. Poor system robustness: the drone is a single point of failure, so if it fails, the entire inspection task is interrupted.
[0004] No effective solutions have yet been proposed to address these problems in the related technologies.
Summary of the Invention
[0005] Therefore, it is necessary to provide a method and system for intelligent detection of tunnel defects based on UAV swarm collaboration to address the aforementioned technical problems.
[0006] In a first aspect, the present invention provides an intelligent detection method for tunnel defects based on unmanned aerial vehicle (UAV) swarm collaboration, comprising:
[0007] The tunnel space to be inspected is divided into multiple inspection sub-regions and assigned to corresponding UAV swarm groups; a cooperative flight path with time sequence information is generated based on a unified time reference;
[0008] Based on the cooperative flight path control, a swarm of drones flies into the tunnel and synchronously collects two-dimensional images and three-dimensional point cloud data of the tunnel wall according to a unified time reference, and binds them with a unique spatiotemporal marker.
[0009] The spatiotemporally labeled 2D images and 3D point cloud data are transmitted to edge computing nodes for preprocessing, and then packaged according to spatiotemporal labels and uploaded to the cloud data processing center.
[0010] In the cloud data processing center, multi-source data registration is completed based on spatiotemporal markers, two-dimensional images are stitched together to generate a panoramic view of the tunnel, and three-dimensional point cloud data is fused to generate complete three-dimensional point cloud data of the tunnel. These are then input into a physical mechanism deep learning model for segmentation and recognition to obtain a defect segmentation map.
[0011] The defect segmentation map and the complete 3D point cloud data of the tunnel are spatially registered based on a unified spatiotemporal reference, and the geometric parameters of various defects are calculated based on the registered 3D point cloud data.
[0012] Based on the geometric parameters, type, location, and machine group detection data of the defects, a structured tunnel defect detection report is generated.
[0013] Furthermore, before dividing the tunnel space to be inspected into multiple inspection sub-regions and assigning them to the corresponding drone swarm groups, the process also includes:
[0014] The system retrieves and parses a global spatiotemporal memory database built based on global spatiotemporal markers, extracts historical detection data of the tunnel to be inspected, generates a heatmap of disease recurrence probability, and pre-allocates UAV swarm detection tasks and pre-plans acquisition strategies based on the heatmap. Specifically, this includes:
[0015] The global spatiotemporal memory of the cloud data processing center is retrieved. If historical inspection task records exist, the tunnel mileage station number is used as the core index to extract historical defect data, machine group data and data quality assessment data of all historical inspection tasks of the tunnel to be inspected.
[0016] Spatiotemporal clustering analysis is performed on the historical defect data to identify high-risk areas where defects continue to expand, as well as low-risk areas with no historical defect records.
[0017] Based on the spatiotemporal clustering analysis results, a heat map of disease recurrence probability matching the three-dimensional model of the tunnel is generated, and the grid coordinates of the heat map are mapped one-to-one with the spatial coordinates of the spatiotemporal markers.
[0018] For high-risk areas, low-risk areas, and areas with no historical detection, corresponding drone swarm detection tasks and data collection strategies are set respectively.
[0019] Furthermore, the tunnel space to be detected is divided into multiple detection sub-regions and assigned to corresponding UAV swarm groups; a cooperative flight path with time-series information is generated based on a unified time reference, including:
[0020] Based on tunnel clearance data, the boundary of the flightable space in the tunnel is defined. According to the performance parameters of the UAV and the parameters of the onboard sensors, the tunnel space is divided into several continuous detection segments along the longitudinal direction. Within a single detection segment, it is divided into an arch sub-region, a left wall sub-region, and a right wall sub-region along the transverse direction.
[0021] Based on the pre-allocation results of UAV swarm detection tasks, the UAV swarm is divided into multiple operation groups, and corresponding groups are assigned to different detection sub-areas to achieve full coverage of the sensor field of view.
[0022] Based on the pre-planning results of the acquisition strategy, the acquisition distance and sensor attitude angle of each UAV are planned, and combined with a unified time reference, a cooperative flight path is generated for a single UAV, which includes waypoint spatial coordinates, flight speed, arrival time, sensor synchronization trigger point and corresponding timestamp.
[0023] An emergency takeover plan is preset. When any UAV malfunctions, the flight paths of nearby UAVs of the same type are dynamically adjusted based on the type of malfunction and its spatial location, so that they take over the detection sub-area of the failed UAV, and a cooperative flight path that meets the safety interval requirements is regenerated.
[0024] Furthermore, based on the cooperative flight path control, a swarm of drones flies into the tunnel, synchronously collecting two-dimensional images and three-dimensional point cloud data of the tunnel inner wall according to a unified time reference, and binding unique spatiotemporal markers including:
[0025] The drone swarm is controlled to enter the tunnel according to a coordinated flight path, and real-time communication among drone swarm members is achieved through a local self-organizing network communication link.
[0026] Based on the sensor synchronous trigger points and corresponding timestamps in the collaborative flight path, the UAV's optical camera and lidar are synchronously triggered to collect two-dimensional images and three-dimensional point cloud data of the tunnel wall.
[0027] Each frame of 2D image and each set of 3D point cloud data is bound with a unique spatiotemporal marker. The spatiotemporal marker includes the UAV number, acquisition timestamp, acquisition 3D coordinates, and sensor attitude angle. The spatiotemporal marker rules are completely matched with the global spatiotemporal memory index rules.
[0028] During the data acquisition process, the external parameters of the optical camera and lidar are calibrated in real time, and the spatiotemporal markers corresponding to abnormal data acquisition are broadcast synchronously to nearby drones.
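As a minimal sketch of the spatiotemporal marker described in [0027], the record below bundles the four listed fields. The class name `SpatiotemporalMarker`, the field names, the units, and the key format are illustrative assumptions; the source does not specify an encoding.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SpatiotemporalMarker:
    """Hypothetical record holding the four fields listed in [0027]."""
    uav_id: str                               # UAV number
    timestamp_ns: int                         # time on the unified reference (assumed ns)
    position_m: Tuple[float, float, float]    # acquisition 3D coordinates (assumed metres)
    attitude_deg: Tuple[float, float, float]  # sensor roll, pitch, yaw (assumed degrees)

    def key(self) -> str:
        """Build a unique string key; this format is an assumption."""
        x, y, z = self.position_m
        return f"{self.uav_id}:{self.timestamp_ns}:{x:.3f},{y:.3f},{z:.3f}"


def pair_by_timestamp(images, clouds):
    """Pair image and point cloud markers that share one trigger timestamp."""
    clouds_by_time = {c.timestamp_ns: c for c in clouds}
    return [(i, clouds_by_time[i.timestamp_ns])
            for i in images if i.timestamp_ns in clouds_by_time]


img = SpatiotemporalMarker("UAV-01", 1000, (12.5, 0.0, 3.2), (0.0, 90.0, 0.0))
pc = SpatiotemporalMarker("UAV-02", 1000, (12.5, 0.4, 3.0), (0.0, 90.0, 0.0))
pairs = pair_by_timestamp([img], [pc])
```

Because both sensors are triggered at the same timestamp on the planned path, matching on the timestamp alone is enough in this toy case; a real system would presumably also tolerate small clock offsets.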
[0029] Furthermore, the spatiotemporally labeled 2D images and 3D point cloud data are transmitted to edge computing nodes for preprocessing, and then packaged according to spatiotemporal labels before being uploaded to the cloud data processing center, including:
[0030] Edge computing nodes receive spatiotemporally marked 2D images and 3D point cloud data transmitted by drone swarms, perform Gaussian denoising and data augmentation preprocessing on the 2D images, and perform statistical filtering and noise reduction processing on the 3D point cloud data.
[0031] When denoising a two-dimensional image, the surface roughness information calculated from the three-dimensional point cloud data at the corresponding spatiotemporal marker position is referenced simultaneously to adaptively adjust the denoising intensity; when filtering the three-dimensional point cloud data, the texture edge information extracted from the two-dimensional image at the corresponding spatiotemporal marker position is referenced simultaneously.
[0032] Edge computing nodes monitor the wireless communication bandwidth within the tunnel and the processor load of the cloud data processing center in real time, and dynamically adjust the depth of preprocessing for 2D images and 3D point cloud data.
[0033] Based on the preprocessing depth adjustment results, the preprocessed 2D images and 3D point cloud data are packaged according to spatiotemporal labels and uploaded to the cloud data processing center.
[0034] Furthermore, in the cloud data processing center, multi-source data registration is completed based on spatiotemporal labels. Two-dimensional images are stitched together to generate a panoramic view of the tunnel, and three-dimensional point cloud data is fused to generate complete three-dimensional point cloud data of the tunnel. These are then input into a deep learning model of physical mechanisms for segmentation and recognition, resulting in a defect segmentation map including:
[0035] Based on spatiotemporal markers, coarse registration of multi-camera images is completed, and fine registration and panoramic stitching are performed on image sequences with overlapping areas to generate a panoramic image of the tunnel.
[0036] Based on spatiotemporal markers, global coarse registration of multi-drone point clouds is completed, and fine registration of point cloud data of adjacent UAVs is performed. After registration, a complete 3D point cloud model of the tunnel is generated by fusion.
[0037] The encoder front end of the original semantic segmentation model is split into image feature branches and point cloud feature branches set in parallel, and a physical mechanism deep learning model with a parallel dual-branch network structure is constructed.
[0038] The image feature branch takes the panoramic image of the tunnel as input, and outputs an optimized texture feature map through multi-scale texture feature extraction and spatial attention module embedding.
[0039] The point cloud feature branch takes the complete 3D point cloud data of the tunnel as input, projects the 3D point cloud data onto the image coordinate system, and generates a depth map and normal vector of the same size. It also uses a symmetrical convolutional pooling structure of the image feature branch to extract geometric features, obtain 3D feature maps at each level, and then generates a 2D geometric feature map with the same dimension as the image feature branch by inverse projection from 3D to 2D.
[0040] An adaptive gated fusion unit is set at the end of the encoder to perform channel-weighted fusion of the optimized texture feature map and the two-dimensional geometric feature map, and output a multimodal fusion feature with both texture discriminative power and geometric discriminative power as the initial input tensor of the decoder; and the total loss function of the tunnel morphological residual is used as the loss function of the physical mechanism deep learning model.
[0041] The generated disease segmentation map is obtained, and the corresponding tunnel segment entries in the global spatiotemporal memory are matched according to the spatiotemporal marker range corresponding to the covered area to achieve archive storage of the disease segmentation map.
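The channel-weighted fusion in [0040] can be illustrated with a small sketch. The gate formula below (a sigmoid over the pooled channel means) and the parameter names `w_tex`, `w_geo`, and `bias` are assumptions chosen for illustration; the source does not give the internal form of the gated fusion unit, and in a trained model these parameters would be learned.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def channel_mean(fmap):
    """Global average pooling of one H x W channel."""
    values = [v for row in fmap for v in row]
    return sum(values) / len(values)


def gated_fusion(texture, geometry, w_tex, w_geo, bias):
    """Fuse two C x H x W feature maps with one gate per channel.

    gate_c  = sigmoid(w_tex[c] * mean(texture_c) + w_geo[c] * mean(geometry_c) + bias[c])
    fused_c = gate_c * texture_c + (1 - gate_c) * geometry_c
    """
    fused = []
    for c, (tex_c, geo_c) in enumerate(zip(texture, geometry)):
        gate = sigmoid(w_tex[c] * channel_mean(tex_c)
                       + w_geo[c] * channel_mean(geo_c) + bias[c])
        fused.append([[gate * t + (1.0 - gate) * g for t, g in zip(t_row, g_row)]
                      for t_row, g_row in zip(tex_c, geo_c)])
    return fused


# One channel, 2 x 2 maps; zero parameters give a gate of 0.5 (plain averaging).
texture = [[[1.0, 1.0], [1.0, 1.0]]]
geometry = [[[3.0, 3.0], [3.0, 3.0]]]
fused = gated_fusion(texture, geometry, w_tex=[0.0], w_geo=[0.0], bias=[0.0])
```

A gate close to 1 lets the texture branch dominate a channel, while a gate close to 0 favors the geometric branch.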
[0042] Furthermore, the image feature branch takes the panoramic image of the tunnel as input and generates an optimized texture feature map through multi-scale texture feature extraction and spatial attention module embedding, including:
[0043] A concatenated convolutional pooling structure corresponding to the encoder of the original semantic segmentation model is adopted to extract multi-scale texture features and obtain two-dimensional feature maps at each level.
[0044] At the skip connection points of each layer in the image feature branch, a spatial attention module based on the physical prior of the tunnel structure is embedded; and based on the physical prior of the tunnel structure and the historical disease data stored in the global spatiotemporal memory, a binarized disease high-incidence probability map of the same size as the input image is generated.
[0045] The high-incidence probability map of the disease is convolved by a standardized Gaussian kernel and then normalized to generate an attention weight map.
[0046] The attention weight map is multiplied element-wise with the corresponding level of the two-dimensional feature map to obtain the optimized texture feature map.
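The three steps in [0044] to [0046] (binarized map, Gaussian smoothing, normalization, element-wise product) can be sketched in plain Python as follows. The 3 x 3 kernel size, `sigma`, zero padding, and min-max normalization are assumptions; the source only states that the map is convolved with a Gaussian kernel and normalized.

```python
import math


def gaussian_kernel(size: int = 3, sigma: float = 1.0):
    """Return a size x size Gaussian kernel whose entries sum to 1."""
    half = size // 2
    raw = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
            for dx in range(-half, half + 1)] for dy in range(-half, half + 1)]
    total = sum(map(sum, raw))
    return [[v / total for v in row] for row in raw]


def smooth(prob_map, kernel):
    """Convolve with zero padding so the output keeps the input size."""
    h, w, half = len(prob_map), len(prob_map[0]), len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky, krow in enumerate(kernel):
                for kx, kval in enumerate(krow):
                    yy, xx = y + ky - half, x + kx - half
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kval * prob_map[yy][xx]
            out[y][x] = acc
    return out


def attention_weights(binary_map, size=3, sigma=1.0):
    """Smooth the binarized map, then min-max normalize it to [0, 1]."""
    smoothed = smooth(binary_map, gaussian_kernel(size, sigma))
    flat = [v for row in smoothed for v in row]
    low, high = min(flat), max(flat)
    span = (high - low) or 1.0  # avoid division by zero on a constant map
    return [[(v - low) / span for v in row] for row in smoothed]


def apply_attention(feature_map, weights):
    """Element-wise product of one feature map level and the weight map."""
    return [[f * a for f, a in zip(f_row, a_row)]
            for f_row, a_row in zip(feature_map, weights)]


binary = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
weights = attention_weights(binary)
features = [[2.0] * 3 for _ in range(3)]
optimized = apply_attention(features, weights)
```

Smoothing spreads each high-incidence pixel into its neighborhood, so features next to a likely defect location are attenuated less than features far from it.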
[0047] Furthermore, based on prior knowledge of the tunnel structure's physics and historical defect data stored in the global spatiotemporal memory, a binarized defect high-incidence probability map of the same size as the input image is generated, including:
[0048] Based on the tunnel design parameters and historical disease data stored in the global spatiotemporal memory, a basic probability map is generated, and the panoramic tunnel image spliced by the UAV cluster detection is input into the lightweight disease screening network to obtain an initial disease distribution heat map.
[0049] The basic probability map and the initial disease distribution heat map are weighted and fused to generate a personalized dynamic disease high-incidence probability map for this UAV swarm detection mission.
[0050] The high-incidence probability map of diseases generated from each detection and the corresponding disease annotation data are added to the training dataset for incremental iterative updates of the physical mechanism deep learning model.
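The weighted fusion in [0049] can be written as a per-pixel convex combination. The weight `alpha` and the binarization threshold below are hypothetical values; the source does not state how the two maps are weighted or thresholded.

```python
def fuse_probability_maps(base_map, screening_map, alpha=0.5):
    """Per-pixel fusion: alpha * base + (1 - alpha) * screening, clipped to [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [[min(1.0, max(0.0, alpha * b + (1.0 - alpha) * s))
             for b, s in zip(b_row, s_row)]
            for b_row, s_row in zip(base_map, screening_map)]


def binarize(prob_map, threshold=0.5):
    """Turn the fused map into the binarized high-incidence map."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]


base = [[0.8, 0.2]]       # from design parameters and historical data
screening = [[0.4, 0.0]]  # from the lightweight screening network
fused_map = fuse_probability_maps(base, screening, alpha=0.5)
binary_map = binarize(fused_map)
```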
[0051] Furthermore, the defect segmentation map and the complete 3D point cloud data of the tunnel are spatially registered based on a unified spatiotemporal reference, and the geometric parameters of various defects are calculated based on the registered 3D point cloud data, including:
[0052] If the defect is a crack, then extract the crack skeleton line from the registered complete 3D point cloud data of the tunnel, and calculate the actual width, length and direction of the crack respectively.
[0053] If the defect is leakage or peeling, calculate the actual area and perimeter of the leakage or peeling.
[0054] By calculating the geometric parameters of the same tunnel within a consecutive preset time period, the expansion rate and volume change of the disease are calculated. Then, using the spatiotemporal marker corresponding to the area where the disease is located as an index, the corresponding entries in the global spatiotemporal memory are matched and updated synchronously, and the disease type, spatial location and detection time are updated.
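As a hedged sketch of the crack quantification in [0052] and the expansion rate in [0054], the functions below work on a skeleton line given as an ordered list of 3D points. Taking the x axis as the tunnel axis, measuring direction from the end-to-end chord, and expressing the rate per day are assumptions; crack width is not covered here.

```python
import math


def polyline_length(points):
    """Sum of distances between consecutive 3D skeleton points (metres)."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def crack_direction_deg(points):
    """Angle of the first-to-last chord from the assumed tunnel axis (x), in the x-y plane."""
    (x0, y0, _), (x1, y1, _) = points[0], points[-1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0))


def expansion_rate(value_earlier, value_later, days_between):
    """Average change of a geometric parameter per day between two inspections."""
    if days_between <= 0:
        raise ValueError("days_between must be positive")
    return (value_later - value_earlier) / days_between


skeleton = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
length_m = polyline_length(skeleton)              # 5 m + 12 m
direction = crack_direction_deg(skeleton)
rate = expansion_rate(length_m, length_m + 0.5, days_between=100)
```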
[0055] In a second aspect, the present invention provides an intelligent tunnel defect detection system based on UAV swarm collaboration, the system comprising:
[0056] The region division module is used to divide the tunnel space to be detected into multiple detection sub-regions and assign them to the corresponding UAV swarm groups; it generates a cooperative flight path with time sequence information based on a unified time reference.
[0057] The data acquisition module is used to control a swarm of drones to fly into the tunnel based on a cooperative flight path, and synchronously collect two-dimensional images and three-dimensional point cloud data of the tunnel wall according to a unified time reference, and bind them with a unique spatiotemporal marker.
[0058] The data processing module is used to transmit spatiotemporally labeled 2D images and 3D point cloud data to edge computing nodes for preprocessing, and then classify and package them according to spatiotemporal labels before uploading them to the cloud data processing center.
[0059] The identification module is used, in the cloud data processing center, to complete the registration of multi-source data based on spatiotemporal markers, to stitch two-dimensional images to generate a panoramic view of the tunnel, and to fuse three-dimensional point cloud data to generate complete three-dimensional point cloud data of the tunnel. These are then input into the physical mechanism deep learning model for segmentation and identification to obtain the defect segmentation map.
[0060] The quantization module is used to spatially register the defect segmentation map with the complete 3D point cloud data of the tunnel based on a unified spatiotemporal reference, and to calculate the geometric parameters of various defects based on the registered 3D point cloud data.
[0061] The reporting module is used to generate structured tunnel defect detection reports based on the geometric parameters, type, location, and machine group detection data of the defects.
[0062] In a third aspect, the present invention provides an electronic device including a processor, a storage medium and a computer program, wherein the computer program is stored in the storage medium and, when executed by the processor, implements the intelligent tunnel defect detection method described above.
[0063] In a fourth aspect, the present invention provides a computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the intelligent tunnel defect detection method described above.
[0064] The beneficial effects of this invention are as follows:
[0065] 1. By using a swarm of drones to work together, a one-time, uninterrupted, full-coverage inspection of long tunnels can be achieved. The inspection efficiency is significantly improved compared to the traditional mode, the coverage is without blind spots, and the safety hazards of manual tunnel inspection are greatly reduced. At the same time, precise resource allocation further reduces equipment energy consumption and operating costs, and improves the scale and feasibility of inspection operations.
[0066] 2. By simultaneously acquiring data from multiple perspectives and multiple sensors, and relying on spatiotemporal markers, all acquired data can be accurately spatiotemporally anchored, completely eliminating visual blind spots in traditional detection methods such as tunnel arches and sidewall gaps. At the same time, after edge and cloud-based collaborative correlation preprocessing, multi-source data is linked and verified with historical data from a global spatiotemporal memory bank to achieve deep fusion of two-dimensional images and three-dimensional point clouds. This allows the data to have both real-time performance and historical traceability, providing a multi-dimensional and highly reliable data foundation for high-precision quantitative analysis of defects, and significantly improving the reference value of the quantitative results.
[0067] 3. Through the fault emergency takeover mechanism of collaborative flight, the single point of failure does not affect the overall mission. At the same time, it integrates the ability to share the real-time status of the fleet and the degree of suspected defects based on spatiotemporal markers. The UAVs can autonomously complete temporary collaborative adjustments at the edge. It can not only quickly take over the detection sub-area of the faulty UAV, but also carry out multi-aircraft collaborative focused collection in areas with high suspected defects, avoiding data loss or missed detection of sudden defects caused by faults. The system has both the fault tolerance of the overall mission and the adaptability of dynamic operation, and the robustness and operational reliability are doubly enhanced.
[0068] 4. The entire process from task pre-planning to report generation is automated, upgraded to closed-loop automation of memory, prediction, detection, updating and iteration. Relying on the global spatiotemporal memory library, it realizes the full life cycle storage and correlation analysis of tunnel defect data. The report generation can automatically integrate multiple historical data to complete the defect trend analysis and risk level determination. There is no need for manual intervention in historical data comparison and analysis, which greatly reduces the dependence on professional operation and maintenance personnel and the technical threshold. It also realizes the digital accumulation and intelligent application of tunnel defect data, and promotes the deep transformation of tunnel operation and maintenance from simple digital recording to intelligent decision-making.
Attached Figure Description
[0069] The accompanying drawings, which are included to provide a further understanding of the invention and form part of this invention, illustrate exemplary embodiments of the invention and are used to explain the invention, but do not constitute an undue limitation of the invention. In the drawings:
[0070] Figure 1 is a flowchart of an intelligent tunnel defect detection method based on UAV swarm collaboration according to an embodiment of the present invention;
[0071] Figure 2 is an architecture diagram of the physical mechanism deep learning model in the intelligent tunnel defect detection method based on UAV swarm collaboration according to an embodiment of the present invention;
[0072] Figure 3 is an architecture diagram of the image feature branch in the intelligent tunnel defect detection method based on UAV swarm collaboration according to an embodiment of the present invention;
[0073] Figure 4 is an architecture diagram of the point cloud feature branch in the intelligent tunnel defect detection method based on UAV swarm collaboration according to an embodiment of the present invention;
[0074] Figure 5 is a schematic diagram of an intelligent tunnel defect detection system based on UAV swarm collaboration according to an embodiment of the present invention;
[0075] Figure 6 is a schematic diagram of the structure of an electronic device according to an embodiment of the present invention;
[0076] Figure 7 is a heat map of the defect recurrence probability for the tunnel section at 9.8 km according to an embodiment of the present invention;
[0077] Figure 8 is a heat map of the defect recurrence probability for the tunnel section at 29.8 km according to an embodiment of the present invention.
[0078] The diagram shows the following modules: 1. Region Division Module; 2. Data Acquisition Module; 3. Data Processing Module; 4. Identification Module; 5. Quantification Module; 6. Reporting Module.
Detailed Implementation
[0079] To make the objectives, technical solutions, and advantages of this invention clearer, the invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the invention.
[0080] Referring to Figure 1, an intelligent detection method for tunnel defects based on UAV swarm collaboration is provided, the method comprising:
[0081] S0. Retrieve and parse the global spatiotemporal memory database built based on global spatiotemporal markers, extract historical detection data of the tunnel to be detected, generate a heat map of disease recurrence probability, and realize the pre-allocation of UAV swarm detection tasks and pre-planning of acquisition strategies based on the heat map.
[0082] In this embodiment, step S0 (retrieving and parsing the global spatiotemporal memory database built on global spatiotemporal markers, extracting historical detection data of the tunnel to be inspected, generating a heat map of defect recurrence probability, and pre-allocating UAV swarm detection tasks and pre-planning acquisition strategies based on the heat map) includes:
[0083] S01. Retrieve the global spatiotemporal memory of the cloud data processing center. If historical inspection task records exist, use the tunnel mileage marker as the core index to extract historical defect data, machine group data, and data quality assessment data of all historical inspection tasks of the tunnel to be inspected.
[0084] Specifically, the global spatiotemporal memory adopts a structured storage architecture with a three-level composite index consisting of a unique tunnel engineering identifier, mileage marker, and spatiotemporal tag. The stored historical defect data includes the defect type, geometric parameters, spatial coordinates, risk level, and engineering treatment records of historical inspections; drone cluster data includes the configuration, flight path, acquisition parameters, and equipment failure records of historical inspection drones; and data quality assessment data includes the image clarity, point cloud density, stitching error, and model recognition accuracy of historically acquired images. The trigger condition for the retrieval action is that when the inspection task is initialized, the system automatically matches the unique engineering identifier of the tunnel to be inspected. If no historical inspection task record exists, the system directly outputs a uniformly distributed basic task pre-allocation scheme and industry standard acquisition strategy, without executing the historical data analysis step.
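The three-level composite index and the no-history fallback in [0084] can be sketched with nested dictionaries. The class `GlobalMemory`, its method names, the plan labels, and the sample records are hypothetical; the source does not describe the storage backend.

```python
from collections import defaultdict


class GlobalMemory:
    """Toy store keyed by tunnel identifier -> mileage station -> marker key."""

    def __init__(self):
        self._store = defaultdict(lambda: defaultdict(dict))

    def put(self, tunnel_id, station_m, marker_key, record):
        self._store[tunnel_id][station_m][marker_key] = record

    def has_history(self, tunnel_id):
        # Membership test does not create an empty entry in the defaultdict.
        return tunnel_id in self._store

    def records_between(self, tunnel_id, start_m, end_m):
        """Return all records whose station lies in [start_m, end_m]."""
        if not self.has_history(tunnel_id):
            return []
        stations = self._store[tunnel_id]
        return [rec for station in sorted(stations) if start_m <= station <= end_m
                for rec in stations[station].values()]


def initial_plan(memory, tunnel_id):
    """Pick the planning mode when the inspection task is initialized."""
    return "history_based" if memory.has_history(tunnel_id) else "uniform_default"


memory = GlobalMemory()
memory.put("T-001", 9800.0, "UAV-01:1000", {"type": "crack", "width_mm": 0.4})
memory.put("T-001", 9850.0, "UAV-01:2000", {"type": "leakage"})
near_9800 = memory.records_between("T-001", 9790.0, 9810.0)
```

Using the mileage station as the second key level mirrors its role as the core index in S01 and keeps range queries along the tunnel axis simple.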
[0085] S02. Perform spatiotemporal clustering analysis on historical disease data to identify high-risk areas where diseases continue to expand, as well as low-risk areas with no historical disease records.
[0086] Specifically, the DBSCAN density clustering algorithm is used for the spatiotemporal clustering analysis. The spatial dimension uses the tunnel mileage station number and the circumferential coordinate on the cross section as clustering features, while the temporal dimension uses the detection time interval as a constraint. The clustering neighborhood radius is set to 1 m, and the minimum number of neighborhood samples is set to 3. Based on the clustering results, areas where defects appear at the same spatial location in three or more consecutive inspections with continuously increasing geometric parameters are marked as high-risk areas with continuously expanding defects; tunnel sections with no defect records and no structural deformation in three or more consecutive inspections are marked as low-risk areas with no historical defect records; in addition, circumferential construction joints, longitudinal construction joints, deformation joints, and sections with abrupt changes in surrounding rock grade are included in the high-risk areas by default.
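The clustering step can be illustrated with a small hand-written DBSCAN using the stated parameters (radius 1 m, at least 3 samples, counting the point itself). This sketch clusters on the two spatial features only and leaves out the time-interval constraint; the helper `is_expanding` is one possible reading of the "three or more consecutive inspections with increasing parameters" rule, not the patent's exact criterion.

```python
import math


def dbscan(points, eps=1.0, min_samples=3):
    """Minimal DBSCAN; returns one label per point, -1 meaning noise."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)
    return labels


def is_expanding(sizes, min_run=3):
    """True if some run of >= min_run consecutive inspections shows strict growth.

    `sizes` holds one value per inspection, or None when no defect was recorded.
    """
    run, prev = 0, None
    for s in sizes:
        if s is None:
            run, prev = 0, None
        elif prev is not None and s > prev:
            run, prev = run + 1, s
        else:
            run, prev = 1, s
        if run >= min_run:
            return True
    return False


# (station in m, circumferential coordinate in m) of historical defects
defects = [(100.0, 0.0), (100.5, 0.2), (100.9, 0.1), (250.0, 3.0)]
labels = dbscan(defects)
```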
[0087] S03. Based on the spatiotemporal clustering analysis results, generate a heat map of disease recurrence probability that matches the three-dimensional model of the tunnel, and map the grid coordinates of the heat map to the spatiotemporal marker spatial coordinates one by one.
[0088] Specifically, based on the 3D BIM model of the tunnel, grid cells are divided into 0.1m×0.1m units. Each grid cell corresponds to a unique 3D spatial coordinate on the tunnel lining surface, and the spatial coordinates of the grid cells are mapped one-to-one with the collected 3D coordinates of the global spatiotemporal markers. Based on the spatiotemporal clustering analysis results, a disease recurrence probability value is assigned to each grid cell: 0.7-1.0 for high-risk areas, 0-0.3 for low-risk areas, and 0.3-0.7 for other areas. A pseudo-color heat map, i.e., a disease recurrence probability heat map, is generated based on the grid probability values. The coordinate system of the heat map is completely consistent with the coordinate system of the tunnel mileage station and cross-section.
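A minimal sketch of the grid assignment follows, assuming lining-surface coordinates given as (station, arc length) in metres. The probability ranges come from the paragraph above; the `score` argument used to pick a value inside a range is a hypothetical addition, since the source does not say how a value within each range is chosen.

```python
import math

CELL_SIZE_M = 0.1  # 0.1 m x 0.1 m grid cells, as stated above

RISK_PROBABILITY_RANGE = {
    "high": (0.7, 1.0),
    "low": (0.0, 0.3),
    "other": (0.3, 0.7),
}


def cell_index(station_m, arc_m):
    """Map surface coordinates to (row, col) grid indices.

    Rounding before flooring guards against floating-point artifacts
    such as 0.3 / 0.1 evaluating to 2.9999999999999996.
    """
    row = math.floor(round(station_m / CELL_SIZE_M, 6))
    col = math.floor(round(arc_m / CELL_SIZE_M, 6))
    return row, col


def recurrence_probability(risk_class, score=0.5):
    """Interpolate inside the class range; `score` in [0, 1] is an assumed severity."""
    low, high = RISK_PROBABILITY_RANGE[risk_class]
    return low + (high - low) * min(1.0, max(0.0, score))


index = cell_index(0.3, 1.25)
p_high = recurrence_probability("high", score=0.5)
p_low = recurrence_probability("low", score=0.0)
```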
[0089] S04. For high-risk areas, low-risk areas, and areas with no historical detection, set corresponding drone swarm detection tasks and data collection strategies respectively.
[0090] Specifically, for high-risk areas, at least one drone equipped with a high-resolution optical camera and one drone equipped with LiDAR will be pre-assigned for coordinated coverage. The data acquisition strategy is set as follows: flight speed ≤ 3 m/s, image acquisition frame rate ≥ 10 frames/second, point cloud acquisition density ≥ 200 points/m², and adjacent-track overlap rate ≥ 30%.
[0091] For low-risk areas, one drone equipped with a high-resolution optical camera is pre-assigned for coverage. The data acquisition strategy is set as follows: flight speed ≤ 5 m/s, image acquisition frame rate ≥ 5 frames/second, point cloud acquisition density ≥ 80 points/m², and adjacent-track overlap rate ≥ 15%.
[0092] For areas lacking historical detection data, a pre-allocated standard-configuration drone swarm executes the industry-standard data acquisition strategy: flight speed ≤ 4 m/s, image acquisition frame rate ≥ 8 frames/second, point cloud acquisition density ≥ 100 points/m², and adjacent-track overlap rate ≥ 20%.
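The three acquisition strategies can be captured as a small lookup table. The threshold values are taken from the paragraphs above; the class name, field names, dictionary keys, and the `complies` helper are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcquisitionStrategy:
    max_speed_mps: float             # upper bound on flight speed
    min_frame_rate_fps: float        # lower bound on image frame rate
    min_point_density_per_m2: float  # lower bound on point cloud density
    min_track_overlap: float         # lower bound on adjacent-track overlap (fraction)


STRATEGIES = {
    "high_risk": AcquisitionStrategy(3.0, 10.0, 200.0, 0.30),
    "low_risk": AcquisitionStrategy(5.0, 5.0, 80.0, 0.15),
    "no_history": AcquisitionStrategy(4.0, 8.0, 100.0, 0.20),
}


def complies(area_class, speed_mps, frame_rate_fps, point_density, overlap):
    """Check planned flight parameters against the strategy for an area class."""
    s = STRATEGIES[area_class]
    return (speed_mps <= s.max_speed_mps
            and frame_rate_fps >= s.min_frame_rate_fps
            and point_density >= s.min_point_density_per_m2
            and overlap >= s.min_track_overlap)


ok_plan = complies("high_risk", 2.5, 12.0, 220.0, 0.35)
too_fast = complies("high_risk", 4.0, 12.0, 220.0, 0.35)
```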
[0093] S1. Divide the tunnel space to be detected into multiple detection sub-regions and assign them to the corresponding UAV swarm groups; generate a collaborative flight path with time sequence information based on a unified time reference.
[0094] In the description of this invention, dividing the tunnel space to be detected into multiple detection sub-regions, assigning them to the corresponding UAV swarm groups, and generating a cooperative flight path with timing information based on a unified time reference includes:
[0095] S11. Based on tunnel clearance data, define the boundary of the tunnel's flyable space, and according to the UAV's performance parameters and onboard sensor parameters, divide the tunnel space longitudinally into several continuous detection segments; within a single detection segment, divide it laterally into an arch sub-region, a left wall sub-region, and a right wall sub-region.
[0096] Specifically, tunnel clearance data is extracted based on the as-built drawings and 3D BIM model of the tunnel to be inspected, defining the flyable space range as the inner boundary of the tunnel lining and 0.5m outside the tunnel equipment clearance. The length of the longitudinal continuous inspection segment is dynamically adjusted according to the drone's single-flight range and the maximum single-hop coverage distance of the local self-organizing network. The length of a single segment does not exceed 80% of the drone's single-flight range and does not exceed the maximum effective coverage range of the local self-organizing network.
[0097] In addition, within a single detection section, along the circumferential direction of the tunnel cross section, with the highest point of the tunnel arch as 0°, the range of -60° to +60° is divided into the arch sub-region, -180° to -60° is divided into the left wall sub-region, and +60° to +180° is divided into the right wall sub-region.
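The segment-length rule and the circumferential split of S11 can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the single-flight range is taken to be expressed as a distance in meters, and the shared boundary angles ±60° are assigned to the arch sub-region.

```python
def max_segment_length(flight_range_m: float, network_coverage_m: float) -> float:
    """Longest allowed detection segment: at most 80% of the single-flight
    range and at most the effective coverage of the ad hoc network."""
    return min(0.8 * flight_range_m, network_coverage_m)


def sub_region(angle_deg: float) -> str:
    """Classify a circumferential angle (0 deg = highest point of the arch).

    Assumption: the boundary angles -60 and +60 belong to the arch.
    """
    if not -180.0 <= angle_deg <= 180.0:
        raise ValueError("angle must lie in [-180, 180] degrees")
    if -60.0 <= angle_deg <= 60.0:
        return "arch"
    return "left_wall" if angle_deg < -60.0 else "right_wall"
```

With a 1000 m single-flight range and 500 m of network coverage, the network is the binding limit and the segment is capped at 500 m.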
[0098] S12. Based on the pre-allocation results of the UAV swarm detection task, the UAV swarm is divided into multiple operation groups, and corresponding groups are assigned to different detection sub-areas to achieve full coverage of the sensor field of view.
[0099] Specifically, based on the pre-allocation results of S0 tasks, the drone swarm is divided into two core operational groups: the first group is the image acquisition group, which consists of drones equipped with global shutter high-resolution optical cameras and fixed-focus industrial lenses, and is allocated sub-regions of the entire cross-section arch, left side wall, and right side wall to achieve full-area image coverage of the tunnel lining inner wall; the second group is the point cloud acquisition group, which consists of drones equipped with single-line / multi-line LiDAR, and is preferentially allocated the arch sub-region and the high-risk side wall sub-region marked by S0. After the sub-regions are allocated, geometric verification of the sensor field of view and flight distance ensures that the sensor field of view covers 100% of the allocated sub-regions, with no blind spots, and that the spatial interval between the flight paths of adjacent drones is not less than the preset minimum safety interval.
[0100] S13. Based on the pre-planning results of the acquisition strategy, plan the acquisition distance and sensor attitude angle of each UAV, and combine with a unified time reference to generate a cooperative flight path for a single UAV, which includes waypoint spatial coordinates, flight speed, arrival time, sensor synchronization trigger point and corresponding timestamp.
[0101] Specifically, based on the pre-planning results of the S0 acquisition strategy, and the optimal imaging distance and field of view of the airborne sensors, the optimal flight altitude, horizontal distance to the inner wall of the lining, and sensor pitch / roll angle of each UAV in the corresponding sub-region are calculated to ensure that the sensors are in the optimal acquisition attitude. Taking the unified time reference of BeiDou / GPS + inertial navigation as the core, the waypoints of each UAV are sorted in time sequence to generate a time-series coordinated flight path containing the three-dimensional spatial coordinates of the waypoints, the target flight speed, the arrival timestamp, the sensor synchronization trigger position point, and the trigger timestamp. After the path is generated, a spatiotemporal conflict check is performed to ensure that the spatial straight-line distance between any two UAVs at the same timestamp is not less than the minimum safety interval, and the sensor trigger time difference of multiple UAVs in the same section does not exceed 10ms, thus ensuring the spatiotemporal synchronization of multi-source data.
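The spatiotemporal conflict check can be illustrated with a short sketch. The path representation (a mapping from UAV identifier to a mapping from millisecond timestamp to 3D waypoint) and both function names are assumptions made for illustration; only the two criteria, the minimum safety interval and the 10 ms trigger tolerance, come from the text.

```python
import math
from itertools import combinations


def separation_violations(paths, min_interval_m):
    """Return (timestamp, uav_a, uav_b, distance) for every pair of UAVs
    that is closer than the minimum safety interval at a shared timestamp.

    paths: {uav_id: {timestamp_ms: (x, y, z)}}  (hypothetical layout)
    """
    violations = []
    for (a, path_a), (b, path_b) in combinations(sorted(paths.items()), 2):
        for t in sorted(path_a.keys() & path_b.keys()):
            d = math.dist(path_a[t], path_b[t])
            if d < min_interval_m:
                violations.append((t, a, b, d))
    return violations


def triggers_synchronized(trigger_times_ms, max_diff_ms=10):
    """True if all sensor triggers in one section fall within max_diff_ms."""
    return max(trigger_times_ms) - min(trigger_times_ms) <= max_diff_ms
```

A path set passes the check when `separation_violations` returns an empty list and `triggers_synchronized` is true for every section.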
[0102] S14. A pre-set emergency takeover plan is implemented. When any UAV malfunctions, the flight path of nearby UAVs of the same type is dynamically adjusted according to the fault type and spatial location. The system takes over the detection sub-area corresponding to the UAV and regenerates a cooperative flight path that meets the safety interval requirements.
[0103] Specifically, the present invention pre-sets a three-level fault emergency takeover scheme. The first level fault is sensor failure and abnormal data acquisition. The neighboring UAV with the same type of sensor in the same detection segment takes over. The takeover UAV, after completing its own data acquisition task, adjusts its flight path to cover the detection sub-area of the faulty UAV. The replanned path must meet the requirement of full coverage of the sensor's field of view.
[0104] Level 2 faults are caused by insufficient drone battery life or abnormal positioning. In such cases, a drone in standby mode within the same group will take over and directly replace the faulty drone's flight path and data collection tasks.
[0105] A Level 3 fault is a drone losing contact or experiencing power abnormalities. This immediately triggers hover commands for all drones within the same detection segment. The ground control station then re-plans the task allocation for the remaining detection sub-areas, splitting the tasks among two or more nearby drones that are operating normally, ensuring that the detection tasks are not interrupted. In all takeover scenarios, the replanned path must again pass safety interval verification and spatiotemporal synchronization verification to ensure operational safety.
[0106] S2. Based on the cooperative flight path control, the drone swarm flies into the tunnel and synchronously collects two-dimensional images and three-dimensional point cloud data of the tunnel wall according to a unified time reference, and binds them with a unique spatiotemporal marker.
[0107] In the description of this invention, a swarm of unmanned aerial vehicles (UAVs) flies into a tunnel based on a cooperative flight path control system, synchronously acquiring two-dimensional images and three-dimensional point cloud data of the tunnel's inner wall according to a unified time reference, and binding unique spatiotemporal markers, including:
[0108] S21. Control the drone swarm to enter the tunnel according to the coordinated flight path, and realize real-time communication among drone swarm members through the local self-organizing network communication link.
[0109] Specifically, the drone swarm is controlled to enter the tunnel operation area sequentially according to a coordinated flight path. The swarm adopts an industrial-grade WiFi 6 self-organizing network communication architecture, with the ground base station at the tunnel entrance as the core node, and each drone acting as a relay node to extend the communication coverage. For example, the single-hop communication distance is no less than 500m, the communication bandwidth is no less than 100Mbps, and the end-to-end latency is no more than 20ms. The swarm members use the UDP communication protocol to broadcast status information in real time, and the status information update frequency is set to ensure that all drones in the swarm can obtain the flight status and mission progress of the entire swarm in real time.
[0110] S22. Based on the sensor synchronous trigger points and corresponding timestamps in the cooperative flight path, the UAV's optical camera and lidar are synchronously triggered to collect two-dimensional images and three-dimensional point cloud data of the tunnel wall.
[0111] Specifically, a dual synchronization mechanism of hardware triggering and software timestamps is adopted, based on a unified Precision Time Protocol (PTP), to synchronize the flight control system and airborne sensor system of all UAVs in the swarm. The flight control system follows the preset sensor synchronization trigger position points and corresponding timestamps in the cooperative flight path: when the UAV arrives at a trigger position point and the timestamp matches, it synchronously triggers the optical camera and lidar through the hardware I / O port. During the data acquisition process, if the UAV deviates from its position, the flight control system adjusts the flight speed and trigger timing in real time to ensure that the deviation between the acquisition point and the preset path is kept within the allowed deviation range.
[0112] S23. Bind a unique spatiotemporal marker to each frame of two-dimensional image and each group of three-dimensional point cloud data. The spatiotemporal marker includes the UAV number, acquisition timestamp, acquisition three-dimensional coordinates, and sensor attitude angle. The spatiotemporal marker rules are completely matched with the global spatiotemporal memory index rules.
[0113] Specifically, for each successfully acquired 2D image frame and each set of 3D point cloud data, a unique structured spatiotemporal marker is bound. The spatiotemporal marker adopts a fixed format of 128-bit string, which includes: a 16-bit unique UAV number, a 32-bit millisecond-level UTC acquisition timestamp, 48-bit 3D spatial coordinates of the acquisition point (16 bits each for X / Y / Z axes), 16-bit sensor attitude angles (pitch / roll / yaw), and a 16-bit data check code. The 3D spatial coordinate encoding rules and mileage marker mapping rules of the spatiotemporal marker are completely matched with the indexing rules of the global spatiotemporal memory, ensuring that the acquired data can be directly matched to the corresponding tunnel segment and historical data entry in the global spatiotemporal memory through the spatiotemporal marker. The spatiotemporal marker is bound one-to-one with the acquired data file and stored in the header metadata area of the data file, which cannot be tampered with.
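The 128-bit layout (16 + 32 + 48 + 16 + 16 bits) can be sketched as a 16-byte packed record. Several details below are assumptions, since the text does not specify them: big-endian byte order, storing only the low 32 bits of the millisecond timestamp, treating the coordinates and the attitude field as already-quantized integer codes, and using CRC-16/CCITT (via `binascii.crc_hqx`) as the check code. The function names are hypothetical.

```python
import binascii
import struct

# UAV number (16) | timestamp (32) | X, Y, Z codes (3 x 16) | attitude code (16)
_BODY = struct.Struct(">HI3HH")  # 14 bytes; the 16-bit check code follows


def pack_marker(uav_id, timestamp_ms, xyz_codes, attitude_code):
    """Build a 16-byte (128-bit) spatiotemporal marker."""
    body = _BODY.pack(uav_id, timestamp_ms & 0xFFFFFFFF, *xyz_codes, attitude_code)
    check = binascii.crc_hqx(body, 0xFFFF)  # assumed check-code algorithm
    return body + struct.pack(">H", check)


def unpack_marker(marker):
    """Verify the check code and return (uav_id, timestamp, (x, y, z), attitude)."""
    if len(marker) != 16:
        raise ValueError("marker must be exactly 16 bytes")
    body = marker[:14]
    (check,) = struct.unpack(">H", marker[14:])
    if binascii.crc_hqx(body, 0xFFFF) != check:
        raise ValueError("check code mismatch")
    uav_id, timestamp, x, y, z, attitude = _BODY.unpack(body)
    return uav_id, timestamp, (x, y, z), attitude
```

A marker packed this way round-trips through `unpack_marker`, and a single flipped bit is rejected by the check code, which is one simple way to detect tampering or corruption of the header metadata.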
[0114] S24. During the data acquisition process, the external parameters of the optical camera and lidar are calibrated in real time, and the spatiotemporal markers corresponding to abnormal data acquisition are broadcast synchronously to nearby drones.
[0115] Specifically, during the data acquisition process, online extrinsic parameter calibration of the optical camera and lidar is performed every 100m. Using a hand-eye calibration method, based on the image and point cloud matching results of a fixed target within the tunnel, the rotation matrix and translation vector between the camera and lidar are optimized in real time to ensure that the spatial calibration error of the multi-sensor data from the same UAV does not exceed 0.2m. Simultaneously, the acquired data undergoes real-time quality verification. When anomalies such as overexposure / underexposure, insufficient point cloud density, or data packet loss occur, a spatiotemporal marker corresponding to the abnormal data is immediately marked. This marker is then synchronously broadcast to neighboring UAVs via a local ad hoc network. When neighboring UAVs pass through the corresponding spatial location, they automatically supplement the acquisition of image and point cloud data for that area, preventing data loss.
[0116] S3. Transmit the spatiotemporally labeled 2D images and 3D point cloud data to edge computing nodes for preprocessing, and then classify and package them according to spatiotemporal labels before uploading them to the cloud data processing center.
[0117] In the description of this invention, transmitting spatiotemporally marked two-dimensional images and three-dimensional point cloud data to edge computing nodes for preprocessing, and then classifying and packaging them according to spatiotemporal markers before uploading them to a cloud data processing center includes:
[0118] S31. The edge computing node receives two-dimensional images and three-dimensional point cloud data with spatiotemporal markers transmitted by the drone swarm. It performs Gaussian denoising and data augmentation preprocessing on the two-dimensional images and statistical filtering and noise reduction processing on the three-dimensional point cloud data.
[0119] Specifically, edge computing nodes are deployed at ground base stations at tunnel entrances. They receive 2D images and 3D point cloud data with complete spatiotemporal markers via wired / wireless links. After classifying and sorting the data by UAV number and collection timestamp, basic preprocessing is performed: for 2D images, Gaussian denoising is performed using a 5×5 Gaussian kernel to remove image noise, and data augmentation is performed using random flipping and brightness / contrast fine-tuning to improve image robustness.
[0120] For 3D point cloud data, a statistical filtering algorithm is used to remove outliers. The filtering neighborhood is set to 50 points and the standard deviation factor is set to 1.0. At the same time, a pass-through filter is used to remove invalid point cloud data outside the tunnel boundary, thus completing the point cloud noise reduction process.
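Statistical outlier filtering with a 50-point neighborhood and a standard deviation factor of 1.0 can be sketched as below. This follows the commonly used formulation (mean distance to the k nearest neighbors, thresholded at the global mean plus a multiple of the global standard deviation), which is assumed here to match the algorithm intended by the text. The brute-force distance matrix is only suitable for small demonstration clouds, and the function name is hypothetical.

```python
import numpy as np


def statistical_outlier_mask(points, k=50, std_factor=1.0):
    """Return a boolean mask that is True for points kept by the filter."""
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances (O(N^2) memory; demonstration only).
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    # Column 0 after sorting is the zero distance of each point to itself.
    knn = np.sort(dist, axis=1)[:, 1:k + 1]
    mean_dist = knn.mean(axis=1)
    threshold = mean_dist.mean() + std_factor * mean_dist.std()
    return mean_dist <= threshold
```

Applying `points[statistical_outlier_mask(points)]` then yields the denoised cloud, after which a pass-through filter on the tunnel boundary can discard the remaining out-of-range points.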
[0121] S32. When denoising a two-dimensional image, the surface roughness information calculated from the three-dimensional point cloud data at the corresponding spatiotemporal marker position is referenced simultaneously, and the denoising intensity is adaptively adjusted. When filtering the three-dimensional point cloud data, the texture edge information extracted from the two-dimensional image at the corresponding spatiotemporal marker position is referenced simultaneously.
[0122] Specifically, during the correlation preprocessing, the spatial coordinates of the spatiotemporal markers are first used to complete the one-to-one matching of the two-dimensional image pixels and the three-dimensional point cloud data to ensure that the spatial positions of the two are completely corresponding.
[0123] For the denoising of two-dimensional images, the surface roughness of the corresponding three-dimensional point cloud is calculated by the standard deviation of the normal vector of the neighboring points. The number of neighboring points is set to 10. The larger the roughness value, the more uneven the surface and the higher the probability of defects. The mapping relationship between surface roughness and denoising intensity is pre-constructed. When the roughness is ≥0.5, 3×3 small kernel Gaussian denoising is used to reduce the denoising intensity. When the roughness is <0.5, 5×5 large kernel Gaussian denoising is used to increase the denoising intensity.
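The roughness-to-kernel mapping can be illustrated with a short sketch. The text does not state how the standard deviation of the normal vectors is reduced to a single number, so the code assumes the root-mean-square deviation of the 10 neighboring unit normals from their mean. Both function names are hypothetical.

```python
import numpy as np


def surface_roughness(normals):
    """RMS deviation of neighbor normals from their mean (assumed definition).

    normals: array of shape (n_neighbors, 3), e.g. 10 unit normal vectors.
    """
    normals = np.asarray(normals, dtype=float)
    deviation = normals - normals.mean(axis=0)
    return float(np.sqrt((deviation ** 2).sum(axis=1).mean()))


def gaussian_kernel_size(roughness, threshold=0.5):
    """Rough surface -> 3x3 kernel (weaker denoising); smooth -> 5x5 kernel."""
    return 3 if roughness >= threshold else 5
```

A perfectly flat patch has identical normals and a roughness of 0, so it receives the 5×5 kernel, while a patch whose normals alternate between two perpendicular directions exceeds 0.5 and receives the 3×3 kernel that better preserves defect detail.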
[0124] For the 3D point cloud filtering stage, the texture edges of the corresponding 2D images are extracted using Canny edge detection technology to mark edge areas such as lining cracks and construction joints. Edge-preserving statistical filtering with a radius of 0.3m is applied to edge areas to retain geometric abrupt changes; conventional statistical filtering with a radius of 0.5m is applied to non-edge areas to improve noise reduction and avoid excessive smoothing of geometric features in the diseased areas.
[0125] S33: Edge computing nodes monitor the wireless communication bandwidth within the tunnel and the processor load of the cloud data processing center in real time, and dynamically adjust the depth of preprocessing for 2D images and 3D point cloud data.
[0126] Specifically, edge computing nodes monitor the wireless communication bandwidth of the self-organizing network in the tunnel in real time through heartbeat packets, as well as the CPU utilization and memory usage of the cloud data processing center; and set trigger thresholds for dynamic adjustment of preprocessing depth: communication bandwidth threshold is 5Mbps, cloud CPU load threshold is 80%, and cloud memory usage threshold is 85%.
[0127] This invention divides the preprocessing depth into three levels: Level 1 is basic preprocessing, which only performs basic denoising and filtering operations in S31; Level 2 is intermediate preprocessing, which performs correlation preprocessing in S32 on the basis of basic preprocessing; Level 3 is deep preprocessing, which performs point cloud voxel downsampling, lossless image compression, and invalid data removal operations on the basis of intermediate preprocessing. The preprocessing depth is dynamically switched according to the real-time monitored bandwidth and cloud load status to balance data transmission efficiency and cloud processing pressure.
[0128] S34. Based on the preprocessing depth adjustment results, the preprocessed 2D image and 3D point cloud data are packaged according to spatiotemporal labels and uploaded to the cloud data processing center.
[0129] Specifically, based on the preprocessing depth adjustment results, the preprocessed 2D images and 3D point cloud data are classified and packaged according to the spatiotemporal marker mileage sequence and UAV number. Each data packet corresponds to the detection data of a continuous 5m mileage in the tunnel. The header of the data packet includes data quality score, spatiotemporal marker range, and preprocessing level information. The data packets are uploaded to the cloud data processing center via 5G / fiber optic link using the TCP / IP transmission protocol. A breakpoint resume mechanism is used during transmission to avoid data loss caused by network fluctuations.
[0130] Once the data packet is uploaded, the cloud immediately performs a data integrity check, determining whether the data is complete by checking the continuity of the spatiotemporal markers and the matching degree of the checksum. If data is missing, a retransmission request is immediately sent to the edge computing node to ensure that all detection data is uploaded completely.
[0131] S4. In the cloud data processing center, multi-source data registration is completed based on spatiotemporal markers. Two-dimensional images are stitched together to generate a panoramic view of the tunnel, and three-dimensional point cloud data is fused to generate complete three-dimensional point cloud data of the tunnel. These data are then input into a physical mechanism deep learning model for segmentation and recognition to obtain a disease segmentation map.
[0132] In the description of this invention, completing multi-source data registration based on spatiotemporal markers at the cloud data processing center, stitching two-dimensional images to generate a panoramic view of the tunnel, fusing three-dimensional point cloud data to generate complete three-dimensional point cloud data of the tunnel, and inputting these into a physical mechanism deep learning model for segmentation and recognition to obtain a defect segmentation map includes:
[0133] S41. Based on spatiotemporal markers, perform coarse registration of multi-camera images, and fine registration and panoramic stitching of image sequences with overlapping areas to generate a panoramic image of the tunnel.
[0134] Specifically, after receiving the data packet, the cloud data processing center first performs coarse registration of multi-machine images based on the spatiotemporal markers carried in the data. Images collected by multiple UAVs in the same mileage segment and the same cross-sectional area are sorted by timestamp and spatial coordinates to determine the overlapping areas between images. Then, feature points of the images are extracted and mismatched points are removed to complete fine registration between images. Combining visual SLAM technology, the pose information of the image sequence is optimized to eliminate cumulative errors. Image sequences with overlapping areas are fused and stitched together to generate a continuous panoramic image of the tunnel that unfolds longitudinally along the tunnel. After stitching, the pixel resolution of the panoramic image and the stitching error of adjacent images are checked to ensure that they are within the standard range and that there are no stitching defects such as ghosting or misalignment.
[0135] S42. Based on spatiotemporal markers, complete the global coarse registration of multi-drone point clouds and perform fine registration of adjacent UAV point cloud data. After registration, generate a complete 3D point cloud model of the tunnel through fusion.
[0136] Specifically, based on the spatiotemporal markers of the data, a global coarse registration of multi-UAV point clouds is completed. Point clouds collected by multiple UAVs in the same mileage segment are initially aligned according to spatial coordinates to eliminate global pose deviations. Then, the Iterative Closest Point (ICP) algorithm is used to perform fine registration of the point cloud data of adjacent UAVs, with the number of registration iterations not exceeding 50 and the registration convergence threshold set to 1e-6. After registration, a truncated signed distance function (TSDF) fusion algorithm is used to fuse the multi-view point cloud data to generate a complete and continuous 3D point cloud model of the tunnel. The point cloud density of the fused 3D point cloud model is not less than 100 points / m², the point cloud ranging accuracy is within ±2cm, and there are no fusion defects such as holes or layering.
[0137] First, for each point cloud data, a neighborhood point set with a radius of 0.2m is selected, and the unit normal vector of each point is calculated. Then, a 125-dimensional FPFH feature descriptor is generated by statistically analyzing the azimuth and elevation angles between neighboring point pairs to complete feature matching. Using the spatial coordinate alignment result of coarse registration as the initial pose, the ICP algorithm is used for iterative optimization. In each iteration, corresponding point pairs are matched by nearest neighbor search, and an error function is constructed, which minimizes the sum of squared Euclidean distances between corresponding points. The rotation matrix R and translation vector t are solved, and the maximum number of iterations is set to 50, with a convergence threshold of 1e-6. Iteration stops when the root mean square error between point clouds is ≤0.02m, completing fine registration.
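The fine-registration loop can be sketched as a point-to-point ICP. This is a simplified illustration, not the implementation of the invention. It omits the FPFH feature matching, uses a brute-force nearest-neighbor search suitable only for small clouds, solves each step with the SVD-based (Kabsch) closed form, and interprets the 1e-6 convergence threshold as the change in RMSE between iterations. All function names are hypothetical.

```python
import numpy as np


def best_rigid_transform(src, dst):
    """Least-squares R, t such that R @ src_i + t ~ dst_i (Kabsch method)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection solution.
    sign = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, sign]) @ U.T
    return R, dst_c - R @ src_c


def icp(source, target, max_iter=50, tol=1e-6, rmse_stop=0.02):
    """Align source to target; returns accumulated R, t and the final RMSE."""
    R_total, t_total = np.eye(3), np.zeros(3)
    current = np.asarray(source, dtype=float).copy()
    target = np.asarray(target, dtype=float)
    prev_rmse = np.inf
    for _ in range(max_iter):
        # Nearest-neighbor correspondences (brute force).
        d = np.linalg.norm(current[:, None, :] - target[None, :, :], axis=2)
        matched = target[d.argmin(axis=1)]
        R, t = best_rigid_transform(current, matched)
        current = current @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
        rmse = float(np.sqrt(np.mean(np.sum((current - matched) ** 2, axis=1))))
        if rmse <= rmse_stop or abs(prev_rmse - rmse) < tol:
            break
        prev_rmse = rmse
    return R_total, t_total, rmse
```

In practice the coarse alignment from the spatiotemporal markers supplies the initial pose, so `source` would already be roughly aligned before this loop runs. That matters because point-to-point ICP only converges reliably from a good starting estimate.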
[0138] Secondly, a TSDF voxel grid with a voxel resolution of 0.05m is constructed for the tunnel space, where each voxel stores a truncated signed distance (TSDF) value and a weight value. Multi-view point clouds are projected frame by frame onto the voxel grid, and the signed distance from each point to the voxel center is calculated, with positive values indicating outside the voxel and negative values indicating inside. The voxel TSDF values are updated using a weighted average based on "weight = 1 / point cloud acquisition distance". For voxels in overlapping regions, the TSDF values are fused using a weighted average based on the observation angle and distance of each viewpoint point cloud. For hollow areas with no effective point clouds in three consecutive voxels, the mean TSDF value of the 8 neighboring voxels is used for interpolation. Finally, all voxel surfaces with TSDF values ≤ 0 are extracted to generate a complete 3D point cloud model of the tunnel with a point cloud density ≥ 100 points / m² and without voids or layering defects.
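The per-voxel weighted-average update can be written as a running mean. The sketch below is a minimal illustration. The truncation distance of 0.15 m (three voxels) is an assumption, since the text does not state one. The function name is hypothetical, and the observation-angle term used in overlapping regions is omitted.

```python
import numpy as np


def update_tsdf(tsdf, weight, sdf_obs, acquisition_dist, trunc=0.15):
    """Fuse one observation into a voxel (or an array of voxels).

    weight of the observation = 1 / point cloud acquisition distance
    trunc: assumed truncation distance in meters
    """
    w_obs = 1.0 / acquisition_dist
    d_obs = np.clip(sdf_obs, -trunc, trunc)
    new_weight = weight + w_obs
    new_tsdf = (tsdf * weight + d_obs * w_obs) / new_weight
    return new_tsdf, new_weight
```

Because the weight is the reciprocal of the acquisition distance, an observation taken from 1 m away counts twice as much as one taken from 2 m, which favors measurements made closer to the lining.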
[0139] S43. The encoder front end of the original semantic segmentation model is split into image feature branches and point cloud feature branches set in parallel, and a physical mechanism deep learning model with parallel dual-branch network structure is constructed.
[0140] Specifically, as shown in Figure 2, using U-Net or DeepLabV3+ as the original semantic segmentation model, the encoder-decoder architecture is retained, and the encoder front end is split into image feature branches and point cloud feature branches set in parallel to construct a dual-branch parallel physical mechanism deep learning model.
[0141] Both branches employ a symmetrical encoder hierarchy, consisting of 5-layer downsampling convolutional pooling structures with 3×3 kernels, a stride of 1, and Same padding. The weight parameters of the two branches are not shared. The input to the image feature branch is a 3-channel tunnel panoramic image, while the input to the point cloud feature branch is a 2-channel depth map and normal vector map. The output feature maps of the two branches have completely identical dimensions to ensure dimensionality matching for subsequent fusion.
[0142] S44. The image feature branch takes the panoramic image of the tunnel as input, and outputs an optimized texture feature map through multi-scale texture feature extraction and spatial attention module embedding.
[0143] In the description of this invention, the image feature branch takes the tunnel panoramic image as input and generates an optimized texture feature map through multi-scale texture feature extraction and spatial attention module embedding, including:
[0144] S441. Using a cascaded convolutional pooling structure corresponding to the encoder of the original semantic segmentation model, multi-scale texture features are extracted to obtain two-dimensional feature maps at each level.
[0145] Specifically, as shown in Figure 3, the image feature branch adopts a cascaded convolutional pooling structure corresponding to the base model encoder. The input is a 512×512×3 tunnel panoramic image, which passes through 5 layers of downsampling operations. Each layer contains two 3×3 convolutions, one ReLU activation function, and one 2×2 max pooling operation. The layers extract multi-scale texture features of 64, 128, 256, 512, and 1024 channels in sequence, yielding 5 levels of two-dimensional feature maps. The feature map of each level corresponds to the skip connection port of the decoder to realize the transfer of multi-scale features.
[0146] S442. At the skip connections of each layer in the image feature branch, embed a spatial attention module based on the physical prior of the tunnel structure; and generate a binarized disease high-incidence probability map of the same size as the input image based on the physical prior of the tunnel structure and the historical disease data stored in the global spatiotemporal memory.
[0147] Specifically, spatial attention modules based on the physical prior of the tunnel structure are embedded in the skip connection parts of layers 2-4 of the image feature branch, with each module corresponding to a feature map of a layer.
[0148] The spatial attention module first generates a binary high-incidence probability map of defects, which is the same size as the input feature map of this level, based on the prior physical data of the tunnel structure and the historical defect data stored in the global spatiotemporal memory. The prior physical data of the tunnel structure includes the following: the 120° range of the tunnel arch, the circumferential construction joint, the longitudinal construction joint, and the expansion joint are high-incidence areas of defects, and the corresponding pixel values of the probability map are assigned a value of 1, while the other areas are assigned a value of 0. At the same time, the historical defect recurrence probability of this area in the global spatiotemporal memory is superimposed to the pixel values of the probability map and weighted to correct them, and finally the binary high-incidence probability map of defects is generated.
[0149] In the description of this invention, generating a binarized high-incidence probability map of defects of the same size as the input image, based on prior physical knowledge of the tunnel structure and historical defect data stored in a global spatiotemporal memory, includes:
[0150] S4421. Based on the tunnel design parameters and historical defect data stored in the global spatiotemporal memory, a basic probability map is generated, and the panoramic tunnel image spliced by the UAV cluster detection is input into the lightweight defect screening network to obtain an initial defect distribution heat map.
[0151] Specifically, in the model inference stage, a basic probability map of the same size as the input image is first generated based on the tunnel design parameters, the historical disease locations stored in the global spatiotemporal memory, the recurrence probability, and the treatment records. The grid size of the basic probability map corresponds one-to-one with the pixels of the input image.
[0152] The preliminary panoramic image of the tunnel, stitched together in this detection, is then input into the lightweight MobileNetV3 disease screening network. This network is pre-trained on a tunnel disease dataset. The input image size is 512×512×3, and the output is a single-channel disease confidence heatmap of the same size as the input, i.e., the initial disease distribution heatmap. Each pixel value in the heatmap represents the suspected confidence of the disease at that location, with a value range of 0-1. The inference speed of the screening network is no less than 30 frames / second to ensure that it does not affect the overall detection efficiency.
[0153] The lightweight MobileNetV3 disease screening network of this invention is based on a modified and adapted version of MobileNetV3-Small. Its core architecture includes 16 depthwise separable convolutional layers (3×3 kernel size, alternating strides of 1 / 2), 3 SE channel attention modules (compression rate set to 8), h-swish activation function, and 2×2 max pooling. For tunnel disease detection scenarios, the network classification head is modified: the original classification layer is removed and replaced with a 1×1 convolutional layer + upsampling layer, using bilinear interpolation and a scaling factor of 8, so that the network output dimension is completely consistent with the 512×512 input image.
[0154] During the training phase, a dataset containing 100,000 annotated images of tunnel defects, covering four categories: cracks, seepage, spalling, and background, was used for pre-training. The dataset was divided into training, validation, and test sets in an 8:1:1 ratio. The batch size was set to 32, the initial learning rate to 1e-4, and the AdamW optimizer was used. The loss function was binary cross-entropy loss. The training iterations were 80 epochs. Data augmentation was performed by random cropping (224-512px), random flipping, and brightness / contrast perturbation to prevent overfitting.
[0155] In the inference phase, the input 512×512×3 tunnel panoramic image is first normalized by dividing the pixel values by 255 and then standardizing with a mean of 0.485 and a variance of 0.229. The normalized image is then fed into the modified MobileNetV3 network, where multi-scale disease texture features are extracted via depthwise separable convolution. The SE module enhances the feature weights of disease regions. Finally, the 1×1 convolution of the output layer is used to calculate the disease suspicion confidence of each pixel, with a value of 0-1, where 0 represents no disease and 1 represents extremely high suspicion. This directly generates a single-channel initial disease distribution heatmap of the same size as the input.
[0156] S4422. The basic probability map and the initial disease distribution heat map are weighted and fused to generate a personalized dynamic disease high-incidence probability map for this UAV cluster detection task.
[0157] Specifically, the basic probability map and the initial disease distribution heat map are weighted and merged with a weight of 0.4:0.6. The fusion formula is: $M_{\text{fusion}} = 0.4 \times M_{\text{basic}} + 0.6 \times M_{\text{initial}}$, where $M_{\text{fusion}}$ is the fused probability map, $M_{\text{basic}}$ is the basic probability map, and $M_{\text{initial}}$ is the initial disease distribution heat map.
[0158] The fused probability map is smoothed using a 5×5 Gaussian kernel to eliminate discrete jumps. Then, a binarized disease high-incidence probability map is generated by threshold segmentation. The segmentation threshold is set to 0.5, and regions with pixel values ≥ 0.5 are assigned a value of 1, while regions < 0.5 are assigned a value of 0. Finally, a personalized dynamic disease high-incidence probability map is generated for this detection task.
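The weighted fusion and threshold segmentation described above can be sketched as follows. This is a minimal illustration in plain Python, assuming both maps are already the same size and hold values in [0, 1]. The function name `fuse_and_binarize` and the toy 2×2 maps are hypothetical, and the 5×5 Gaussian smoothing step is left out to keep the sketch short.

```python
def fuse_and_binarize(m_basic, m_initial, w_basic=0.4, w_initial=0.6, threshold=0.5):
    """Fuse two same-sized probability maps and binarize the result.

    m_basic / m_initial are nested lists (rows of floats in [0, 1]).
    Returns (fused map, binary map). Gaussian smoothing is omitted here.
    """
    same_rows = len(m_basic) == len(m_initial)
    if not same_rows or any(len(a) != len(b) for a, b in zip(m_basic, m_initial)):
        raise ValueError("maps must have the same shape")
    # Pixel-wise weighted sum: 0.4 * basic + 0.6 * initial screening.
    fused = [[w_basic * b + w_initial * s for b, s in zip(row_b, row_s)]
             for row_b, row_s in zip(m_basic, m_initial)]
    # Threshold segmentation: values >= threshold become 1, others 0.
    binary = [[1 if v >= threshold else 0 for v in row] for row in fused]
    return fused, binary


# Toy example (made-up values, not measured data).
m_basic = [[0.9, 0.2], [0.6, 0.0]]
m_initial = [[0.8, 0.1], [0.5, 1.0]]
fused, binary = fuse_and_binarize(m_basic, m_initial)
```

In this toy case the lower-right pixel has no historical prior but a very high screening score, so the 0.6 weight on the screening heat map is enough to mark it as a high-incidence pixel.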
[0159] S4423. Add the high-incidence probability map of diseases generated from each detection and the corresponding disease annotation data to the training dataset for incremental iterative updates of the physical mechanism deep learning model.
[0160] Specifically, during the model training phase, the personalized dynamic disease high-incidence probability map generated from each detection task, along with its corresponding manually labeled disease precision segmentation map, is added to the model's incremental training dataset. The dataset is divided into an incremental training set and a validation set in an 8:2 ratio. An incremental training method using transfer learning is employed, freezing the underlying weights of the model's encoder and only fine-tuning the spatial attention module and the encoder's higher-level weights. The training batch size is set to 16, the initial learning rate is set to 1e-5, and the number of training iterations is set to 20 epochs. Through incremental training, the weight distribution of the spatial attention module is continuously optimized, forming a continuous learning loop of detection, data accumulation, model optimization, and more accurate detection, thereby continuously improving the model's adaptability to different tunnel scenarios.
[0161] S443. Perform convolution operation on the high probability map of disease occurrence using a standardized Gaussian kernel, and generate an attention weight map through normalization processing.
[0162] Specifically, a 5×5 standardized Gaussian kernel with a standard deviation σ=1.0 is used to perform a two-dimensional convolution operation on the binarized disease high-incidence probability map, eliminating hard boundaries of the probability map and generating a weight distribution with a natural transition. Then, a Min-Max normalization function is used to normalize the convolved probability map, mapping the weight values to the [0, 1] interval, and finally generating an attention weight map of the same size as the input feature map. The calculation formula is: $A_l = \mathrm{Norm}(G_\sigma * M_l)$, where $*$ is the two-dimensional convolution operation, $G_\sigma$ is the standardized Gaussian kernel, $\mathrm{Norm}(\cdot)$ is the Min-Max normalization function, $M_l$ is the binarized disease high-incidence probability map at the corresponding level, and $A_l$ is the resulting attention weight map.
[0163] S444. Perform element-wise multiplication of the attention weight map with the corresponding level's two-dimensional feature map to obtain the optimized texture feature map.
[0164] Specifically, the generated attention weight map is multiplied element-wise with the two-dimensional feature map Fl1 output by the encoder at the corresponding level. Before multiplication, the size and number of channels of the attention weight map are matched to those of the two-dimensional feature map, with the channel dimension aligned through a broadcast mechanism.
[0165] The multiplication formula is: $F'_{l1} = A_l \odot F_{l1}$, where $F'_{l1}$ is the optimized texture feature map, $A_l$ is the attention weight map, $F_{l1}$ is the two-dimensional feature map Fl1 output by the encoder at the corresponding level, and $\odot$ is the element-wise multiplication operation. Through this operation, the feature weights of high-incidence disease areas are enhanced, while the feature weights of non-disease areas are suppressed, guiding the model to focus on high-incidence disease areas, thereby improving the targeting of feature extraction and the inference efficiency of the model.
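Steps S443 and S444 can be illustrated with a small pure-Python sketch: a 5×5 Gaussian kernel (σ = 1.0) normalized to sum to 1, a same-size convolution, Min-Max normalization, and a broadcast element-wise multiplication. The function names, the zero-padding choice at the borders, and the toy 7×7 map are assumptions made for illustration. The binary map is assumed to have already been resized to the feature map's resolution.

```python
import math


def gaussian_kernel(size=5, sigma=1.0):
    """Return a size x size Gaussian kernel whose entries sum to 1."""
    half = size // 2
    raw = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            for x in range(-half, half + 1)]
           for y in range(-half, half + 1)]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]


def convolve_same(image, kernel):
    """2D convolution with zero padding (assumed); output keeps the input size.

    The kernel is symmetric, so correlation and convolution coincide here.
    """
    h, w, k = len(image), len(image[0]), len(kernel)
    half = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - half, j + dj - half
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def min_max_normalize(image):
    """Map values linearly to [0, 1]; a constant map becomes all zeros."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        return [[0.0] * len(row) for row in image]
    return [[(v - lo) / (hi - lo) for v in row] for row in image]


def apply_attention(feature_map, weights):
    """Multiply a C x H x W feature map by an H x W weight map (broadcast over C)."""
    return [[[f * a for f, a in zip(f_row, a_row)]
             for f_row, a_row in zip(channel, weights)]
            for channel in feature_map]


# Toy example: one high-incidence pixel in the centre of a 7 x 7 map.
binary_map = [[0] * 7 for _ in range(7)]
binary_map[3][3] = 1
attention = min_max_normalize(convolve_same(binary_map, gaussian_kernel()))
features = [[[2.0] * 7 for _ in range(7)] for _ in range(3)]  # 3 channels
optimized = apply_attention(features, attention)
```

The hard 0/1 boundary of the binary map becomes a smooth falloff around the marked pixel, so features next to a high-incidence area are attenuated gradually rather than cut off.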
[0166] S45. The point cloud feature branch takes the complete 3D point cloud data of the tunnel as input, projects it onto the image coordinate system, and generates a depth map and a normal vector map of the same size. It then uses a convolutional pooling structure symmetric to that of the image feature branch to extract geometric features and obtain 3D feature maps at each level, and finally generates a 2D geometric feature map with the same dimensions as the image feature branch output by inverse projection from 3D to 2D.
[0167] Specifically, as shown in Figure 4, the point cloud feature branch takes the complete 3D point cloud data of the tunnel as input. First, the 3D point cloud data is projected onto the image coordinate system through a pinhole camera projection model. The projection uses the same intrinsic parameter matrix as the optical camera to generate a single-channel depth map and a single-channel normal vector map of the same size as the panoramic image of the tunnel. These are then concatenated to form a 2-channel geometric feature map as the branch input. A 5-layer downsampling convolutional pooling structure, completely symmetrical to the image feature branch, is used with a 3×3 kernel, a stride of 1, and Same padding to independently extract the geometric features of the point cloud, obtaining 3D feature maps of 64, 128, 256, 512, and 1024 channels in sequence. Then, a 3D to 2D inverse projection algorithm is used to map the 3D feature maps onto the image coordinate system, generating a 2D geometric feature map of the same size and dimension as the image feature branch, so that it matches the output dimensions of the image feature branch.
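As a rough illustration of the pinhole projection step, the sketch below projects 3D points onto a pixel grid and keeps the nearest depth per pixel. It assumes the points are already expressed in the camera coordinate frame, that the intrinsics reduce to focal lengths `fx`, `fy` and principal point `cx`, `cy` (no distortion), and that 0.0 marks an empty pixel. All names and values are hypothetical, and the normal vector map is not covered.

```python
def project_to_depth_map(points, fx, fy, cx, cy, width, height):
    """Project camera-frame points (X, Y, Z) to a depth map via the pinhole model.

    u = fx * X / Z + cx, v = fy * Y / Z + cy. When several points fall on
    the same pixel, the nearest one (smallest Z) is kept.
    """
    depth = [[0.0] * width for _ in range(height)]  # 0.0 means "no point"
    for x, y, z in points:
        if z <= 0:
            continue  # point is behind the camera and cannot be imaged
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if 0 <= u < width and 0 <= v < height:
            if depth[v][u] == 0.0 or z < depth[v][u]:
                depth[v][u] = z
    return depth


# Toy example with made-up intrinsics and four points.
points = [(0.0, 0.0, 2.0), (0.0, 0.0, 1.5), (1.0, 0.0, 2.0), (0.0, 0.0, -1.0)]
depth_map = project_to_depth_map(points, fx=2.0, fy=2.0, cx=2.0, cy=2.0,
                                 width=5, height=5)
```

Because the same intrinsics are used for the image and the point cloud, pixel (u, v) of the depth map refers to the same lining location as pixel (u, v) of the panoramic image, which is what allows the two branches to be fused channel by channel.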
[0168] S46. An adaptive gating fusion unit is set at the end of the encoder to perform channel-weighted fusion of the optimized texture feature map and the two-dimensional geometric feature map, and output a multimodal fusion feature with both texture discrimination and geometric discrimination as the initial input tensor of the decoder; and a total loss function incorporating the tunnel morphological constraint is used as the loss function of the physical mechanism deep learning model.
[0169] Specifically, an adaptive gated fusion unit is set at the lowest resolution level at the end of the encoder. The unit contains two parallel fully connected layers, which respectively receive the optimized texture feature map output by the image feature branch and the two-dimensional geometric feature map output by the point cloud feature branch. The fully connected layers learn a weight for the features of each of the two branches, and the multimodal fusion feature is then generated through channel-weighted fusion of the two feature maps using these learned weights. The output fused features have both texture discrimination and geometric discrimination capabilities, and serve as the initial input tensor for the decoder.
[0170] Simultaneously, tunnel morphology constraints are added to the loss function of the base model to construct the total loss function, as shown in the formula: $L_{\text{total}} = L_{\text{CE}} + \lambda L_{\text{morph}}$, where $L_{\text{CE}}$ is the weighted cross-entropy loss function, with weights of 0.4, 0.3, 0.2, and 0.1 for the four target categories: cracks, seepage, spalling, and background; $L_{\text{morph}}$ is the tunnel morphological constraint loss, calculated as the mean square error between the model-predicted segmentation map and the same predicted segmentation map after morphological opening and closing operations with a 3×3 cross-shaped structural element; and $\lambda$ is the morphological constraint coefficient, with a value range of 0.2-0.5 and a preferred value of 0.3.
[0171] Furthermore, the trained physical mechanism deep learning model is obtained as follows:
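The structure of the total loss can be illustrated numerically with the sketch below. It is not a differentiable training implementation: it evaluates the weighted cross-entropy on a list of per-pixel class probabilities and the morphological term on a hard binary mask. The sketch assumes that "opening and closing" means an opening followed by a closing, that pixels outside the image count as background, and that the cross-entropy is averaged over pixels. The function names and toy inputs are hypothetical, and the two terms are fed separate toy inputs purely for illustration.

```python
import math

# Class weights for the weighted cross-entropy, as stated in the text.
CLASS_WEIGHTS = {"crack": 0.4, "seepage": 0.3, "spalling": 0.2, "background": 0.1}
# Offsets of the 3x3 cross-shaped structuring element.
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]


def erode(mask):
    """Binary erosion; pixels outside the image are treated as 0 (assumed)."""
    h, w = len(mask), len(mask[0])
    return [[1 if all(0 <= i + di < h and 0 <= j + dj < w and mask[i + di][j + dj]
                      for di, dj in CROSS) else 0
             for j in range(w)] for i in range(h)]


def dilate(mask):
    """Binary dilation with the cross-shaped element."""
    h, w = len(mask), len(mask[0])
    return [[1 if any(0 <= i + di < h and 0 <= j + dj < w and mask[i + di][j + dj]
                      for di, dj in CROSS) else 0
             for j in range(w)] for i in range(h)]


def open_close(mask):
    """Opening (erode, dilate) followed by closing (dilate, erode)."""
    opened = dilate(erode(mask))
    return erode(dilate(opened))


def morph_loss(mask):
    """Mean square error between a mask and its opened-and-closed version."""
    cleaned = open_close(mask)
    n = len(mask) * len(mask[0])
    return sum((a - b) ** 2
               for row_a, row_b in zip(mask, cleaned)
               for a, b in zip(row_a, row_b)) / n


def weighted_ce(probs, labels):
    """Weighted cross-entropy averaged over pixels (averaging is an assumption)."""
    total = sum(-CLASS_WEIGHTS[y] * math.log(p[y]) for p, y in zip(probs, labels))
    return total / len(labels)


def total_loss(probs, labels, mask, lam=0.3):
    """L_total = L_CE + lambda * L_morph."""
    return weighted_ce(probs, labels) + lam * morph_loss(mask)


# Toy mask: a plus-shaped blob plus one isolated false-positive pixel.
mask = [[0] * 7 for _ in range(7)]
for i, j in [(3, 3), (2, 3), (4, 3), (3, 2), (3, 4), (1, 5)]:
    mask[i][j] = 1
probs = [
    {"crack": 0.5, "seepage": 0.2, "spalling": 0.2, "background": 0.1},
    {"crack": 0.0, "seepage": 0.0, "spalling": 0.0, "background": 1.0},
]
labels = ["crack", "background"]
loss = total_loss(probs, labels, mask)
```

In the toy mask only the isolated pixel is removed by the opening, so the morphological term penalizes exactly that speckle while leaving the compact blob untouched.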
[0172] Two-dimensional images and three-dimensional point cloud data of the tunnel lining surface were collected in advance for training, and the two-dimensional images and three-dimensional point cloud data were preprocessed to obtain preprocessed two-dimensional images and preprocessed three-dimensional point cloud data; the preprocessed two-dimensional images were annotated with a labeling tool to obtain an annotated two-dimensional image.
[0173] The labeled 2D image is input into the first branch of the dual-branch network of the physical mechanism deep learning model to extract texture features, and the preprocessed 3D point cloud data is input into the second branch of the dual-branch network of the physical mechanism deep learning model to extract geometric features. A disease segmentation prediction map is obtained through forward propagation, the loss function of the physical mechanism deep learning model is calculated, and the gradient of the loss function with respect to the parameters of the physical mechanism deep learning model is calculated using the backpropagation algorithm. The model parameters are updated using an optimizer. If the physical mechanism deep learning model converges or reaches the maximum number of iterations, the iteration stops and the trained physical mechanism deep learning model is output; otherwise, the next round of iteration is performed.
[0174] S47. Obtain the disease segmentation map generated by identification, and match the corresponding tunnel segment entry in the global spatiotemporal memory bank according to the spatiotemporal marker range corresponding to the coverage area to realize the archiving and storage of the disease segmentation map.
[0175] Specifically, after obtaining the disease segmentation map output by the model inference, the segmentation map is first post-processed. Small false detection areas are removed by morphological opening and closing operations, the spatial coordinate range of each disease connected domain is extracted, and the corresponding spatiotemporal marker mileage station number and cross-sectional coordinates are matched.
[0176] Then, based on the spatiotemporal marker range corresponding to the disease coverage area, the storage entry of the corresponding tunnel segment in the global spatiotemporal memory is matched, and the disease segmentation map is stored in the entry in PNG lossless format. At the same time, the type, confidence level, and corresponding spatiotemporal marker information of the disease are stored synchronously. After storage, the index directory of the global spatiotemporal memory is updated to ensure that the corresponding disease segmentation map data can be quickly retrieved through tunnel identifier, mileage station number, and spatiotemporal marker.
[0177] S5. Spatial registration is performed between the defect segmentation map and the complete 3D point cloud data of the tunnel based on a unified spatiotemporal reference, and the geometric parameters of various defects are calculated based on the registered 3D point cloud data.
[0178] In the description of this invention, the defect segmentation map and the complete three-dimensional point cloud data of the tunnel are spatially registered based on a unified spatiotemporal reference, and the geometric parameters of various defects are calculated based on the registered three-dimensional point cloud data, including:
[0179] S51. If the defect is a crack, extract the crack skeleton line from the registered complete 3D point cloud data of the tunnel, and calculate the actual width, length and direction of the crack respectively.
[0180] Specifically, if the type of damage is cracks, firstly, based on the spatial coordinates of the damage segmentation map, all point clouds belonging to the crack category are segmented from the registered complete 3D point cloud data of the tunnel to form crack point cloud clusters; then, the least squares method is used to perform local plane fitting on the crack point cloud clusters, with the mean square error of the fitting plane not exceeding 0.1m, and the crack point cloud clusters are projected onto the local fitting plane to generate a 2D binary crack image.
[0181] The Zhang-Suen image thinning algorithm is used to thin the binary crack image, extract the crack center line with a width of one pixel, i.e. crack skeleton line, and then map the two-dimensional skeleton line back to three-dimensional space to generate a three-dimensional crack skeleton line.
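A compact pure-Python sketch of Zhang-Suen thinning on a binary image is given below for illustration. It assumes the image is a nested list of 0/1 values with at least a one-pixel background border. The function name and the toy crack image are hypothetical, and mapping the skeleton back to 3D is not shown.

```python
def zhang_suen_thinning(image):
    """Thin a binary image (nested lists of 0/1) to a one-pixel-wide skeleton.

    Assumes a background border of at least one pixel around the foreground.
    """
    img = [row[:] for row in image]
    h, w = len(img), len(img[0])

    def neighbours(i, j):
        # P2..P9, clockwise starting from the pixel directly above.
        return [img[i - 1][j], img[i - 1][j + 1], img[i][j + 1], img[i + 1][j + 1],
                img[i + 1][j], img[i + 1][j - 1], img[i][j - 1], img[i - 1][j - 1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for i in range(1, h - 1):
                for j in range(1, w - 1):
                    if not img[i][j]:
                        continue
                    n = neighbours(i, j)
                    p2, p3, p4, p5, p6, p7, p8, p9 = n
                    b = sum(n)  # number of foreground neighbours
                    # a = number of 0 -> 1 transitions around the pixel
                    a = sum(1 for x, y in zip(n, n[1:] + n[:1]) if x == 0 and y == 1)
                    if step == 0:
                        cond = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        cond = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= b <= 6 and a == 1 and cond:
                        to_clear.append((i, j))
            # Deletions are applied only after the full scan of each sub-step.
            for i, j in to_clear:
                img[i][j] = 0
            changed = changed or bool(to_clear)
    return img


# Toy example: a horizontal "crack" three pixels thick.
crack = [[0] * 12 for _ in range(7)]
for i in range(2, 5):
    for j in range(2, 10):
        crack[i][j] = 1
skeleton = zhang_suen_thinning(crack)
```

For this toy bar, the skeleton collapses onto the middle row, which corresponds to the crack center line described above.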
[0182] Sampling points are set at 0.05m intervals along the three-dimensional crack skeleton line. At each sampling point, a point cloud profile perpendicular to the skeleton line is intercepted along the normal direction of the skeleton line. The straight-line distance between the two edge points of the crack within the profile is calculated as the crack width at that sampling point. The overall crack width is taken as the maximum value and average value of the widths of all sampling points. The total length of the crack is calculated by accumulating the three-dimensional Euclidean distance between adjacent sampling points of the three-dimensional skeleton line. The length calculation error does not exceed 0.1m. The principal component analysis (PCA) algorithm is used to extract the main extension direction of the three-dimensional skeleton line. The spatial azimuth angle between the main extension direction and the longitudinal axis of the tunnel is calculated to determine the direction of the crack.
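The length and direction calculations can be sketched as follows, assuming the 3D skeleton has already been resampled into an ordered list of points. The principal direction is estimated here with a simple power iteration on the 3×3 covariance matrix, which is one possible way to obtain the first PCA axis. The function names, the start vector, and the choice of (1, 0, 0) as the tunnel's longitudinal axis are illustrative assumptions. Width measurement from cross-profiles is not covered.

```python
import math


def polyline_length(points):
    """Sum of 3D Euclidean distances between consecutive skeleton points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def principal_direction(points, iterations=100):
    """First PCA axis of a 3D point set, found by power iteration.

    Note: power iteration can fail if the start vector happens to be
    orthogonal to the dominant axis; that edge case is not handled here.
    """
    n = len(points)
    mean = [sum(p[k] for p in points) / n for k in range(3)]
    centered = [[p[k] - mean[k] for k in range(3)] for p in points]
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(3)]
           for a in range(3)]
    v = [1.0, 0.7, 0.3]  # arbitrary non-symmetric start vector
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(3)) for a in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("direction is undefined for this point set")
        v = [x / norm for x in w]
    return v


def angle_to_axis_deg(direction, axis):
    """Angle in [0, 90] degrees between a direction and the tunnel axis."""
    dot = abs(sum(d * a for d, a in zip(direction, axis)))
    norm = (math.sqrt(sum(d * d for d in direction))
            * math.sqrt(sum(a * a for a in axis)))
    return math.degrees(math.acos(min(1.0, dot / norm)))


# Toy skeleton: 21 points at 0.05 m spacing along a 45-degree diagonal.
step = 0.05 / math.sqrt(2.0)
skeleton_3d = [(i * step, i * step, 0.0) for i in range(21)]
length_m = polyline_length(skeleton_3d)
angle_deg = angle_to_axis_deg(principal_direction(skeleton_3d), (1.0, 0.0, 0.0))
```

The absolute value of the dot product is used because the sign of a PCA axis is arbitrary, so the reported angle always lies between 0° (longitudinal crack) and 90° (circumferential crack).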
[0183] S52. If the defect is leakage or spalling, calculate the actual area and perimeter of the leakage or spalling.
[0184] Specifically, if the type of damage is leakage or spalling, the corresponding type of damage point cloud clusters are first segmented from the registered complete 3D point cloud data of the tunnel based on the spatial coordinates of the damage segmentation map.
[0185] For leakage defects, a 3D convex hull algorithm is used to construct the minimum convex hull of the point cloud cluster in the leakage area, and the surface area of the convex hull is calculated as the actual area of the leakage area. The 3D boundary points of the point cloud cluster in the leakage area are extracted, and the perimeter of the leakage area is calculated by accumulating the 3D Euclidean distance between adjacent boundary points.
[0186] For the spalling disease, the least squares method is used to perform planar fitting on the point cloud of the complete lining around the spalling area to obtain the lining reference plane. The point cloud of the spalling area is projected onto the reference plane, and the area of the projected area is calculated as the actual area of the spalling area. The three-dimensional boundary points of the point cloud cluster of the spalling area are extracted, and the spatial connection length of adjacent boundary points is accumulated to calculate the perimeter of the spalling area.
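The spalling area and perimeter calculation can be illustrated with the sketch below. It assumes the lining reference plane is already known (a point on the plane and its normal, instead of a least squares fit) and that the boundary points are ordered around the region. The projected area is computed with the shoelace formula in an in-plane coordinate basis. All names and the toy boundary are hypothetical.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(a):
    n = math.sqrt(dot(a, a))
    return (a[0] / n, a[1] / n, a[2] / n)


def plane_basis(normal):
    """Two orthonormal vectors spanning the plane with the given normal."""
    n = normalize(normal)
    helper = (1.0, 0.0, 0.0) if abs(n[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = normalize(cross(n, helper))
    return u, cross(n, u)


def projected_area(boundary, origin, normal):
    """Area of the ordered boundary after projection onto the reference plane."""
    u, v = plane_basis(normal)
    # Taking only the u and v components discards the out-of-plane offset.
    pts = [(dot(sub(p, origin), u), dot(sub(p, origin), v)) for p in boundary]
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]))
    return abs(twice_area) / 2.0


def perimeter(boundary):
    """Accumulated 3D distance between adjacent boundary points (closed loop)."""
    return sum(math.dist(a, b) for a, b in zip(boundary, boundary[1:] + boundary[:1]))


# Toy example: a 2 m x 1 m spalled patch recessed 0.1 m from the plane z = 0.
boundary = [(0.0, 0.0, 0.1), (2.0, 0.0, 0.1), (2.0, 1.0, 0.1), (0.0, 1.0, 0.1)]
area_m2 = projected_area(boundary, origin=(0.0, 0.0, 0.0), normal=(0.0, 0.0, 1.0))
perimeter_m = perimeter(boundary)
```

Projecting onto the reference plane means the recess depth does not inflate the reported area, while the perimeter is still accumulated from the original 3D boundary points.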
[0187] S53. By calculating the geometric parameters within a consecutive preset time period in the same tunnel, the expansion rate and volume change of the disease are calculated. Using the spatiotemporal marker corresponding to the area where the disease is located as an index, the corresponding entry in the global spatiotemporal memory is matched and updated, and the disease type, spatial location and detection time are updated synchronously.
[0188] Specifically, for tunnels undergoing periodic inspections, the process begins by retrieving historical 3D point cloud models from the global spatiotemporal memory. A global rigid registration is then performed between the current inspection and historical point cloud models. Next, a thin-plate spline interpolation algorithm is used to compensate for non-rigid deformation of the tunnel lining, eliminating registration errors caused by minor structural deformations. Based on the registered multi-period point cloud data and the geometric parameters of the defects, the change in geometric parameters of the same defect within adjacent inspection cycles is calculated. The ratio of this change to the time interval between two inspections is used as the defect's propagation rate.
[0189] For spalling defects, a 0.05m voxel grid is used to construct spatial envelopes of the spalling-area point clouds from multiple inspection periods, and the volume difference of the envelopes is calculated to obtain the volume change of the defect. After the calculation, the spatiotemporal markers corresponding to the defect area are used as the core index to match and update the calculated geometric parameters, expansion rate, and volume change in the corresponding defect entries in the global spatiotemporal memory. The defect type, spatial location, detection time, and risk level association information are updated simultaneously to complete the full data update of the memory.
[0190] S6. Based on the geometric parameters, type, location, and UAV swarm detection data of the defects, generate a structured tunnel defect detection report.
[0191] Specifically, based on the full data of the defects detected in this inspection in the global spatiotemporal memory bank and the historical inspection data of the corresponding area, and combined with the geometric parameters, types, spatial locations, and UAV swarm inspection operation data of the defects, a standardized and structured tunnel defect detection report is automatically generated. The report is in PDF format and supports export in editable Word format. The report content follows the relevant requirements of the "Technical Specification for Highway Tunnel Maintenance" (JTG H12).
[0192] The tunnel defect detection report should include at least the following:
1. Defect List: A tabular list of all identified defects, each entry including: unique ID, defect type, location, geometric parameters, and discovery time.
2. Defect Location Map: Defect segments are marked on the tunnel panoramic view or 3D model to create a visual defect distribution map.
3. Quantitative Analysis Table: Detailed quantitative measurement results for each defect.
4. Development Trend Analysis (if periodic detection): Compare historical data to calculate the expansion rate and area / volume change of key defects.
5. Risk Assessment and Treatment Recommendations: Based on the defect's geometric parameters, type, location, and expansion rate, and according to industry standards or tunnel structural safety models, automatically determine the risk level of each defect, and generate preliminary treatment recommendations based on the risk level.
6. Detection Task Summary: Includes detection time, tunnel section, configuration of the drone fleet used, and data quality assessment.
[0193] Referring to Figure 5, a tunnel defect intelligent detection system based on UAV swarm collaboration is further provided, which includes:
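One simple way to read the voxel-based volume comparison is to count occupied 0.05 m voxels in each inspection period and take the difference, then divide by the inspection interval to get a rate. The sketch below follows that reading. Treating the "spatial envelope" as the set of occupied voxels is an assumption, and the function names, point sets, and the 30-day interval are hypothetical.

```python
import math


def voxel_volume(points, voxel_size=0.05):
    """Approximate volume as (number of occupied voxels) x (voxel volume)."""
    occupied = {tuple(math.floor(c / voxel_size) for c in p) for p in points}
    return len(occupied) * voxel_size ** 3


def change_rate(previous, current, interval_days):
    """Change of a geometric parameter per day between two inspections."""
    if interval_days <= 0:
        raise ValueError("interval_days must be positive")
    return (current - previous) / interval_days


# Toy point sets located at voxel centres (made-up data, in metres).
earlier = [(0.025, 0.025, 0.025), (0.075, 0.025, 0.025)]
later = earlier + [(0.125, 0.025, 0.025), (0.025, 0.075, 0.025),
                   (0.075, 0.075, 0.025)]

volume_change = voxel_volume(later) - voxel_volume(earlier)  # in cubic metres
rate_per_day = change_rate(voxel_volume(earlier), voxel_volume(later),
                           interval_days=30)
```

The same `change_rate` helper applies to any of the geometric parameters (crack length, width, area), since the propagation rate is defined above as the parameter change divided by the time between two inspections.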
[0194] The region segmentation module 1 is used to divide the tunnel space to be detected into multiple detection sub-regions and assign them to the corresponding UAV swarm groups. A cooperative flight path with time-series information is generated based on a unified time reference.
[0195] Data acquisition module 2 is used to control a swarm of drones to fly into the tunnel based on a cooperative flight path, and synchronously collect two-dimensional images and three-dimensional point cloud data of the tunnel wall according to a unified time reference, and bind them with a unique spatiotemporal marker.
[0196] Data processing module 3 is used to transmit spatiotemporally labeled 2D images and 3D point cloud data to edge computing nodes for preprocessing, and then classify, package, and upload them to the cloud data processing center according to spatiotemporal labels.
[0197] The identification module 4 is used in the cloud data processing center to complete the registration of multi-source data based on spatiotemporal markers, and to stitch two-dimensional images to generate a panoramic view of the tunnel and fuse three-dimensional point cloud data to generate complete three-dimensional point cloud data of the tunnel. These data are then input into the physical mechanism deep learning model for segmentation and identification to obtain the defect segmentation map.
[0198] Quantification module 5 is used to spatially register the defect segmentation map with the complete 3D point cloud data of the tunnel based on a unified spatiotemporal reference, and to calculate the geometric parameters of various defects based on the registered 3D point cloud data.
[0199] Report module 6 is used to generate structured tunnel defect detection reports based on the geometric parameters, type, location, and UAV swarm detection data of the defects.
[0200] The following detailed description, in conjunction with specific embodiments, provides further explanation of the intelligent detection method and system for tunnel defects based on UAV swarm collaboration designed in this invention.
[0201] This project targets a 20m continuous section containing the core sections of a shield tunnel at mileages 9.80 and 29.80, which has been in operation for five years. The tunnel has a standard circular cross-section with a designed cross-sectional area of 9.1㎡, a designed inner diameter of 3.4m, and a designed perimeter of 10.6814m. During its nearly five years of operation, the tunnel has repeatedly exhibited typical structural defects such as segment cracking, circumferential construction joint leakage, and concrete spalling. In some areas, these defects have shown a tendency to recur even after initial treatment. The technical solution of this invention is needed to achieve accurate prediction of the risk of defect recurrence and optimal allocation of drone inspection resources, enabling efficient and accurate detection of tunnel defects during the operational period.
[0202] The core data support in this embodiment comes from the global spatiotemporal memory database, which has completed the archiving and storage of the tunnel's full-cycle operation and inspection data. The core data includes: tunnel design parameters and 3D BIM model, 3D point cloud model and panoramic image data of the tunnel from the past 5 years of inspections, types, spatial locations, circumferential distributions, recurrence times, treatment records, and development trend data of historical defects, as well as a priori physical characteristics data of the tunnel structure, providing complete data support for the calculation of defect recurrence probability and risk classification.
[0203] This embodiment calculates the probability of disease recurrence for tunnel sections and the entire tunnel area based on historical data from a global spatiotemporal memory, and generates a visual heatmap. The specific implementation steps are as follows:
[0204] 1. Tunnel cross-section grid zoning: Based on the circumferential structural characteristics of the tunnel's circular cross-section, and combined with the segmentation and construction joint distribution, the 9.80 mileage cross-section is divided into 42 independent grid regions along the circumference, and the 29.80 mileage cross-section is divided into 40 independent grid regions along the circumference. Each grid region corresponds to a fixed circumferential position of the tunnel segment, achieving a one-to-one precise binding between the recurrence probability calculation results and the tunnel's spatial position.
[0205] 2. Quantitative Calculation of Disease Recurrence Probability: Based on multi-dimensional historical data from the global spatiotemporal memory, the disease recurrence probability for each grid region is quantitatively calculated. The calculation formula is as follows:
[0206] P = 0.6 × P1 + 0.3 × P2 + 0.1 × P3;
[0207] Among them, P1 is the weight of the frequency of historical disease recurrence, which is assigned a value based on the number of historical disease recurrences in the area; P2 is the physical prior weight of the tunnel structure, which is weighted more for high-incidence areas of disease such as the 120° range of the tunnel arch, circumferential construction joints, longitudinal construction joints, and deformation joints; P3 is the weight of the duration after disease treatment, which is assigned a value based on the operating duration after the last disease treatment in the area. Finally, the probability of disease recurrence in each grid area is calculated, with a value range of 0-100%.
[0208] 3. Heatmap hierarchical visualization mapping: Using the tunnel design cross-section outline and the tunnel 3D point cloud model collected by the UAV of this invention as the background, the calculated recurrence probability of each region is mapped to the corresponding grid region through color hierarchy to generate a heatmap of single-section disease recurrence probability; based on the calculation results of each section in the whole interval, a heatmap of the recurrence probability of disease in the entire longitudinal section of the tunnel is generated simultaneously.
[0209] The recurrence probability grading standard implemented this time is as follows:
[0210] Dark red area: Recurrence probability > 80%, indicating an extremely high-risk area;
[0211] Orange zone: Recurrence probability 50%-80%, a high-risk area;
[0212] Yellow area: Recurrence probability 20%-50%, which is a medium-risk area;
[0213] Green zone: Recurrence probability <20%, which is a low-risk zone.
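The weighted recurrence probability and the color grading can be sketched as below. The text does not specify how P1, P2, and P3 are assigned, so the sketch simply takes them as inputs in [0, 1]. The function names are hypothetical. Because the stated ranges share their boundary values (50% and 20%), the sketch assigns a boundary value to the higher-risk grade, which is an assumption.

```python
def recurrence_probability(p1, p2, p3):
    """P = 0.6 * P1 + 0.3 * P2 + 0.1 * P3, with every term in [0, 1]."""
    for value in (p1, p2, p3):
        if not 0.0 <= value <= 1.0:
            raise ValueError("P1, P2 and P3 must lie in [0, 1]")
    return 0.6 * p1 + 0.3 * p2 + 0.1 * p3


def risk_grade(probability):
    """Map a recurrence probability (0..1) to (color, risk level)."""
    if probability > 0.80:
        return "dark red", "extremely high risk"
    if probability >= 0.50:  # boundary goes to the higher grade (assumed)
        return "orange", "high risk"
    if probability >= 0.20:  # boundary goes to the higher grade (assumed)
        return "yellow", "medium risk"
    return "green", "low risk"


# Toy grid regions with made-up P1, P2, P3 values.
crown = recurrence_probability(1.0, 1.0, 0.5)
floor = recurrence_probability(0.0, 0.5, 0.0)
crown_grade = risk_grade(crown)
floor_grade = risk_grade(floor)
```

Applying `risk_grade` to every grid region of a cross-section yields the per-region colors from which the single-section heatmap is drawn.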
[0214] As shown in Figure 7, using the circular outline of the tunnel design cross-section and a 3D point cloud model as a background, the figure displays the probability distribution of disease recurrence in the 42 grid areas of the 9.80 mileage section. Specifically, the 120° area at the tunnel crown is a dark red area of extremely high risk, corresponding to areas with repeated historical cracks and a cumulative total of 4 recurrences; the construction joint areas on the left and right arch waists are orange areas of high risk, corresponding to areas with a high incidence of historical leakage; the sidewall areas are yellow areas of medium risk, with only sporadic repair traces; and the tunnel floor area is a green area of low risk, with no historical disease records. The figure also includes color legends for each risk level, as well as historical disease types and recurrence counts for high-risk areas. Furthermore, the actual measurement data for this tunnel cross-section are shown in Table 1.
[0215] Table 1 Actual Measurement Data of Tunnel Cross-section
[0216]
[0217] As shown in Figure 8, using the circular outline of the tunnel design cross-section and a 3D point cloud model as a background, the figure displays the probability distribution of disease recurrence in 39 grid areas of the 29.80 mileage section. The left arch waist construction joint area is a dark red area of extremely high risk, corresponding to areas with repeated historical leaks and segment spalling, with a cumulative recurrence count of 3. The arch crown area is a green area of low risk, with no historical disease records. The figure also includes color legends for each risk level, as well as historical disease types and recurrence counts for high-risk areas.
[0218] The heatmap of disease recurrence probability generated in this embodiment is directly used as the core input of the UAV swarm inspection S0 pre-planning module of the present invention, fully implementing the entire technical solution of the present invention for inspection task planning, multi-aircraft collaborative scheduling, and intelligent disease identification. Specific applications are as follows:
[0219] The pre-planning of drone swarm inspections is based on the risk classification results of the heat map, and inspection resources are allocated in a differentiated manner. For the dark red extremely high-risk areas at the mileage 9.80 and 29.80 sections, multi-drone collaborative focused acquisition resources are allocated with priority: the main inspection drone is scheduled to complete full-coverage high-definition image acquisition of the area, detail inspection drones are simultaneously scheduled to perform fine-grained shooting while hovering at fixed points, and 3D scanning drones equipped with LiDAR perform densified point cloud acquisition of the area. This increases the point cloud acquisition density of the high-risk area to 3 times that of a regular area and improves the image resolution to 0.2mm / pixel, ensuring high-precision detection with no missed defects in the high-risk area.
[0220] For orange high-risk areas, the frame rate of image and point cloud acquisition is increased; for yellow medium-risk and green low-risk areas, standard parameters are used to complete full-area coverage acquisition, maximizing the efficiency of inspection operations while ensuring detection accuracy.
[0221] The heatmap of cross-sectional recurrence probability generated in this embodiment is directly input into the spatial attention module of the physical mechanism deep learning model of this invention to generate a binarized disease high incidence probability map of the same size as the input image. The feature weights of high recurrence probability areas are enhanced, while the feature weights of low-risk areas are suppressed, guiding the model to focus on disease high incidence areas and improving the targeting of disease feature extraction.
[0222] Based on the recurrence probability classification results of the heat map, different inspection cycles are set for different risk areas of the tunnel: once every month for the dark red extremely high-risk areas, once every 3 months for the orange high-risk areas, and once every 6 months for the yellow medium-risk and green low-risk areas. This replaces the traditional uniform-cycle inspection mode for the entire section and realizes differentiated and refined management of inspections during the tunnel operation period.
[0223] In summary, by utilizing the technical solutions described above in this invention, a one-time, uninterrupted, full-coverage inspection of long tunnels can be achieved through the collaborative operation of a drone swarm. This significantly improves inspection efficiency compared to traditional methods, ensuring comprehensive coverage without blind spots and greatly reducing the safety hazards of manual tunnel inspections. Simultaneously, precise resource allocation further reduces equipment energy consumption and operating costs, enhancing the scalability of the inspection operation. Through multi-view, multi-sensor synchronous acquisition, and relying on spatiotemporal markers to achieve precise spatiotemporal anchoring of all collected data, visual blind spots in traditional inspections such as tunnel arches and sidewall gaps are completely eliminated. Furthermore, after edge and cloud-based collaborative correlation preprocessing of multi-source data, combined with historical data linkage verification from a global spatiotemporal memory, deep fusion of two-dimensional images and three-dimensional point clouds is achieved. This ensures the data possesses both real-time and historical traceability, providing a multi-dimensional and highly reliable data foundation for high-precision quantitative analysis of defects, significantly enhancing the reference value of the quantitative results. Through a collaborative flight fault emergency takeover mechanism, single-point failures do not affect the overall mission. Simultaneously, it integrates the ability to share real-time status and defect suspicion levels of the drone fleet based on spatiotemporal tags. Drones can autonomously complete temporary collaborative adjustments at the edge, not only quickly taking over the detection sub-area of a faulty drone but also conducting multi-drone collaborative focused data collection in areas with high defect suspicion. 
This avoids data loss or missed detection of sudden defects due to faults. The system possesses both global mission fault tolerance and dynamic operational adaptability, resulting in enhanced robustness and operational reliability. The entire process from mission pre-planning to report generation is automated, upgraded to a closed-loop automation of memory, prediction, detection, updating, and iteration. Relying on a global spatiotemporal memory library, it achieves full lifecycle storage and correlation analysis of tunnel defect data. Report generation can automatically integrate historical data from multiple periods to complete defect trend analysis and risk level determination, eliminating the need for manual intervention in historical data comparison and analysis. This significantly reduces reliance on professional maintenance personnel and technical barriers, and further realizes the digital accumulation and intelligent application of tunnel defect data, driving a deep transformation of tunnel operation and maintenance from simple digital recording to intelligent decision-making.
[0224] This application also provides an electronic device which, as shown in Figure 6, includes: a processor, and a memory coupled to the processor, the memory being used to store a computer program; the processor being used to execute the computer program stored in the memory, so that the electronic device performs the intelligent tunnel defect detection method based on UAV swarm collaboration as described in any of the above embodiments.
[0225] Electronic devices can be computing devices such as desktop computers, laptops, handheld computers, and cloud servers. These electronic devices may include, but are not limited to, processors and memory.
[0226] The processor can be a Central Processing Unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. A general-purpose processor can be a microprocessor or any conventional processor. The processor is the control center of the electronic device, connecting various parts of the device via various interfaces and lines.
[0227] The memory can be used to store the computer program, and the processor implements various functions of the electronic device by running or executing the computer program stored in the memory and calling the data stored in the memory.
[0228] The memory may primarily include a program storage area and a data storage area. The program storage area may store the operating system, applications required for at least one function, etc.; the data storage area may store data created through use of the electronic device, etc. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, memory, plug-in hard disk, smart media card (SMC), secure digital (SD) card, flash card, at least one disk storage device, flash memory device, or other non-volatile solid-state storage device.
[0229] This application also provides a computer-readable storage medium. A computer program is stored in the computer-readable storage medium, and when executed by a processor, it can implement the steps of the various method embodiments described above. The computer program includes computer program code, which can be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium can include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a portable hard drive, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, etc.
[0230] It should be understood that although the steps in the flowcharts of the accompanying figures are shown sequentially as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the accompanying figures may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same time, but can be executed at different times, and their execution order is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the sub-steps or stages of other steps.
Claims
1. A method for intelligent detection of tunnel defects based on UAV swarm collaboration, characterized in that the method comprises:
dividing the tunnel space to be inspected into multiple detection sub-regions and assigning them to corresponding UAV swarm groups, and generating a cooperative flight path with timing information based on a unified time reference;
controlling the UAV swarm to fly into the tunnel along the cooperative flight path, synchronously collecting two-dimensional images and three-dimensional point cloud data of the tunnel wall according to the unified time reference, and binding them with unique spatiotemporal markers;
transmitting the spatiotemporally marked two-dimensional images and three-dimensional point cloud data to edge computing nodes for preprocessing, then packaging them according to the spatiotemporal markers and uploading them to a cloud data processing center;
in the cloud data processing center, completing multi-source data registration based on the spatiotemporal markers, stitching the two-dimensional images to generate a panoramic tunnel image, fusing the three-dimensional point cloud data to generate complete three-dimensional point cloud data of the tunnel, and inputting both into a physical mechanism deep learning model for segmentation and recognition to obtain a defect segmentation map;
spatially registering the defect segmentation map with the complete three-dimensional point cloud data of the tunnel based on a unified spatiotemporal reference, and calculating the geometric parameters of each type of defect from the registered three-dimensional point cloud data;
generating a structured tunnel defect detection report based on the geometric parameters, type, and location of the defects and the UAV swarm detection data.
2. The intelligent detection method for tunnel defects based on UAV swarm collaboration according to claim 1, characterized in that, before dividing the tunnel space to be inspected into multiple detection sub-regions and assigning them to corresponding UAV swarm groups, the method further comprises: retrieving and parsing a global spatiotemporal memory built on global spatiotemporal markers, extracting historical detection data of the tunnel to be inspected, generating a heat map of defect recurrence probability, and pre-allocating UAV swarm detection tasks and pre-planning acquisition strategies based on the heat map, which specifically includes:
retrieving the global spatiotemporal memory of the cloud data processing center and, if historical inspection task records exist, using the tunnel mileage station number as the core index to extract the historical defect data, UAV swarm data, and data quality assessment data of all historical inspection tasks for the tunnel to be inspected;
performing spatiotemporal clustering analysis on the historical defect data to identify high-risk areas where defects continue to expand and low-risk areas with no historical defect records;
generating, from the spatiotemporal clustering analysis results, a heat map of defect recurrence probability matched to the three-dimensional model of the tunnel, with the grid coordinates of the heat map mapped one-to-one to the spatial coordinates of the spatiotemporal markers;
setting corresponding UAV swarm detection tasks and data acquisition strategies for high-risk areas, low-risk areas, and areas with no historical detection, respectively.
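The zone classification in claim 2 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the `StationHistory` record, the recurrence estimate (fraction of past inspections that recorded a defect), the extra `stable` zone for defects that are recorded but not expanding, and the `STRATEGY` values are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class StationHistory:
    """Hypothetical per-station record pulled from the global spatiotemporal memory."""
    inspections: int = 0  # past inspections that covered this station
    defect_sizes: list = field(default_factory=list)  # one size per inspection that found a defect, oldest first

def recurrence_probability(history):
    """Illustrative estimate: share of past inspections in which a defect was recorded."""
    if history.inspections == 0:
        return 0.0
    return len(history.defect_sizes) / history.inspections

def risk_zone(history):
    """Classify a station; 'stable' is an assumed extra class not named in the claim."""
    if history.inspections == 0:
        return "no_history"
    if not history.defect_sizes:
        return "low_risk"
    sizes = history.defect_sizes
    expanding = len(sizes) >= 2 and all(b > a for a, b in zip(sizes, sizes[1:]))
    return "high_risk" if expanding else "stable"

# Hypothetical acquisition strategies: number of passes and stand-off distance in metres.
STRATEGY = {
    "high_risk": {"passes": 3, "standoff_m": 1.5},
    "stable": {"passes": 2, "standoff_m": 2.0},
    "low_risk": {"passes": 1, "standoff_m": 3.0},
    "no_history": {"passes": 2, "standoff_m": 2.0},
}

def plan_tasks(histories):
    """Map each mileage station to its zone, heat-map value, and acquisition strategy."""
    plan = {}
    for station, history in histories.items():
        zone = risk_zone(history)
        plan[station] = {
            "zone": zone,
            "recurrence": recurrence_probability(history),
            "strategy": STRATEGY[zone],
        }
    return plan

histories = {
    "K1+000": StationHistory(inspections=3, defect_sizes=[2.0, 3.5, 5.0]),
    "K1+020": StationHistory(inspections=3),
    "K1+040": StationHistory(),
}
plan = plan_tasks(histories)
```

Keying the plan by mileage station number mirrors the claim's use of the station number as the core index.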
3. The intelligent detection method for tunnel defects based on UAV swarm collaboration according to claim 1, characterized in that dividing the tunnel space to be inspected into multiple detection sub-regions and assigning them to corresponding UAV swarm groups, and generating a cooperative flight path with timing information based on a unified time reference, includes:
defining the boundary of the flyable space in the tunnel based on tunnel clearance data;
dividing the tunnel space longitudinally into several continuous detection segments according to the performance parameters of the UAVs and the parameters of the onboard sensors, and dividing each detection segment transversely into an arch sub-region, a left wall sub-region, and a right wall sub-region;
dividing the UAV swarm into multiple operation groups based on the pre-allocation results of the UAV swarm detection tasks, and assigning the groups to different detection sub-regions so that the sensor fields of view achieve full coverage;
planning the acquisition distance and sensor attitude angle of each UAV based on the pre-planning results of the acquisition strategy and, in combination with the unified time reference, generating a cooperative flight path for each UAV that includes waypoint spatial coordinates, flight speed, arrival time, sensor synchronization trigger points, and corresponding timestamps;
presetting an emergency takeover plan in which, when any UAV malfunctions, the flight paths of nearby UAVs of the same type are dynamically adjusted according to the fault type and spatial location so that they take over the detection sub-region of the faulty UAV, and a cooperative flight path that meets the safety interval requirements is regenerated.
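A minimal Python sketch of the segment division and timed waypoints in claim 3 follows. The fixed segment length, the round-robin group assignment, constant flight speed, and evenly spaced trigger points are simplifying assumptions; the claim derives these from UAV performance, sensor parameters, and the pre-planned acquisition strategy.

```python
import math

SUBREGIONS = ("arch", "left_wall", "right_wall")

def split_segments(tunnel_length_m, segment_length_m):
    """Divide the tunnel longitudinally into continuous detection segments."""
    count = math.ceil(tunnel_length_m / segment_length_m)
    return [
        (i * segment_length_m, min((i + 1) * segment_length_m, tunnel_length_m))
        for i in range(count)
    ]

def assign_groups(segments, group_ids):
    """Assign operation groups to segments round-robin (an assumed policy),
    with one task per transverse sub-region of each segment."""
    tasks = []
    for index, segment in enumerate(segments):
        group = group_ids[index % len(group_ids)]
        for subregion in SUBREGIONS:
            tasks.append({"group": group, "segment": segment, "subregion": subregion})
    return tasks

def timed_waypoints(start_m, end_m, speed_mps, t0_s, trigger_spacing_m):
    """Return (station, arrival_time) pairs; every waypoint doubles as a sensor
    trigger point, timed against the shared reference time t0_s."""
    steps = int((end_m - start_m) // trigger_spacing_m)
    stations = [start_m + k * trigger_spacing_m for k in range(steps + 1)]
    if stations[-1] < end_m:
        stations.append(end_m)  # always trigger at the segment end
    return [(x, t0_s + (x - start_m) / speed_mps) for x in stations]

segments = split_segments(250, 100)
tasks = assign_groups(segments, ["G1", "G2"])
waypoints = timed_waypoints(0, 100, 2.0, 0.0, 25)
```

Because every arrival time is computed from the same `t0_s`, UAVs in different sub-regions of one segment would trigger their sensors at matching timestamps.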
4. The intelligent detection method for tunnel defects based on UAV swarm collaboration according to claim 1, characterized in that controlling the UAV swarm to fly into the tunnel along the cooperative flight path, synchronously collecting two-dimensional images and three-dimensional point cloud data of the tunnel wall according to the unified time reference, and binding them with unique spatiotemporal markers, includes:
controlling the UAV swarm to enter the tunnel along the cooperative flight path, with real-time communication among swarm members achieved through a local self-organizing network communication link;
synchronously triggering each UAV's optical camera and lidar, based on the sensor synchronization trigger points and corresponding timestamps in the cooperative flight path, to collect two-dimensional images and three-dimensional point cloud data of the tunnel wall;
binding each frame of two-dimensional image and each group of three-dimensional point cloud data with a unique spatiotemporal marker, the spatiotemporal marker including the UAV number, acquisition timestamp, three-dimensional acquisition coordinates, and sensor attitude angle, and the spatiotemporal marker rules fully matching the index rules of the global spatiotemporal memory;
during data acquisition, calibrating the extrinsic parameters of the optical camera and lidar in real time, and synchronously broadcasting to nearby UAVs the spatiotemporal markers corresponding to abnormal data acquisition.
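The spatiotemporal marker of claim 4 can be pictured as a small immutable record. The sketch below is only one possible layout: the field names, units, and the `key()` format are assumptions, since the source lists the marker's contents (UAV number, acquisition timestamp, three-dimensional coordinates, sensor attitude angle) but not its encoding.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SpatiotemporalMarker:
    """Hypothetical marker layout; frozen so it can serve as a dictionary key."""
    uav_id: str
    timestamp_ns: int    # acquisition time on the unified time reference
    position_m: tuple    # (x, y, z) acquisition coordinates
    attitude_deg: tuple  # (roll, pitch, yaw) of the sensor

    def key(self):
        """Assumed index key: one UAV cannot trigger twice at the same instant."""
        return f"{self.uav_id}:{self.timestamp_ns}"

def bind(marker, image_frame, point_cloud):
    """Attach one marker to the image and point cloud captured by the same trigger."""
    return {"marker": marker, "image": image_frame, "cloud": point_cloud}

marker = SpatiotemporalMarker("UAV-03", 1_000_000, (12.5, 0.8, 3.2), (0.0, 15.0, 90.0))
record = bind(marker, image_frame=b"jpeg bytes", point_cloud=[(12.6, 0.9, 3.3)])
index = {marker: record}  # markers are hashable, so they can index stored records
```

Sharing one marker between the image frame and the point cloud group from the same trigger is what later allows the two modalities to be paired without searching.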
5. The intelligent detection method for tunnel defects based on UAV swarm collaboration according to claim 1, characterized in that transmitting the spatiotemporally marked two-dimensional images and three-dimensional point cloud data to edge computing nodes for preprocessing, classifying and packaging them according to the spatiotemporal markers, and uploading them to the cloud data processing center, includes:
receiving, at the edge computing nodes, the spatiotemporally marked two-dimensional images and three-dimensional point cloud data transmitted by the UAV swarm, performing Gaussian denoising and data augmentation preprocessing on the two-dimensional images, and performing statistical filtering and noise reduction on the three-dimensional point cloud data;
when denoising a two-dimensional image, referring to the surface roughness information calculated from the three-dimensional point cloud data at the corresponding spatiotemporal marker position to adaptively adjust the denoising intensity; and when filtering the three-dimensional point cloud data, referring to the texture edge information extracted from the two-dimensional image at the corresponding spatiotemporal marker position;
monitoring, at the edge computing nodes, the wireless communication bandwidth within the tunnel and the processor load of the cloud data processing center in real time, and dynamically adjusting the preprocessing depth for the two-dimensional images and three-dimensional point cloud data;
packaging the preprocessed two-dimensional images and three-dimensional point cloud data according to the spatiotemporal markers, based on the preprocessing depth adjustment results, and uploading them to the cloud data processing center.
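The two adaptive rules in claim 5 can be sketched as follows. The source does not state which way either adjustment goes, so both directions are assumptions: rougher surfaces get weaker smoothing (to keep texture), and low bandwidth or high cloud load pushes more work to the edge. All thresholds and default values are hypothetical.

```python
def denoise_sigma(roughness_mm, base_sigma=1.5, min_sigma=0.5, ref_roughness_mm=2.0):
    """Gaussian denoising strength for an image patch, reduced as the surface
    roughness measured from the matching point cloud increases (assumed direction)."""
    sigma = base_sigma / (1.0 + roughness_mm / ref_roughness_mm)
    return max(min_sigma, sigma)

def preprocessing_depth(bandwidth_mbps, cloud_load, min_bandwidth_mbps=20.0, max_cloud_load=0.8):
    """Choose how much preprocessing runs at the edge node.

    Assumed policy: when the tunnel link is slow or the cloud processors are busy,
    preprocess deeply at the edge to shrink uploads and offload the cloud;
    otherwise do only light preprocessing and upload quickly."""
    if bandwidth_mbps < min_bandwidth_mbps or cloud_load > max_cloud_load:
        return "deep"
    return "light"

smooth_wall = denoise_sigma(0.0)     # no roughness: full base strength
rough_wall = denoise_sigma(2.0)      # reference roughness: half strength
very_rough = denoise_sigma(100.0)    # clamped to the minimum
slow_link = preprocessing_depth(5.0, 0.3)
fast_link = preprocessing_depth(50.0, 0.3)
busy_cloud = preprocessing_depth(50.0, 0.95)
```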
6. The intelligent detection method for tunnel defects based on UAV swarm collaboration according to claim 1, characterized in that, in the cloud data processing center, completing multi-source data registration based on the spatiotemporal markers, stitching the two-dimensional images to generate a panoramic tunnel image, fusing the three-dimensional point cloud data to generate complete three-dimensional point cloud data of the tunnel, and inputting both into the physical mechanism deep learning model for segmentation and recognition to obtain the defect segmentation map, includes:
completing coarse registration of the multi-camera images based on the spatiotemporal markers, and performing fine registration and panoramic stitching on image sequences with overlapping areas to generate the panoramic tunnel image;
completing global coarse registration of the multi-UAV point clouds based on the spatiotemporal markers, performing fine registration on the point cloud data of adjacent UAVs, and fusing the registered data to generate a complete three-dimensional point cloud model of the tunnel;
splitting the encoder front end of the original semantic segmentation model into an image feature branch and a point cloud feature branch arranged in parallel, thereby constructing a physical mechanism deep learning model with a parallel dual-branch network structure;
wherein the image feature branch takes the panoramic tunnel image as input and outputs an optimized texture feature map through multi-scale texture feature extraction and an embedded spatial attention module;
wherein the point cloud feature branch takes the complete three-dimensional point cloud data of the tunnel as input, projects the three-dimensional point cloud data onto the image coordinate system to generate a depth map and a normal vector map of the same size, extracts geometric features using a convolution-pooling structure symmetric to that of the image feature branch to obtain three-dimensional feature maps at each level, and then generates, by inverse projection from three dimensions to two dimensions, a two-dimensional geometric feature map with the same dimensions as that of the image feature branch;
wherein an adaptive gated fusion unit is arranged at the end of the encoder to perform channel-weighted fusion of the optimized texture feature map and the two-dimensional geometric feature map, outputting a multimodal fused feature with both texture and geometric discriminative power as the initial input tensor of the decoder, and a total loss function containing a tunnel morphological residual is used as the loss function of the physical mechanism deep learning model;
obtaining the generated defect segmentation map, and matching the corresponding tunnel segment entries in the global spatiotemporal memory according to the spatiotemporal marker range of the covered area, so as to archive the defect segmentation map.
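The adaptive gated fusion unit in claim 6 is described only as a channel-weighted fusion of the texture and geometric feature maps. One common way to realise such a gate, shown here as a dependency-free Python sketch, is a per-channel sigmoid gate driven by the channel means. The gate form and the parameters `w_tex`, `w_geo`, and `bias` are assumptions; in a real network they would be learned and the maps would be tensors.

```python
import math

def channel_mean(channel):
    """Global average of one H x W channel given as nested lists."""
    values = [v for row in channel for v in row]
    return sum(values) / len(values)

def gated_fusion(texture, geometry, w_tex, w_geo, bias):
    """Fuse two C x H x W feature maps channel by channel.

    For each channel c, a scalar gate in (0, 1) is computed from the channel means,
    and the output is gate * texture + (1 - gate) * geometry."""
    fused = []
    for c, (tex_c, geo_c) in enumerate(zip(texture, geometry)):
        logit = w_tex[c] * channel_mean(tex_c) + w_geo[c] * channel_mean(geo_c) + bias[c]
        gate = 1.0 / (1.0 + math.exp(-logit))
        fused.append([
            [gate * t + (1.0 - gate) * g for t, g in zip(t_row, g_row)]
            for t_row, g_row in zip(tex_c, geo_c)
        ])
    return fused

texture = [[[2.0, 2.0]]]   # one channel, 1 x 2
geometry = [[[4.0, 4.0]]]
balanced = gated_fusion(texture, geometry, w_tex=[0.0], w_geo=[0.0], bias=[0.0])
texture_led = gated_fusion(texture, geometry, w_tex=[0.0], w_geo=[0.0], bias=[50.0])
```

With zero weights the gate is 0.5 and the two modalities are averaged; a large positive bias drives the gate toward 1 so the texture branch dominates that channel.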
7. The intelligent detection method for tunnel defects based on UAV swarm collaboration according to claim 6, characterized in that the image feature branch taking the panoramic tunnel image as input and outputting an optimized texture feature map through multi-scale texture feature extraction and an embedded spatial attention module includes:
adopting a cascaded convolution-pooling structure corresponding to the encoder of the original semantic segmentation model to extract multi-scale texture features and obtain two-dimensional feature maps at each level;
embedding, at the skip connection points of each layer in the image feature branch, a spatial attention module based on the physical prior of the tunnel structure, and generating, based on the physical prior of the tunnel structure and the historical defect data stored in the global spatiotemporal memory, a binarized defect high-incidence probability map of the same size as the input image;
convolving the defect high-incidence probability map with a standardized Gaussian kernel and then normalizing it to generate an attention weight map;
multiplying the attention weight map element-wise with the two-dimensional feature map of the corresponding level to obtain the optimized texture feature map.
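The last two steps of claim 7 (Gaussian convolution of the binarized map, normalization, element-wise multiplication) can be traced in a small pure-Python sketch. The kernel radius, sigma, zero padding at the borders, and normalization by the peak value are assumptions; the source does not specify them.

```python
import math

def gaussian_kernel(radius, sigma):
    """1D Gaussian kernel whose weights sum to 1."""
    raw = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(raw)
    return [v / total for v in raw]

def blur_rows(grid, kernel):
    """Convolve every row with the kernel, treating out-of-range cells as zero."""
    radius = len(kernel) // 2
    blurred = []
    for row in grid:
        n = len(row)
        blurred.append([
            sum(kernel[j + radius] * row[i + j] for j in range(-radius, radius + 1) if 0 <= i + j < n)
            for i in range(n)
        ])
    return blurred

def transpose(grid):
    return [list(column) for column in zip(*grid)]

def attention_weights(binary_map, radius=1, sigma=1.0):
    """Separable 2D Gaussian blur followed by peak normalization into [0, 1]."""
    kernel = gaussian_kernel(radius, sigma)
    blurred = transpose(blur_rows(transpose(blur_rows(binary_map, kernel)), kernel))
    peak = max(max(row) for row in blurred)
    if peak == 0:
        return [[0.0 for _ in row] for row in blurred]
    return [[v / peak for v in row] for row in blurred]

def apply_attention(feature_map, weights):
    """Element-wise product of one feature level with the attention weight map."""
    return [[f * w for f, w in zip(f_row, w_row)] for f_row, w_row in zip(feature_map, weights)]

binary_map = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
weights = attention_weights(binary_map)
features = apply_attention([[2.0] * 3 for _ in range(3)], weights)
```

Blurring the binary map spreads attention smoothly around each high-incidence cell, so features just outside a marked region are attenuated rather than cut off.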
8. The intelligent detection method for tunnel defects based on UAV swarm collaboration according to claim 7, characterized in that generating, based on the physical prior of the tunnel structure and the historical defect data stored in the global spatiotemporal memory, a binarized defect high-incidence probability map of the same size as the input image includes:
generating a basic probability map based on the tunnel design parameters and the historical defect data stored in the global spatiotemporal memory, and inputting the panoramic tunnel image stitched from the current UAV swarm detection into a lightweight defect screening network to obtain an initial defect distribution heat map;
performing weighted fusion of the basic probability map and the initial defect distribution heat map to generate a dynamic defect high-incidence probability map specific to the current UAV swarm detection mission;
adding the defect high-incidence probability map generated in each detection, together with the corresponding defect annotation data, to the training dataset for incremental iterative updating of the physical mechanism deep learning model.
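The weighted fusion in claim 8 amounts to a per-cell blend of the two maps, followed by the binarization that claim 7 expects. In this sketch the fusion weight `alpha` and the binarization threshold are hypothetical values, not taken from the source.

```python
def fuse_probability_maps(base_map, heat_map, alpha=0.5):
    """Per-cell weighted fusion: alpha * base prior + (1 - alpha) * screening heat map."""
    return [
        [alpha * b + (1.0 - alpha) * h for b, h in zip(b_row, h_row)]
        for b_row, h_row in zip(base_map, heat_map)
    ]

def binarize(prob_map, threshold=0.5):
    """Mark cells whose fused probability reaches the (assumed) threshold."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]

base_map = [[0.2, 0.8], [0.1, 0.9]]   # from design parameters and historical defects
heat_map = [[0.6, 0.4], [0.1, 0.7]]   # from the lightweight screening network
fused = fuse_probability_maps(base_map, heat_map)
high_incidence = binarize(fused)
```

Raising `alpha` makes the mission-specific map lean on the historical prior; lowering it favours what the screening network sees in the current panorama.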
9. The intelligent detection method for tunnel defects based on UAV swarm collaboration according to claim 1, characterized in that spatially registering the defect segmentation map with the complete three-dimensional point cloud data of the tunnel based on a unified spatiotemporal reference, and calculating the geometric parameters of each type of defect from the registered three-dimensional point cloud data, includes:
if the defect is a crack, extracting the crack skeleton line from the registered complete three-dimensional point cloud data of the tunnel, and calculating the actual width, length, and direction of the crack;
if the defect is leakage or peeling, calculating the actual area and perimeter of the leakage or peeling;
calculating the expansion rate and volume change of the defect from the geometric parameters of the same tunnel obtained over consecutive preset time periods;
using the spatiotemporal marker corresponding to the area where the defect is located as an index, matching the corresponding entries in the global spatiotemporal memory and synchronously updating the defect type, spatial location, and detection time.
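The geometric quantities named in claim 9 reduce to elementary computations once the skeleton line or boundary has been extracted. The sketch below assumes the crack skeleton is an ordered list of 3D points, crack width is sampled as paired edge points, and a leakage or peeling boundary has been flattened to a 2D polygon; those input forms, and the linear expansion rate, are assumptions.

```python
import math

def crack_length(skeleton):
    """Sum of straight-line distances between consecutive skeleton points."""
    return sum(math.dist(a, b) for a, b in zip(skeleton, skeleton[1:]))

def crack_direction_deg(skeleton):
    """Heading of the end-to-end vector in the x-y plane, in degrees."""
    (x0, y0, _), (x1, y1, _) = skeleton[0], skeleton[-1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0))

def mean_width(edge_pairs):
    """Average distance between paired points on the two crack edges."""
    return sum(math.dist(left, right) for left, right in edge_pairs) / len(edge_pairs)

def polygon_area(vertices):
    """Shoelace formula for a simple 2D polygon."""
    closed = vertices + vertices[:1]
    twice_area = sum(x0 * y1 - x1 * y0 for (x0, y0), (x1, y1) in zip(closed, closed[1:]))
    return abs(twice_area) / 2.0

def polygon_perimeter(vertices):
    closed = vertices + vertices[:1]
    return sum(math.dist(a, b) for a, b in zip(closed, closed[1:]))

def expansion_rate(earlier_value, later_value, days_between):
    """Average change per day between two inspections."""
    return (later_value - earlier_value) / days_between

skeleton = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
length = crack_length(skeleton)
direction = crack_direction_deg(skeleton)
width = mean_width([((0.0, 0.0, 0.0), (0.0, 0.002, 0.0)), ((1.0, 0.0, 0.0), (1.0, 0.004, 0.0))])
patch = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
area = polygon_area(patch)
perimeter = polygon_perimeter(patch)
rate = expansion_rate(10.0, 13.0, 30)
```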
10. A tunnel defect intelligent detection system based on UAV swarm collaboration, used to implement the tunnel defect intelligent detection method based on UAV swarm collaboration as described in any one of claims 1-9, characterized in that the system comprises:
a region division module, used to divide the tunnel space to be inspected into multiple detection sub-regions and assign them to corresponding UAV swarm groups, and to generate a cooperative flight path with timing information based on a unified time reference;
a data acquisition module, used to control the UAV swarm to fly into the tunnel along the cooperative flight path, synchronously collect two-dimensional images and three-dimensional point cloud data of the tunnel wall according to the unified time reference, and bind them with unique spatiotemporal markers;
a data processing module, used to transmit the spatiotemporally marked two-dimensional images and three-dimensional point cloud data to edge computing nodes for preprocessing, and then classify and package them according to the spatiotemporal markers before uploading them to the cloud data processing center;
an identification module, used in the cloud data processing center to complete multi-source data registration based on the spatiotemporal markers, stitch the two-dimensional images to generate a panoramic tunnel image, fuse the three-dimensional point cloud data to generate complete three-dimensional point cloud data of the tunnel, and input both into the physical mechanism deep learning model for segmentation and recognition to obtain the defect segmentation map;
a quantification module, used to spatially register the defect segmentation map with the complete three-dimensional point cloud data of the tunnel based on a unified spatiotemporal reference, and to calculate the geometric parameters of each type of defect from the registered three-dimensional point cloud data;
a reporting module, used to generate a structured tunnel defect detection report based on the geometric parameters, type, and location of the defects and the UAV swarm detection data.