A floor crack risk assessment method and system based on machine vision

By integrating visible light, infrared thermal imaging, and 3D point cloud data into a machine vision method, the problems of low efficiency in detecting floor cracks and inaccurate risk assessment have been solved, enabling accurate crack identification and risk assessment, and improving the scientific nature of floor safety management.

CN122265752A · Pending · Publication Date: 2026-06-23 · BEIJING XIDA CONSTR SUPERVISION CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
BEIJING XIDA CONSTR SUPERVISION CO LTD
Filing Date
2026-03-20
Publication Date
2026-06-23


Abstract

The application provides a floor crack risk assessment method and system based on machine vision, and relates to the technical field of machine vision and structural health monitoring. The method comprises: acquiring image data of a target floor area, performing time and space registration, obtaining registered image data, generating an infrared anomaly confidence map based on the registered infrared thermal image; performing feature fusion on the infrared anomaly confidence map, the visible light image and the three-dimensional point cloud data to generate a fused feature map; based on the fused feature map, identifying and segmenting independent crack instance regions through image segmentation processing; based on each crack instance region, performing feature extraction from the registered image data to obtain a multi-dimensional feature set; inputting the multi-dimensional feature set into a pre-set classification model to obtain a mechanism category of the crack corresponding to the crack instance region; and calculating a risk index based on the mechanism category, the multi-dimensional feature set and environmental data, and mapping the risk index to a risk level.

Description

Technical Field

[0001] This application relates to the fields of machine vision and structural health monitoring technology, and in particular to a machine vision-based method and system for assessing the risk of floor cracks.

Background Technology

[0002] Flooring is widely used in industrial plants, warehousing and logistics centers, and its structural integrity is directly related to production and operational safety and traffic flow. Currently, floor crack detection suffers from low efficiency, strong subjectivity, and a high rate of missed detections, making it difficult to meet the needs of projects for refined risk management of floor cracks, and it cannot accurately identify the crack formation mechanism or assess the risk level.

[0003] Therefore, there is an urgent need for a machine vision-based method and system for assessing the risk of floor cracks.

Summary of the Invention

[0004] To address the aforementioned technical problems, this application provides a machine vision-based method and system for assessing the risk of floor cracks.

[0005] A first aspect of this application provides a machine vision-based method for assessing the risk of floor cracks, including:
acquiring image data of the target floor area and performing temporal and spatial registration to obtain registered image data, which includes visible light images, infrared thermal images, and 3D point cloud data;
generating an infrared anomaly confidence map based on the registered infrared thermal image;
fusing the infrared anomaly confidence map, the visible light image, and the 3D point cloud data to generate a fused feature map;
identifying and segmenting independent crack instance regions from the fused feature map through image segmentation processing;
extracting features from the registered image data for each crack instance region to obtain a multi-dimensional feature set, which includes morphological texture features, three-dimensional geometric features, and physical field features;
inputting the multi-dimensional feature set into a preset classification model to obtain the mechanism category of the crack corresponding to the crack instance region; and
calculating a risk index based on the mechanism category, the multi-dimensional feature set, and environmental data, and mapping the risk index to a risk level.

[0006] A second aspect of this application provides a machine vision-based floor crack risk assessment system, comprising:
a data acquisition module, used to acquire image data of the target floor area and perform temporal and spatial registration to obtain registered image data, which includes visible light images, infrared thermal images, and 3D point cloud data;
an image generation module, used to generate an infrared anomaly confidence map based on the registered infrared thermal image;
an image fusion module, used to fuse the infrared anomaly confidence map, the visible light image, and the 3D point cloud data to generate a fused feature map;
an image segmentation module, used to identify and segment independent crack instance regions from the fused feature map through image segmentation processing;
a feature extraction module, used to extract features from the registered image data for each crack instance region to obtain a multi-dimensional feature set, which includes morphological texture features, three-dimensional geometric features, and physical field features;
a feature classification module, used to input the multi-dimensional feature set into a preset classification model to obtain the mechanism category of the crack corresponding to the crack instance region; and
a risk identification module, used to calculate a risk index based on the mechanism category, the multi-dimensional feature set, and environmental data, and to map the risk index to a risk level.

[0007] A third aspect of this application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and running on the processor, wherein the processor executes the computer program to implement the steps of the above-described machine vision-based method for assessing the risk of floor cracks.

[0008] A fourth aspect of this application provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the above-described machine vision-based method for assessing the risk of floor cracks.

[0009] The beneficial effects of the machine vision-based floor crack risk assessment method and system provided in this application are as follows. First, the application integrates multi-source data from visible light imaging, infrared thermal imaging, and 3D point clouds, overcoming the limitations of a single data source. Temporal and spatial registration keeps the data consistent, so that the appearance, 3D morphology, and physical field anomalies of cracks are all captured. Generating an infrared anomaly confidence map and applying a feature fusion strategy improves the recognition accuracy of crack areas, distinguishes cracks from interfering targets such as stains and scratches, and allows independent crack instances to be segmented accurately. Second, extracting a multi-dimensional feature set and passing it to a classification model determines the crack mechanism category, which identifies the root cause of crack formation. Finally, the mechanism category, the multi-dimensional features, and environmental data are considered together to calculate a risk index and map it to a risk level. This provides a quantitative basis for floor safety management and helps reduce the probability of safety accidents.

Attached Figure Description

[0010] Figure 1 is a schematic flowchart of a machine vision-based risk assessment method for floor cracks provided in an embodiment of this application; Figure 2 is a structural block diagram of a machine vision-based floor crack risk assessment system provided in an embodiment of this application; Figure 3 is a schematic block diagram of an electronic device provided in an embodiment of this application.

[0011] Figure 4 is a schematic diagram of the overall equipment of a machine vision-based floor crack risk assessment system provided in an embodiment of this application.

Detailed Implementation

[0012] In the following description, specific details such as particular system architectures and techniques are set forth for illustrative purposes and not for limitation, in order to provide a thorough understanding of the embodiments of this application. However, those skilled in the art will understand that this application may also be implemented in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, apparatuses, circuits, and methods have been omitted so as not to obscure the description of this application with unnecessary detail.

[0013] To make the purpose, technical solution, and advantages of this application clearer, the application is described below with reference to Figures 1-3 and specific embodiments.

[0014] Please refer to Figure 1, which is a flowchart illustrating a machine vision-based method for assessing the risk of floor cracks according to an embodiment of this application. The method includes: S101: Acquire image data of the target floor area and perform temporal and spatial registration to obtain registered image data, including: visible light image, infrared thermal image and three-dimensional point cloud data.

[0015] In this embodiment, image data refers to visual, thermal, and spatial geometric information about the target floor area acquired through various sensors. Visible light images are captured in the visible spectrum using a standard optical camera and are used to characterize the color, texture, and macroscopic morphology of the floor surface. Infrared thermal images are captured using an infrared thermal imager and characterize the temperature distribution of the floor surface, used to detect areas of abnormal temperature. Three-dimensional point cloud data is acquired using technologies such as LiDAR or structured light scanning to describe the three-dimensional geometric structure of the floor surface.

[0016] After data acquisition, an automatic registration algorithm performs spatiotemporal alignment. For the two-dimensional spatial registration of visible light images and infrared thermal images, a deep-feature-matching network is used to align images of different modalities with high precision; the pre-trained model is LoFTR-DS, the input image resolution is adjusted to 1024×1024, and the matching threshold is set to 0.2. For the registration of three-dimensional point clouds and two-dimensional images, a method based on a spatial calibration object is used: a unified world coordinate system is established in advance, and coordinate mapping is achieved through a calibration board or known three-dimensional feature points. The calibration board is a 30 cm × 30 cm checkerboard, and corners are detected with the findChessboardCorners function in OpenCV. Temporal registration is achieved through the built-in synchronization clock of the acquisition device or an external hardware trigger signal, so that data acquired at the same moment are strictly aligned in time. The result is the registered image data.

[0017] S102: Generate an infrared anomaly confidence map based on the registered infrared thermal image.

[0018] In this embodiment, the infrared anomaly confidence map refers to a two-dimensional map obtained based on infrared thermal image analysis that represents the degree of temperature anomaly on the ground surface, where each pixel value represents the probability or intensity of temperature anomaly in that area.

[0019] Specifically, multi-scale spatial analysis is performed on the registered infrared thermal image to extract temperature gradient maps at different scales. An adaptive temperature difference threshold is set based on the temperature statistical distribution of the entire infrared thermal image (e.g., histogram analysis) to identify candidate anomalous thermal regions. Based on the statistical characteristics of pixels outside the candidate anomalous thermal regions, a Gaussian mixture model is used to establish a background temperature distribution reference model. The statistical deviation between the temperature value of each candidate anomalous thermal region and the background reference model is calculated as the temperature anomaly confidence for that region. If multi-time-series infrared data is available, the active and passive thermal components can be decoupled using Kalman filtering or a heat conduction model to correct the confidence. If 3D point cloud data is available, the authenticity of the anomalous region can be verified through geometric features (curvature, normal vector divergence), and the confidence can be re-weighted accordingly. The temperature anomaly confidence is then fused with the multi-scale temperature gradient maps by weighted summation to generate the infrared anomaly confidence map.

[0020] S103: Perform feature fusion on the infrared anomaly confidence map, visible light image and 3D point cloud data to generate a fused feature map.

[0021] In this embodiment, the fused feature map refers to a comprehensive feature representation generated by integrating features from multiple modalities such as visible light images, infrared anomaly confidence maps, and three-dimensional point cloud data. It includes richer and more comprehensive information on ground cracks.

[0022] This application employs a multi-level feature fusion strategy. Edge (Canny operator) and texture (Gabor filter) feature maps are extracted from visible light images; gradient feature maps are extracted from infrared anomaly confidence maps; and height maps and surface normal vector maps are extracted from 3D point cloud data. These feature maps are concatenated along the channel dimension and input into a shallow convolutional network (2 layers of 3×3 convolutions, each layer containing BN and ReLU) for initial fusion, yielding the first fused feature map. The visible light image, infrared anomaly confidence map, and depth map generated from the 3D point cloud projection are respectively input into a weight-sharing encoder network (Transformer encoder) to extract three sets of semantically informative intermediate features. A cross-attention module (multi-head attention mechanism) interacts with these three sets of intermediate features, dynamically learning the dependencies between different modalities to generate the intermediate fused feature map. The first fused feature map and the intermediate fused feature map are subjected to preliminary binary segmentation to obtain the first initial segmentation mask and the second initial segmentation mask. The two are then subjected to a logical AND operation to obtain a high-confidence crack core region mask. This mask is used as a spatial attention weight to perform element-wise multiplication on the intermediate fused feature map to enhance the crack region features and suppress the background, thereby generating the final fused feature map.
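
The last stage of the fusion described above (logical AND of the two initial masks, then element-wise weighting of the intermediate fused feature map) can be sketched as follows. This is a minimal illustration only: the function name, the array shapes (channels, height, width), and the use of a hard binary mask as the attention weight are assumptions, not details given in the application.

```python
import numpy as np

def enhance_with_core_mask(first_mask, second_mask, intermediate_features):
    """Combine two binary masks with logical AND and use the result as a spatial weight."""
    core_mask = np.logical_and(first_mask, second_mask)
    # Broadcast the (H, W) mask over the channel axis of the (C, H, W) feature map.
    enhanced = intermediate_features * core_mask[np.newaxis, :, :]
    return core_mask, enhanced

# Hypothetical toy inputs: two 2x3 initial segmentation masks and a 2-channel feature map.
first = np.array([[1, 1, 0], [0, 1, 0]], dtype=bool)
second = np.array([[1, 0, 0], [0, 1, 1]], dtype=bool)
features = np.ones((2, 2, 3))
core, enhanced = enhance_with_core_mask(first, second, features)
```

A hard mask zeroes the background completely; a softer weighting (for example, adding a small constant to the mask) would suppress rather than remove background features, and the application does not say which variant is intended.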

[0023] S104: Based on the fused feature map, independent crack instance regions are identified and segmented through image segmentation processing.

[0024] In this embodiment, the crack instance region refers to the crack region with independent boundaries and attributes that is identified and separated from the fused feature map by image segmentation technology.

[0025] Specifically, the fused feature map is input into a pre-trained semantic segmentation network to obtain a crack confidence heatmap. The network adopts a U-Net structure: the encoder uses a pre-trained ResNet-34 model, the decoder uses transposed convolutions, and skip connections fuse multi-scale features. The loss function is a weighted sum of Dice loss and cross-entropy loss, with weights of 0.7 and 0.3, respectively. The optimizer is Adam with an initial learning rate of 0.001, and training runs for 100 epochs. A confidence threshold is determined from the statistical distribution of the heatmap (Otsu algorithm), and pixels whose confidence exceeds the threshold are initially identified as crack pixels. Connectivity analysis is performed on the crack pixels to form multiple initial seed point sets. For each seed point set, adaptive region growing is performed in the feature space, considering both feature similarity (Euclidean distance) and spatial adjacency. During region growing, the pixels on the growing boundary are mapped to the 3D point cloud data. If the absolute value of the Gaussian curvature at a point exceeds a threshold, the point is treated as a geometric boundary and growing in that direction stops. Each connected region that remains when growing has stopped is a separate crack instance region.
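
The thresholding and connectivity-analysis steps above can be sketched as follows. This is an illustrative implementation under stated assumptions: the heatmap values are assumed to lie in [0, 1], 4-connectivity is assumed, and the function names are hypothetical. The region-growing stage with the curvature stop condition is not shown.

```python
from collections import deque
import numpy as np

def otsu_threshold(heatmap, bins=256):
    """Pick the threshold that maximizes between-class variance (Otsu's method)."""
    hist, edges = np.histogram(heatmap, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(p)              # weight of the class at or below each bin
    w1 = 1.0 - w0                  # weight of the class above each bin
    mu = np.cumsum(p * centers)    # cumulative mean
    with np.errstate(divide="ignore", invalid="ignore"):
        between = (mu[-1] * w0 - mu) ** 2 / (w0 * w1)
    between = np.nan_to_num(between)
    return edges[np.argmax(between) + 1]

def label_components(mask):
    """Label 4-connected regions of a boolean mask with a breadth-first flood fill."""
    h, w = mask.shape
    labels = np.zeros((h, w), dtype=int)
    count = 0
    for r in range(h):
        for c in range(w):
            if mask[r, c] and labels[r, c] == 0:
                count += 1
                labels[r, c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and labels[ny, nx] == 0:
                            labels[ny, nx] = count
                            queue.append((ny, nx))
    return labels, count

# Hypothetical heatmap with two separate high-confidence blobs.
heat = np.full((6, 6), 0.1)
heat[1, 1:3] = 0.9
heat[4, 4] = 0.9
threshold = otsu_threshold(heat)
labels, n_seeds = label_components(heat > threshold)
```

Each labelled component would then serve as one seed point set for the region-growing stage.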

[0026] S105: Based on each crack instance region, feature extraction is performed from the registered image data to obtain a multi-dimensional feature set; the multi-dimensional feature set includes: morphological texture features, three-dimensional geometric features, and physical field features.

[0027] In this embodiment, the multi-dimensional feature set refers to a collection of various types of features extracted from the crack instance region to describe the crack characteristics. Morphological texture features are extracted from the registered visible light image, such as length, width, aspect ratio, area, perimeter, edge curvature, and texture descriptors like contrast, energy, and entropy calculated based on the gray-level co-occurrence matrix (GLCM). Three-dimensional geometric features are extracted from the registered three-dimensional point cloud data, such as the crack's average depth, maximum depth, profile curvature, cross-sectional area, three-dimensional spatial orientation, and surface roughness (calculated through changes in local point cloud normal vectors). Physical field features are extracted from the registered infrared thermal image and infrared anomaly confidence map, such as the crack region's average temperature, temperature gradient magnitude, temperature standard deviation, and crack activity indices obtained from time-series analysis (e.g., the average power of the active heat source component).
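
As an illustration of the GLCM-based texture descriptors named above, the sketch below computes contrast, energy, and entropy for a single pixel offset. It assumes an 8-bit grayscale patch, a symmetric normalized co-occurrence matrix, and non-negative offsets; the quantization level count and the function name are hypothetical choices. "Energy" here is the sum of squared matrix entries; some libraries define it as the square root of that sum.

```python
import numpy as np

def glcm_features(gray, levels=8, offset=(0, 1)):
    """Contrast, energy, and entropy from a symmetric, normalized co-occurrence matrix."""
    img = np.asarray(gray, dtype=float)
    # Quantize 8-bit intensities (0..255) into `levels` gray levels.
    q = np.clip(np.floor(img / 256.0 * levels).astype(int), 0, levels - 1)
    dr, dc = offset  # non-negative row and column offset between paired pixels
    h, w = q.shape
    src = q[: h - dr, : w - dc]
    dst = q[dr:, dc:]
    glcm = np.zeros((levels, levels))
    np.add.at(glcm, (src.ravel(), dst.ravel()), 1)
    glcm = glcm + glcm.T           # count each pair in both directions
    p = glcm / glcm.sum()
    i, j = np.indices(p.shape)
    nonzero = p > 0
    return {
        "contrast": float(np.sum(p * (i - j) ** 2)),
        "energy": float(np.sum(p ** 2)),
        "entropy": float(-np.sum(p[nonzero] * np.log2(p[nonzero]))),
    }

# A flat patch has no texture; alternating columns give maximal two-level texture.
flat = glcm_features(np.full((4, 4), 100))
stripes = glcm_features(np.tile([0, 255], (4, 2)), levels=2)
```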

[0028] S106: Input the multi-dimensional feature set into the preset classification model to obtain the mechanism category of the crack corresponding to the crack instance region.

[0029] In this embodiment, the preset classification model is a machine learning model based on Support Vector Machine (SVM) that has undergone a complete training process. The mechanism category refers to the main cause or type of crack formation, including but not limited to shrinkage cracks, load cracks, temperature cracks, and settlement cracks.

[0030] The classification model is constructed and trained in the following steps. Step 1: Building the training dataset. Historical cases of floor cracks are collected to form an initial sample library. Each case includes registered visible light images, infrared thermal images, 3D point cloud data, and a crack mechanism category label (i.e., ground truth) determined by structural engineering experts based on crack morphology, the environment in which the crack formed, and load history.

[0031] For each case, following step S105 (feature extraction method based on crack instance region), a 20-dimensional feature vector is extracted from its image data. This feature vector integrates morphological texture features, three-dimensional geometric features, and physical field features. Thus, a "feature vector-mechanism category label" data pair is generated for each case. All 5000 such data pairs are randomly divided into training, validation, and test sets in a 7:2:1 ratio.
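
The 7:2:1 random split can be sketched as below. The function name and the fixed seed are assumptions added for reproducibility; the application does not say how the shuffle is seeded or whether the split is stratified by mechanism category.

```python
import random

def split_dataset(pairs, ratios=(7, 2, 1), seed=0):
    """Shuffle (feature_vector, label) pairs and split them by the given ratios."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    total = sum(ratios)
    n_train = len(shuffled) * ratios[0] // total
    n_val = len(shuffled) * ratios[1] // total
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]   # remainder goes to the test set
    return train, val, test

# Placeholder sample ids stand in for the 5000 feature-vector/label pairs.
train, val, test = split_dataset(range(5000))
```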

[0032] Step 2: Model training and optimization. A support vector machine is chosen as the classifier, with the radial basis function (RBF) as the kernel.

[0033] The model parameters are initialized, including the regularization parameter C and the kernel coefficient gamma. Their initial values are set empirically and optimized through subsequent cross-validation.

[0034] The model is trained on the training set. Training uses the sequential minimal optimization (SMO) algorithm to solve the margin-maximization problem, whose objective is a constrained optimization function based on hinge loss.

[0035] During training, a validation set was used to monitor model performance. A grid search was used to tune the hyperparameters within a predefined parameter space (C in {0.1, 1, 10}, gamma in {'scale', 'auto'}), with validation set classification accuracy as the selection metric. After optimization, the final model parameters were: regularization parameter C = 1.0 and kernel coefficient gamma = 'scale'. To improve the model's generalization ability, data augmentation was applied during the training phase: the original image data corresponding to each training feature vector was randomly rotated (±10 degrees), scaled (0.9-1.1), and brightness-adjusted (±10%), and the features were then re-extracted from the augmented data.
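
The gamma options 'scale' and 'auto' match the naming used by scikit-learn's SVC, although the application does not name a library. Assuming that convention, the sketch below shows what the two options resolve to, how the RBF kernel uses the result, and the six parameter combinations the grid search would visit. The helper names are hypothetical.

```python
import numpy as np

def resolve_gamma(gamma, X):
    """Resolve 'scale' or 'auto' to a number, following scikit-learn's convention (assumed)."""
    X = np.asarray(X, dtype=float)
    n_features = X.shape[1]
    if gamma == "scale":
        return 1.0 / (n_features * X.var())
    if gamma == "auto":
        return 1.0 / n_features
    return float(gamma)

def rbf_kernel(a, b, gamma):
    """RBF kernel value exp(-gamma * ||a - b||^2) for two feature vectors."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

# The search space stated in the text: 3 values of C times 2 values of gamma.
param_grid = [(C, gamma) for C in (0.1, 1, 10) for gamma in ("scale", "auto")]

# Toy 20-dimensional feature matrix whose overall variance is exactly 1.
X_toy = np.array([[0.0] * 20, [2.0] * 20])
gamma_scale = resolve_gamma("scale", X_toy)
gamma_auto = resolve_gamma("auto", X_toy)
```

With 20-dimensional features, 'auto' always gives 0.05, while 'scale' also depends on the variance of the training features, which is why feature normalization affects the two options differently.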

[0036] An early stopping mechanism is implemented to prevent overfitting. When the validation set accuracy no longer improves for 10 consecutive training iterations, training is stopped, and the model parameters are rolled back to the state where the validation set performance was optimal.

[0037] Step 3: Model evaluation. The trained model was evaluated on an independent test set. Evaluation metrics included overall accuracy, precision, recall, and F1 score. The model achieved an average classification accuracy of over 95% for crack mechanism categories, indicating reliable classification ability.

[0038] During the evaluation phase, the 20-dimensional multi-dimensional feature set extracted from the crack instance region to be evaluated according to step S105 is input into the fully trained classification model. The model outputs the probability distribution of the feature vector belonging to each preset mechanism category through forward computation, and takes the category with the highest probability as the final mechanism category determination result for the crack instance.

[0039] S107: Based on the mechanism category, the multi-dimensional feature set, and environmental data, a risk index is calculated and mapped to a risk level.

[0040] In this embodiment, the risk index refers to a quantitative value calculated by comprehensively analyzing factors such as the mechanism type of cracks, multi-dimensional feature sets, and environmental data. It characterizes the potential hazard of cracks to the integrity and safety of the floor structure. The risk level refers to classifying the risk level of cracks into different levels, such as low risk, medium risk, and high risk, based on the numerical range of the risk index, in order to facilitate differentiated management and maintenance.

[0041] Specifically, based on the crack mechanism category, different weights are assigned to each feature in the multi-dimensional feature set (for example, temperature-related features receive higher weights for temperature cracks). Based on time-series data of ambient temperature and humidity, the weighted feature set is linearly corrected to account for the impact of the environment on crack behavior. The corrected feature set and the crack activity index are input into the risk calculation model to obtain the risk index. The risk calculation model is a gradient boosting tree model with 100 trees, a maximum depth of 5, and a learning rate of 0.1. The crack activity index is the mean rate of change of the power of the active heat source sequence obtained by decoupling the infrared time-series data. The risk index is mapped to a risk level using preset threshold ranges: a risk index below 0.3 is low risk, 0.3-0.6 is medium risk, 0.6-0.9 is high risk, and 0.9 or above is emergency risk.
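
The mapping from risk index to risk level can be written as a small lookup. The text leaves the boundary values 0.3 and 0.6 ambiguous; the sketch assumes half-open intervals (a value equal to a boundary falls into the higher level), which agrees with the stated rule that 0.9 or above is emergency risk. The function name is hypothetical.

```python
def risk_level(risk_index):
    """Map a risk index to a level, assuming intervals [0, 0.3), [0.3, 0.6), [0.6, 0.9), [0.9, ...)."""
    if risk_index < 0.3:
        return "low"
    if risk_index < 0.6:
        return "medium"
    if risk_index < 0.9:
        return "high"
    return "emergency"

levels = [risk_level(v) for v in (0.12, 0.3, 0.75, 0.9)]
```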

[0042] As can be seen from the above, this application overcomes the limitations of a single data source by integrating multi-source image data such as visible light, infrared thermal, and 3D point cloud, and performing precise spatiotemporal registration. By generating infrared anomaly confidence maps and performing multi-feature fusion, the representational ability of crack features is enhanced, improving the accuracy of crack identification. Furthermore, image segmentation technology is used to identify independent crack instance regions and extract multi-dimensional feature sets for more accurate crack analysis. A classification model is used to identify crack mechanism categories, and risk indicators are calculated based on environmental data, ultimately mapped to risk levels. This effectively solves the problems of low efficiency, strong subjectivity, high false negative rate, and inability to accurately identify mechanisms and assess risks in traditional floor crack detection methods.

[0043] In one embodiment of this application, generating an infrared anomaly confidence map based on the registered infrared thermal image includes: performing multi-scale spatial analysis on the registered infrared thermal image to extract temperature gradient maps at different scales, and identifying candidate anomalous thermal regions using a temperature difference threshold set according to the overall temperature statistical distribution of the infrared thermal image; establishing a background temperature distribution reference model based on the statistical characteristics of pixels outside the candidate anomalous thermal regions; calculating the statistical deviation between the temperature values of each candidate anomalous thermal region and the background temperature distribution reference model, and using it as the temperature anomaly confidence; and generating the infrared anomaly confidence map based on the temperature anomaly confidence and the temperature gradient maps.

[0044] In this embodiment, multi-scale spatial analysis is performed on the registered infrared thermal image to capture temperature anomaly features of different sizes. Cracks in the ground can appear as small linear anomalies, or as large-area anomalies due to internal voids or material defects. Single-scale analysis methods are insufficient to fully cover these situations. This embodiment uses a Gaussian pyramid combined with the gradient operator (Sobel) for multi-scale spatial analysis. First, a Gaussian pyramid is constructed from the infrared thermal image. Then, the gradient magnitude is calculated at each scale layer to generate a multi-scale temperature gradient map. An adaptive temperature difference threshold is set based on the temperature histogram of the entire infrared image to identify candidate anomalous thermal regions. By analyzing the temperature histogram of the image, the main peak representing the background temperature is identified, and an adaptive temperature difference threshold is set based on the statistical characteristics of this main peak (such as peak position and width).
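
A Gaussian pyramid combined with the Sobel operator can be sketched in plain NumPy as follows. The 5-tap binomial blur kernel, the edge-replicated borders, the number of levels, and the function names are illustrative assumptions; a production system would more likely call an image-processing library.

```python
import numpy as np

def filter_axis(img, kernel, axis):
    """Correlate a 2-D array with a 1-D kernel along one axis (edge-replicated borders)."""
    img = np.asarray(img, dtype=float)
    radius = len(kernel) // 2
    pad = [(0, 0), (0, 0)]
    pad[axis] = (radius, radius)
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for i, weight in enumerate(kernel):
        window = [slice(None), slice(None)]
        window[axis] = slice(i, i + img.shape[axis])
        out += weight * padded[tuple(window)]
    return out

def sobel_magnitude(img):
    """Gradient magnitude from the separable 3x3 Sobel operator."""
    gx = filter_axis(filter_axis(img, [1, 2, 1], axis=0), [-1, 0, 1], axis=1)
    gy = filter_axis(filter_axis(img, [1, 2, 1], axis=1), [-1, 0, 1], axis=0)
    return np.hypot(gx, gy)

def multiscale_gradients(thermal, levels=3):
    """Sobel gradient magnitude at each level of a Gaussian pyramid."""
    blur = np.array([1, 4, 6, 4, 1]) / 16.0
    current = np.asarray(thermal, dtype=float)
    maps = []
    for _ in range(levels):
        maps.append(sobel_magnitude(current))
        smoothed = filter_axis(filter_axis(current, blur, axis=0), blur, axis=1)
        current = smoothed[::2, ::2]   # downsample by 2 for the next level
    return maps

# Hypothetical thermal image: temperature rising by one unit per pixel from left to right.
ramp = np.tile(np.arange(32.0), (32, 1))
gradient_maps = multiscale_gradients(ramp, levels=3)
```

Coarser levels respond to wide thermal anomalies such as voids, while the finest level keeps thin linear anomalies that blurring would wash out.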

[0045] Based on the statistical characteristics of pixels outside the candidate anomalous thermal regions in the infrared thermal image, a background temperature distribution reference model is established to provide an accurate benchmark for normal floor temperature. In this embodiment, a robust Gaussian mixture model is fitted to the temperature values of those pixels, yielding multiple Gaussian components that describe the background temperature. The negative log-likelihood (or Mahalanobis distance) of the average temperature of each candidate anomalous thermal region under the background Gaussian mixture model is then calculated and used as the temperature anomaly confidence for that region.
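
The negative log-likelihood score can be sketched for a one-dimensional mixture as follows. The mixture parameters below are made-up values for illustration; in the described method they would come from fitting the Gaussian mixture model to background pixels, a step this sketch does not perform.

```python
import math

def gmm_negative_log_likelihood(x, weights, means, stds):
    """Negative log-likelihood of a scalar temperature under a 1-D Gaussian mixture."""
    density = 0.0
    for w, mu, sigma in zip(weights, means, stds):
        z = (x - mu) / sigma
        density += w * math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    # Guard against log(0) for temperatures far outside every component.
    return -math.log(max(density, 1e-300))

# Hypothetical background model: two components around 20 and 22 degrees Celsius.
background = dict(weights=(0.7, 0.3), means=(20.0, 22.0), stds=(0.5, 0.8))
typical_score = gmm_negative_log_likelihood(20.5, **background)
anomalous_score = gmm_negative_log_likelihood(25.0, **background)
```

A larger score means the region's mean temperature is less plausible as background. The raw score is unbounded, so it would need normalization before being treated as a confidence in [0, 1].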

[0046] Based on temperature anomaly confidence scores and temperature gradient maps, an infrared anomaly confidence map is generated. The aim is to comprehensively utilize the intensity and spatial variation information of temperature anomalies to create a more comprehensive anomaly region indication map. Temperature anomaly confidence scores provide the probability of an anomaly, while the temperature gradient map helps to accurately locate the anomaly boundaries. One implementation method is weighted fusion. The normalized temperature anomaly confidence map and the temperature gradient map are weighted and summed, where the weights can be determined empirically or through training.
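
The weighted fusion described above can be sketched as a normalized weighted sum. The min-max normalization, the weight of 0.6, and the assumption that both maps already share one resolution are illustrative choices; the text only states that weights are set empirically or learned.

```python
import numpy as np

def minmax(a):
    """Scale an array to [0, 1]; a constant array maps to all zeros."""
    a = np.asarray(a, dtype=float)
    span = a.max() - a.min()
    return np.zeros_like(a) if span == 0 else (a - a.min()) / span

def fuse_maps(confidence, gradient, w_conf=0.6):
    """Weighted sum of the normalized confidence map and the normalized gradient map."""
    return w_conf * minmax(confidence) + (1.0 - w_conf) * minmax(gradient)

# Hypothetical 2x2 maps at the same resolution.
confidence = np.array([[0.0, 2.0], [4.0, 4.0]])
gradient = np.array([[0.0, 0.0], [1.0, 1.0]])
fused = fuse_maps(confidence, gradient)
```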

[0047] Through the above technical solution, this application effectively solves the problem of inaccurate identification during the generation of infrared anomaly confidence maps. Multi-scale spatial analysis of the registered infrared thermal image can cover anomaly regions of different sizes, avoiding the omission of small or large cracks at a single scale. Temperature gradient maps at different scales are extracted, and candidate anomaly thermal regions are identified based on a temperature difference threshold set according to the overall temperature statistical distribution, ensuring the objectivity and global adaptability of the threshold setting and reducing subjective errors. A background temperature distribution reference model is established based on pixels from non-candidate anomaly thermal regions, providing a stable normal temperature benchmark and distinguishing environmental noise. The statistical deviation between the temperature value of the candidate region and the background temperature distribution reference model is calculated as the temperature anomaly confidence level, quantifying the degree of anomaly and enhancing the accuracy of the judgment. Finally, an infrared anomaly confidence map is generated based on the temperature anomaly confidence level and the temperature gradient map, integrating confidence and gradient information to generate a more reliable output, thereby improving the accuracy of crack identification.

[0048] In one embodiment of this application, the machine vision-based method for assessing the risk of floor cracks further includes: using the infrared anomaly confidence map as a spatial guide, mapping the candidate anomalous thermal regions in the map onto the 3D point cloud data to locate the corresponding 3D candidate point cloud clusters; performing local surface fitting on each 3D candidate point cloud cluster to calculate the surface normal vector of each point in the cluster, and calculating the normal vector divergence from the directional distribution of the surface normal vectors; calculating the distribution of the local absolute Gaussian curvature of the 3D candidate point cloud cluster and obtaining its average absolute Gaussian curvature; setting a curvature anomaly threshold based on the overall curvature distribution of the registered 3D point cloud data; determining that geometric consistency verification is passed if the temperature anomaly confidence of the candidate anomalous thermal region is greater than a preset temperature anomaly confidence threshold, the normal vector divergence is greater than a preset normal vector divergence threshold, and the average absolute Gaussian curvature is greater than the curvature anomaly threshold; and, based on the result of the geometric consistency verification, re-weighting the temperature anomaly confidence of the corresponding regions in the infrared anomaly confidence map to generate the final infrared anomaly confidence map.

[0049] In this embodiment, an infrared anomaly confidence map is used as a spatial guide to map candidate anomalous thermal regions in the infrared anomaly confidence map onto 3D point cloud data, thereby locating the corresponding 3D candidate point cloud clusters. The aim is to establish a spatial correspondence between infrared thermal anomalies and 3D geometric information. The corresponding 3D candidate point cloud clusters are located through a projection transformation from 2D pixel coordinates to the 3D point cloud (based on registration parameters).

[0050] For each 3D candidate point cloud cluster, principal component analysis is used to fit a local surface, and the surface normal vector of each point within the cluster is calculated based on the fitted plane or quadratic surface. By calculating the variance or mean angle of the normal vector in spherical coordinates, the divergence of the normal vector is obtained, which can effectively characterize the degree of surface irregularity.
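As a minimal sketch of the normal estimation and divergence calculation described above, the following NumPy code fits a local plane by principal component analysis and measures the mean angle between the normals and their mean direction. The function names, the neighbourhood size `k`, and the choice of mean angle (rather than variance) are assumptions made for illustration only.

```python
import numpy as np

def estimate_normals(points, k=8):
    """Estimate one unit normal per point by PCA on its k nearest neighbours."""
    points = np.asarray(points, dtype=float)
    # Pairwise squared distances (acceptable for a small candidate cluster).
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    normals = np.zeros_like(points)
    for i in range(len(points)):
        neighbours = points[np.argsort(d2[i])[:k]]
        centred = neighbours - neighbours.mean(axis=0)
        # eigh returns eigenvalues in ascending order, so the first eigenvector
        # is the direction of least variance, i.e. the local plane normal.
        _, eigvecs = np.linalg.eigh(centred.T @ centred)
        normal = eigvecs[:, 0]
        # Assumption: orient all normals towards +Z so they are comparable.
        normals[i] = normal if normal[2] >= 0 else -normal
    return normals

def normal_divergence(normals):
    """Mean angle (radians) between each normal and the mean normal direction."""
    mean_dir = normals.mean(axis=0)
    mean_dir = mean_dir / np.linalg.norm(mean_dir)
    cos_angle = np.clip(normals @ mean_dir, -1.0, 1.0)
    return float(np.mean(np.arccos(cos_angle)))

# Illustrative data: a flat patch and a V-shaped groove on a 6 x 6 grid.
xs, ys = np.meshgrid(np.arange(6.0), np.arange(6.0))
flat = np.column_stack([xs.ravel(), ys.ravel(), np.zeros(xs.size)])
groove = np.column_stack([xs.ravel(), ys.ravel(), np.abs(xs.ravel() - 2.5)])
flat_div = normal_divergence(estimate_normals(flat))
groove_div = normal_divergence(estimate_normals(groove))
```

In this toy data the groove yields a clearly larger divergence than the flat patch, which is the behaviour the divergence feature relies on.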

[0051] Furthermore, the Gaussian curvature of each point is calculated based on the local surface fitting results, the distribution of the absolute value of the Gaussian curvature within the cluster is statistically analyzed, and its average value is calculated. A curvature anomaly threshold is then set by analyzing the curvature distribution of the overall point cloud data, for example by taking its 95th percentile.
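The curvature step can be sketched as follows, assuming each local patch is already expressed in a frame centred on the query point with the Z axis roughly along the surface normal. A quadric z = a·x² + b·x·y + c·y² + d·x + e·y + f is fitted by least squares, and the Gaussian curvature at the origin follows from the standard formula for a graph surface. The function names and the percentile parameter are illustrative assumptions.

```python
import numpy as np

def gaussian_curvature_at_origin(patch):
    """Gaussian curvature at (0, 0) of a least-squares quadric fitted to patch."""
    x, y, z = np.asarray(patch, dtype=float).T
    design = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
    a, b, c, d, e, _ = np.linalg.lstsq(design, z, rcond=None)[0]
    # K = (f_xx * f_yy - f_xy^2) / (1 + f_x^2 + f_y^2)^2 evaluated at the origin,
    # with f_xx = 2a, f_yy = 2c, f_xy = b, f_x = d, f_y = e.
    return (4.0 * a * c - b * b) / (1.0 + d * d + e * e) ** 2

def curvature_anomaly_threshold(abs_curvatures, percentile=95.0):
    """Threshold taken from the overall |K| distribution (percentile assumed)."""
    return float(np.percentile(abs_curvatures, percentile))

# Illustrative patch: the paraboloid z = (x^2 + y^2) / 4, whose Gaussian
# curvature at the origin is 1/4.
gx, gy = np.meshgrid(np.linspace(-1, 1, 5), np.linspace(-1, 1, 5))
bowl = np.column_stack([gx.ravel(), gy.ravel(),
                        (gx.ravel() ** 2 + gy.ravel() ** 2) / 4.0])
k_bowl = gaussian_curvature_at_origin(bowl)
```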

[0052] Geometric consistency verification is a multi-condition logical judgment process. Only when a candidate region simultaneously meets these geometric and thermal conditions is it considered a real crack region, thus effectively eliminating false positives caused by environmental interference or measurement errors.

[0053] Based on the results of geometric consistency verification, the confidence levels of temperature anomalies in corresponding regions of the infrared anomaly confidence map are weighted and fused to generate the final infrared anomaly confidence map. The purpose of weighted fusion is to adjust the infrared anomaly confidence levels according to the results of geometric verification. For example, regions that pass geometric consistency verification can be assigned higher weights to enhance their confidence levels, while regions that fail verification can be assigned lower weights to suppress their confidence levels, or even reduce them to zero. This fusion method ensures that the final infrared anomaly confidence map not only represents thermal anomaly information but also incorporates reliable geometric structure information, improving the accuracy of crack identification.
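The multi-condition check and the weighted fusion can be sketched in a few lines. The `CandidateRegion` container, the threshold values, and the weights `w_pass` and `w_fail` are hypothetical; the source only states that verified regions receive higher weights and unverified regions lower weights (possibly zero).

```python
from dataclasses import dataclass

@dataclass
class CandidateRegion:
    confidence: float          # temperature anomaly confidence level
    normal_divergence: float   # divergence of surface normals in the cluster
    mean_abs_curvature: float  # average absolute Gaussian curvature

def passes_geometric_check(region, conf_thr, div_thr, curv_thr):
    """All three conditions must hold for the verification to pass."""
    return (region.confidence > conf_thr
            and region.normal_divergence > div_thr
            and region.mean_abs_curvature > curv_thr)

def fuse_confidence(region, conf_thr, div_thr, curv_thr,
                    w_pass=1.2, w_fail=0.3):
    """Raise or suppress the confidence according to the verification result."""
    passed = passes_geometric_check(region, conf_thr, div_thr, curv_thr)
    weight = w_pass if passed else w_fail  # illustrative weights
    return min(1.0, region.confidence * weight)

verified = CandidateRegion(0.8, 0.5, 0.4)
rejected = CandidateRegion(0.8, 0.1, 0.4)  # normals too uniform
fused_verified = fuse_confidence(verified, 0.6, 0.3, 0.2)
fused_rejected = fuse_confidence(rejected, 0.6, 0.3, 0.2)
```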

[0054] Through the above technical solution, this application addresses the false positives that may exist in infrared anomaly confidence maps by using a geometric verification mechanism based on 3D point cloud data, thereby improving the accuracy of crack identification. The infrared anomaly confidence map is used as a spatial guide to map candidate anomaly thermal regions onto the 3D point cloud data and locate 3D candidate point cloud clusters, which ensures spatial alignment between each thermal anomaly region and the 3D geometric data. Next, local surface fitting is performed on the 3D candidate point cloud clusters, surface normal vectors are calculated, and the normal vector divergence is calculated from their directional distribution. This feature identifies geometrically irregular regions by evaluating the dispersion of the surface normal vectors, since cracks cause a high degree of divergence in the normal direction. Then, the distribution of the absolute value of the local Gaussian curvature is calculated and its average is obtained. This feature uses curvature to quantify the degree of surface bending; crack regions often exhibit high curvature, which further helps distinguish true anomalies from false ones. A curvature anomaly threshold is set based on the overall curvature distribution of the registered 3D point cloud data, and each candidate region is verified against multiple threshold conditions covering the temperature anomaly confidence level, the normal vector divergence, and the average absolute Gaussian curvature. Through this consistency check of thermal and geometric features, only candidate regions that are anomalous both thermally and geometrically are confirmed as cracks, which effectively filters out false positives caused by environmental factors. Finally, the final infrared anomaly confidence map is generated by weighted fusion based on the geometric consistency verification results. The weighting mechanism enhances the confidence of verified regions and suppresses that of unverified regions, thereby improving the reliability of the overall confidence map.

[0055] In one embodiment of this application, before generating the infrared anomaly confidence map, the method further includes: obtaining registered infrared thermal image sequences of the target floor area at multiple different time points to form a time-series dataset; for each candidate anomalous thermal region, extracting the curve of its temperature change over time from the time-series dataset as a temperature evolution curve; performing time-series characteristic analysis on the temperature evolution curve to distinguish between the passive thermal response caused by ambient temperature differences and the active heat source characteristics caused by crack activity, and obtaining the results of the time-series characteristic analysis; and correcting the temperature anomaly confidence level corresponding to the candidate anomalous thermal region based on the results of the time-series characteristic analysis.

[0056] In this embodiment, a time-series dataset is constructed by acquiring a sequence of registered infrared thermal images of the target floor area at multiple different time points. The aim is to build an image set that includes temporal information by acquiring infrared thermal images of the target floor area at different time points and performing precise spatial registration on these images. For example, a fixed-installation infrared thermal imager system can be used to automatically capture infrared thermal images of the floor area at preset time intervals, and the images can be registered using a pre-set image feature matching algorithm, ensuring precise spatial alignment of images at different time points. Alternatively, a mobile inspection robot equipped with an infrared thermal imager can be used to periodically scan the floor along a planned path, and automatic image registration can be achieved through the robot's positioning system.

[0057] Based on candidate anomalous thermal regions, the temperature evolution curve is extracted from the time-series dataset, representing the temperature change over time. This involves tracing the temperature change trajectory of each potential temperature anomaly region identified in the initial analysis throughout the entire time-series dataset. By extracting the temperature values of these regions at different time points, a curve characterizing their dynamic temperature changes can be formed, providing a basis for distinguishing the nature of anomalous heat sources. For example, for each candidate anomalous thermal region, the average temperature value in the infrared thermal image at each time point can be calculated, and these values can be arranged in chronological order to form the temperature evolution curve for that region.
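The per-region averaging described above might look like the following sketch. It assumes the registered thermal frames are stacked into an array of shape (T, H, W) and that candidate regions are given as an integer label map in which 0 denotes background; both conventions and the function name are assumptions.

```python
import numpy as np

def temperature_evolution_curves(frames, label_map):
    """Return {region_id: mean temperature per frame} for each labelled region."""
    frames = np.asarray(frames, dtype=float)
    curves = {}
    for region_id in np.unique(label_map):
        if region_id == 0:          # 0 is assumed to mean "not a candidate"
            continue
        mask = label_map == region_id
        # frames[:, mask] has shape (T, n_pixels); average over the pixels.
        curves[int(region_id)] = frames[:, mask].mean(axis=1)
    return curves

# Illustrative 3-frame sequence in which every pixel warms by 1 degree per frame.
base = np.array([[0.0, 2.0], [5.0, 10.0]])
frames = np.stack([base + t for t in range(3)])
labels = np.array([[1, 1], [0, 2]])
curves = temperature_evolution_curves(frames, labels)
```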

[0058] A time-series characteristic analysis is performed on the temperature evolution curve to distinguish between passive thermal responses caused by environmental temperature differences and active heat source characteristics caused by crack activity, yielding the results of the time-series characteristic analysis. This step aims to determine, by analyzing the patterns and characteristics of the temperature evolution curve, whether a temperature anomaly is a passive response to external environmental temperature fluctuations or the result of an active heat source generated by internal crack activity. For example, correlation analysis can be used to compare the temperature evolution curve of a candidate anomalous thermal region with the concurrent environmental temperature curve. If the two show a strong positive correlation, the anomaly tends to be judged as a passive thermal response; conversely, if the correlation is low or there is a significant hysteresis effect, an active heat source is indicated. Furthermore, Kalman filtering can be used for component decoupling to distinguish between passive thermal response and active heat source characteristics.
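A minimal sketch of the correlation analysis follows. It computes the Pearson correlation between the region curve and the ambient curve at zero lag, searches for the lag with the highest correlation as a simple proxy for the hysteresis effect, and applies the decision rule described above. The thresholds `corr_threshold` and `lag_threshold`, the search range `max_lag`, and the label strings are assumptions.

```python
import numpy as np

def lagged_correlation(region_curve, ambient_curve, max_lag=12):
    """Return (zero-lag correlation, best lag, correlation at best lag)."""
    region = np.asarray(region_curve, dtype=float)
    ambient = np.asarray(ambient_curve, dtype=float)
    n = len(region)
    # The region is allowed to lag behind the ambient temperature by `lag` samples.
    corrs = np.array([np.corrcoef(region[lag:], ambient[:n - lag])[0, 1]
                      for lag in range(max_lag + 1)])
    best_lag = int(np.argmax(corrs))
    return float(corrs[0]), best_lag, float(corrs[best_lag])

def classify_thermal_response(region_curve, ambient_curve,
                              corr_threshold=0.7, lag_threshold=2, max_lag=12):
    """'passive' if strongly and promptly correlated, else 'active_candidate'."""
    r0, best_lag, _ = lagged_correlation(region_curve, ambient_curve, max_lag)
    if r0 >= corr_threshold and best_lag <= lag_threshold:
        return "passive"
    return "active_candidate"

t = np.arange(96)
ambient = 20.0 + 3.0 * np.sin(2.0 * np.pi * t / 24.0)
follows_ambient = 0.8 * ambient + 3.0   # tracks the environment directly
steady_ramp = 0.05 * t                  # rises independently of the environment
label_follow = classify_thermal_response(follows_ambient, ambient)
label_ramp = classify_thermal_response(steady_ramp, ambient)
```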

[0059] Based on the results of time-series feature analysis, the confidence level of temperature anomalies corresponding to candidate anomalous thermal regions is corrected. This involves adjusting the initially calculated confidence level of temperature anomalies according to the analysis results of the previous step. If the analysis results indicate that the temperature anomaly in the region is due to an active heat source, its confidence level can be increased; if it is a passive response caused by environmental temperature differences, its confidence level can be decreased or suppressed. For example, for regions identified as active heat sources, their original temperature anomaly confidence level can be multiplied by a gain coefficient greater than 1; while for regions identified as passive thermal responses, their confidence level can be multiplied by an attenuation coefficient less than 1, or even set to zero. This correction mechanism ensures that the final infrared anomaly confidence map can more accurately characterize the true activity of ground cracks and reduce the interference of environmental factors.

[0060] Through the above technical solution, this application effectively solves the problem that temperature anomalies in infrared thermal images may be affected by environmental temperature differences, making it impossible to distinguish between active heat sources caused by crack activity and passive thermal responses caused by the environment, thus affecting the accuracy of confidence levels. By analyzing the time dimension, a sequence of registered infrared thermal images of the target ground area at multiple different time points is obtained, and temperature evolution curves of candidate anomalous thermal regions are extracted, capturing the dynamic characteristics of temperature changes. Furthermore, by performing temporal feature analysis on the temperature evolution curves, this application can accurately distinguish between passive thermal responses caused by environmental temperature differences and active heat source characteristics caused by crack activity, thereby avoiding misjudging temperature fluctuations caused by environmental factors as crack activity. Finally, based on the results of temporal feature analysis, the confidence level of temperature anomalies corresponding to candidate anomalous thermal regions is corrected, enabling the infrared anomaly confidence map to more accurately characterize the true activity of ground cracks, improving the accuracy and robustness of crack detection.

[0061] In one embodiment of this application, performing time-series characteristic analysis on the temperature evolution curve to distinguish between the passive thermal response caused by ambient temperature differences and the active heat source characteristics caused by crack activity includes: modeling the temperature evolution of the candidate anomalous thermal region as a state-space model, the states of which include a passive thermal component and an active heat source component; using the corresponding ambient temperature data sequence in the time-series dataset as the observation input to a Kalman filter, which recursively estimates the state-space model and thereby decouples the passive thermal component sequence from the active heat source component sequence; calculating the power of the active heat source component sequence, analyzing its time variation pattern, and calculating its rate of change; and, if the power of the active heat source component is greater than a preset power threshold and its rate of change is positive, determining that the candidate anomalous thermal region has the characteristics of an active heat source caused by crack activity.

[0062] In this embodiment, the state-space model aims to decompose the temperature evolution of candidate anomalous thermal regions into a passive thermal component caused by the external environment and an internal active heat source component caused by crack activity. The passive thermal component refers to the thermal response of the floor surface caused by changes in external environmental temperature (sunlight, air temperature fluctuations), affecting it through heat conduction, convection, and radiation, characterizing the heat exchange between the floor and the environment. The active heat source component refers to the heat generated by crack activity within the floor (crack propagation, internal friction, and heat release caused by material degradation), representing the characteristics of the cracks themselves as heat sources. In practical applications, a linear state-space model is used, where the state variables include the passive thermal component and the active heat source component, and it is assumed that they evolve linearly. The observed value (i.e., the temperature of the candidate anomalous thermal region) is a linear combination of these two components plus observation noise.

[0063] In this embodiment, a Kalman filter is used for recursive estimation of the state-space model. The ambient temperature data sequence is used as the observation input to achieve precise decoupling between the passive heat component sequence and the active heat source component sequence. Recursive estimation refers to an iterative estimation process where the current estimated value of the system state is calculated based on the previous estimate and the current observation. The purpose of decoupling is to separate the overall temperature evolution signal of the candidate anomalous thermal region into independent passive heat component sequences and active heat source component sequences, thereby accurately identifying the thermal characteristics generated by crack activity. For linear state-space models, a standard Kalman filter can be directly applied for state estimation and noise suppression.
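The decoupling can be illustrated with a small linear Kalman filter. This is only a sketch under stated assumptions: the state is [passive component, active component]; the passive component is assumed to relax towards the ambient temperature with a hypothetical coefficient `alpha`, with the ambient sequence entering as a known driving input; the active component is assumed to follow a random walk; and the observed region temperature is the sum of the two. The noise parameters `q_passive`, `q_active`, and `r_obs` are likewise hypothetical, and the source does not specify how the ambient sequence enters the model.

```python
import numpy as np

def decouple_thermal_components(region_temp, ambient_temp, alpha=0.9,
                                q_passive=1e-4, q_active=1e-2, r_obs=0.05):
    """Return (passive_estimate, active_estimate) sequences."""
    F = np.array([[alpha, 0.0], [0.0, 1.0]])   # state transition
    B = np.array([1.0 - alpha, 0.0])           # ambient temperature drives passive part
    H = np.array([1.0, 1.0])                   # observation = passive + active
    Q = np.diag([q_passive, q_active])         # process noise covariance
    x = np.array([ambient_temp[0], 0.0])       # initial state guess
    P = np.eye(2)                              # initial state covariance
    passive, active = [], []
    for z, u in zip(region_temp, ambient_temp):
        # Measurement update with the observed region temperature z.
        innovation = z - H @ x
        S = H @ P @ H + r_obs
        K = (P @ H) / S
        x = x + K * innovation
        P = (np.eye(2) - np.outer(K, H)) @ P
        passive.append(x[0])
        active.append(x[1])
        # Time update: predict the next state using the ambient temperature u.
        x = F @ x + B * u
        P = F @ P @ F.T + Q
    return np.array(passive), np.array(active)
```

On synthetic data generated from the same model, the estimated active component should settle near the true offset; on real data the model structure and noise levels would need to be identified from measurements.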

[0064] The power of the active heat source component sequence refers to the rate of energy release or transfer per unit time represented by the active heat source component, quantifying the heat intensity generated by crack activity. By calculating the power, the thermal intensity of crack activity can be objectively measured, serving as a key indicator for judging the degree of crack activity. For example, the instantaneous value of the decoupled active heat source component sequence, or its square, can be used directly as a measure of power. Analyzing its time-varying pattern involves observing the trend of the active heat source component power at different time points, such as continuous growth, stabilization, decline, or fluctuation. The rate of change is a quantitative indicator of this pattern, expressed as the speed at which power changes over time. By analyzing the power change pattern and calculating the rate of change, it is possible to identify whether the crack is in an active development state. The rate of change can be calculated by taking the first-order difference of the active heat source component power sequence.

[0065] Finally, if the power of the active heat source component is greater than a preset power threshold and its rate of change is positive, the candidate anomalous thermal region is determined to have active heat source characteristics caused by crack activity. The power threshold is a preset value used to distinguish between crack activity with heat source characteristics and background noise or weak, insignificant thermal effects. By setting the power threshold, non-heat source signals can be filtered out, avoiding misclassification of weak heat caused by non-crack activity as active heat source characteristics, thus improving the accuracy of the determination. For example, the power threshold can be set through statistical analysis of a large amount of historical data. A positive rate of change indicates that the power of the active heat source component is increasing over time. Based on the power threshold, the positive rate of change further confirms that the crack activity is active, expanding or intensifying, rather than simply indicating the existence of a stable heat source.
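The power calculation and the two-condition decision rule can be sketched as follows. Power is taken as the squared value of the active component and the rate of change as its first-order difference, as described above; averaging both over the most recent `window` samples is an added assumption intended to make the decision less sensitive to a single noisy sample.

```python
import numpy as np

def has_active_heat_source(active_component, power_threshold, window=5):
    """Return (decision, recent mean power, recent mean rate of change)."""
    power = np.asarray(active_component, dtype=float) ** 2  # squared value as power
    rate = np.diff(power)                                   # first-order difference
    recent_power = float(power[-window:].mean())
    recent_rate = float(rate[-window:].mean())
    decision = bool(recent_power > power_threshold and recent_rate > 0.0)
    return decision, recent_power, recent_rate

growing = np.linspace(0.0, 2.0, 20)   # strong and still increasing
stable = np.full(20, 2.0)             # strong but not increasing
weak = np.linspace(0.0, 0.1, 20)      # increasing but below the threshold
growing_decision = has_active_heat_source(growing, power_threshold=1.0)[0]
stable_decision = has_active_heat_source(stable, power_threshold=1.0)[0]
weak_decision = has_active_heat_source(weak, power_threshold=1.0)[0]
```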

[0066] Through the aforementioned technical solutions, the state-space modeling and Kalman filtering techniques of this application can accurately decouple the temperature evolution of candidate anomalous thermal regions into passive heat component sequences and active heat source component sequences. The state-space model provides a structured framework for separating environmental noise and crack activity signals, avoiding the ambiguity caused by signal confusion in traditional methods. The Kalman filter uses the environmental temperature data sequence as the observation input to recursively estimate the state-space model, enabling dynamic processing of temporal noise and uncertainties, ensuring efficient and robust extraction of active heat source signals under complex environmental interference. Based on this, by calculating the power of the active heat source component sequence and analyzing its time-varying pattern, the rate of change is calculated, which can quantify the heat source intensity and capture the dynamic characteristics of crack activity. If the power of the active heat source component is greater than a preset power threshold and its rate of change is positive, the candidate anomalous thermal region is determined to have active heat source characteristics caused by crack activity. This dual verification mechanism, which screens heat sources based on power thresholds while ensuring that the rate of change is positive to guarantee that the heat source is in a growth state, enhances the reliability of crack activity identification and effectively avoids misjudging passive thermal responses caused by environmental temperature differences as crack activity, thereby improving the accuracy of temperature anomaly confidence correction.

[0067] In one embodiment of this application, the confidence level of temperature anomalies corresponding to candidate anomalous thermal regions is corrected based on the results of time-series feature analysis, including: If a candidate anomalous thermal region is determined to have the characteristics of an active heat source, then the confidence level of the temperature anomaly is positively increased based on the power of the active heat source component. If a candidate anomalous thermal region is determined to be a passive thermal response, the confidence level of the temperature anomaly is suppressed based on the correlation between its temperature change and the ambient temperature.

[0068] In this embodiment, the confidence level of temperature anomalies corresponding to candidate abnormal thermal regions is corrected to improve the ability of temperature anomaly confidence levels to characterize the authenticity of ground crack activity, avoid misjudgments caused by environmental factors or non-crack activity, and thus improve the accuracy of subsequent crack identification and risk assessment. This correction can be implemented through a correction function or lookup table, which adjusts the original temperature anomaly confidence level based on the time series analysis results. For example, a gain factor or suppression factor can be set and multiplied by the original confidence level or a weighted average can be applied.

[0069] If a candidate anomalous thermal region is determined to have active heat source characteristics, the temperature anomaly confidence score is positively increased based on the power magnitude of the active heat source component. When an active heat source generated by crack activity is detected, the anomaly truly caused by the crack is highlighted by increasing its temperature anomaly confidence score. This is achieved by normalizing the power magnitude of the active heat source component and then using it as a gain coefficient to directly multiply the original temperature anomaly confidence score.

[0070] If a candidate anomalous thermal region is determined to be a passive thermal response, the confidence level of the temperature anomaly is suppressed based on the correlation between its temperature changes and the ambient temperature. This step aims to reduce the confidence level of the temperature anomaly when the temperature changes in the anomalous thermal region are mainly influenced by the ambient temperature, thereby reducing the interference of environmental factors on crack identification and avoiding misclassification of non-crack-related temperature fluctuations as crack activity. This is achieved by calculating the Pearson correlation coefficient between the temperature sequence of the candidate anomalous thermal region and the ambient temperature sequence. A higher Pearson correlation coefficient indicates a greater influence from the environment and therefore calls for stronger suppression, that is, a smaller multiplicative factor. The suppression factor can be designed as 1 - k*(Pearson correlation coefficient), where k is an adjustable parameter, and is multiplied by the original confidence level.
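The two correction branches can be sketched together. The suppression factor follows the form 1 - k*r given above; clamping negative correlations to zero and clipping the result to [0, 1] are assumptions. For the gain branch, mapping the normalized power to a gain in [1, `max_gain`] is one hypothetical way to obtain a coefficient greater than 1; `power_ref` and `max_gain` are illustrative parameters.

```python
import numpy as np

def suppression_factor(region_temp, ambient_temp, k=0.8):
    """Multiplicative factor 1 - k*r for passive thermal responses."""
    r = np.corrcoef(region_temp, ambient_temp)[0, 1]
    r = max(float(r), 0.0)  # assumption: negative correlation gives no suppression
    return float(np.clip(1.0 - k * r, 0.0, 1.0))

def gain_factor(power, power_ref, max_gain=1.5):
    """Gain in [1, max_gain] from the power normalized by power_ref."""
    normalized = min(max(power / power_ref, 0.0), 1.0)
    return 1.0 + (max_gain - 1.0) * normalized

def correct_confidence(confidence, is_active, *, power=0.0, power_ref=1.0,
                       region_temp=None, ambient_temp=None, k=0.8):
    """Raise the confidence for active sources, suppress it for passive ones."""
    if is_active:
        factor = gain_factor(power, power_ref)
    else:
        factor = suppression_factor(region_temp, ambient_temp, k)
    return min(1.0, confidence * factor)

ambient = np.array([10.0, 12.0, 15.0, 13.0, 11.0])
passive_region = 2.0 * ambient + 1.0   # perfectly correlated with the environment
passive_conf = correct_confidence(0.5, False, region_temp=passive_region,
                                  ambient_temp=ambient)
active_conf = correct_confidence(0.5, True, power=1.0, power_ref=2.0)
```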

[0071] Through the above technical solution, this application solves the problem of inaccurate confidence levels by dynamically correcting the confidence level of temperature anomalies based on the results of time-series feature analysis, thereby improving the accuracy of crack activity judgment. The results of time-series feature analysis ensure that the correction operation directly stems from an in-depth analysis of the temperature evolution curve, avoiding subjective bias and making confidence level adjustments more targeted. If a candidate anomalous thermal region is determined to have active heat source characteristics, the power magnitude of the active heat source component is used as the gain basis. The power magnitude quantifies the heat source intensity, enabling the positive gain to more accurately amplify anomalous signals related to crack activity and highlight high-risk areas. If a candidate anomalous thermal region is determined to be a passive thermal response, suppression is performed based on the correlation between temperature changes and ambient temperature. High correlation indicates that environmental factors dominate, thereby effectively reducing the false judgment confidence level caused by non-crack factors and reducing noise interference. Overall, this conditional correction mechanism optimizes the reliability of the confidence level.

[0072] In one embodiment of this application, feature fusion is performed on infrared anomaly confidence maps, visible light images, and 3D point cloud data to generate a fused feature map, including: Edge and texture feature maps are extracted from the registered visible light image, gradient feature maps are extracted from the infrared anomaly confidence map, and height and surface normal vector feature maps are extracted from the registered 3D point cloud data. Edge and texture feature maps, gradient feature maps, and height and surface normal vector feature maps are concatenated along the channel dimension and then input into a shallow convolutional network for initial fusion to obtain the first fused feature map. Visible light images, infrared anomaly confidence maps, and two-dimensional depth maps generated by projecting three-dimensional point cloud data are respectively input into an encoder network with shared parameters to extract three sets of intermediate semantic features. The three sets of intermediate semantic features are interacted and fused through the cross-attention module to generate an intermediate fused feature map; The first fused feature map is subjected to preliminary binary segmentation of the crack region to obtain the first initial segmentation mask; Preliminary binary segmentation of the crack region is performed on the intermediate fusion feature map to obtain the second initial segmentation mask; Perform a logical AND operation between the first initial segmentation mask and the second initial segmentation mask to obtain the core region mask of the crack. Using the core region mask of the crack as the spatial attention weight, feature enhancement and background suppression are performed on the intermediate fusion feature map to generate the final fusion feature map.

[0073] In this embodiment, edge and texture feature maps are extracted from the registered visible light image to capture the morphological details of the cracks, leveraging the high resolution of the visible light image. Gradient feature maps are extracted from the infrared anomaly confidence map to highlight temperature gradient changes in potential crack regions based on temperature anomaly information from the infrared data, thereby enhancing sensitivity to crack activity. This can be achieved by directly applying gradient operators to the infrared anomaly confidence map. Simultaneously, height and surface normal vector feature maps are extracted from the registered 3D point cloud data to provide information on crack depth, surface orientation, and structural context based on 3D geometric properties, helping to distinguish cracks from a smooth background. This can be achieved by projecting the Z-coordinate of the 3D point cloud data onto a 2D mesh to generate a height map; while the surface normal vectors can be estimated by performing planar fitting on the neighborhood of local points in the point cloud data, and then projecting the components or angles of these normal vectors onto the 2D image plane.
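Two of the feature maps described above can be sketched with NumPy: a height map obtained by binning the Z coordinates of the point cloud onto a 2D grid, and a gradient magnitude map computed from the infrared anomaly confidence map. The grid cell size, the use of the mean height per cell, and the function names are assumptions; empty cells are marked with NaN.

```python
import numpy as np

def height_map(points, cell=1.0):
    """Project Z onto a 2D grid; each cell holds the mean height of its points."""
    points = np.asarray(points, dtype=float)
    ij = np.floor(points[:, :2] / cell).astype(int)
    ij -= ij.min(axis=0)                       # shift indices to start at 0
    h, w = ij[:, 1].max() + 1, ij[:, 0].max() + 1
    total = np.zeros((h, w))
    count = np.zeros((h, w))
    np.add.at(total, (ij[:, 1], ij[:, 0]), points[:, 2])
    np.add.at(count, (ij[:, 1], ij[:, 0]), 1)
    out = np.full((h, w), np.nan)              # NaN marks cells with no points
    filled = count > 0
    out[filled] = total[filled] / count[filled]
    return out

def gradient_magnitude(confidence_map):
    """Gradient magnitude of a 2D map using central differences."""
    gy, gx = np.gradient(np.asarray(confidence_map, dtype=float))
    return np.hypot(gx, gy)

cloud = np.array([[0.0, 0.0, 1.0], [0.2, 0.3, 3.0], [1.5, 0.0, 5.0], [0.0, 1.2, 7.0]])
hm = height_map(cloud, cell=1.0)
ramp = np.tile(0.1 * np.arange(5.0), (4, 1))   # confidence rising along x
grad = gradient_magnitude(ramp)
```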

[0074] Edge and texture feature maps, gradient feature maps, and height and surface normal vector feature maps are concatenated along the channel dimension, stacking feature maps from different sources to form a multi-channel feature representation. For example, feature vectors can be extracted from different modalities first, then these vectors are concatenated and reconstructed into a multi-channel feature map. Subsequently, the concatenated feature map is input into a shallow convolutional network for preliminary fusion, obtaining a first fused feature map. This shallow convolutional network includes a small number of convolutional layers, whose function is to learn the preliminary interaction relationships between these concatenated features, reduce feature redundancy, and extract basic combination patterns. For example, this network can consist of one or two convolutional layers, supplemented by an activation function (ReLU) and batch normalization layers.
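The channel-wise concatenation and a single convolutional fusion layer can be illustrated in plain NumPy. This is a sketch only: the channel counts are arbitrary, the weights would normally be learned, batch normalization is omitted, and a practical implementation would likely use a deep learning framework.

```python
import numpy as np

def conv3x3_relu(x, weight, bias):
    """3x3 convolution with zero padding followed by ReLU.

    x: (C_in, H, W), weight: (C_out, C_in, 3, 3), bias: (C_out,)
    """
    _, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((weight.shape[0], h, w))
    for dy in range(3):
        for dx in range(3):
            patch = padded[:, dy:dy + h, dx:dx + w]
            # Mix input channels with the kernel tap at offset (dy, dx).
            out += np.einsum("oc,chw->ohw", weight[:, :, dy, dx], patch)
    return np.maximum(out + bias[:, None, None], 0.0)

rng = np.random.default_rng(0)
edge_texture = rng.normal(size=(2, 8, 8))   # from the visible light image
ir_gradient = rng.normal(size=(1, 8, 8))    # from the infrared confidence map
height_normal = rng.normal(size=(4, 8, 8))  # height plus normal components
# Concatenate along the channel dimension: 2 + 1 + 4 = 7 channels.
stacked = np.concatenate([edge_texture, ir_gradient, height_normal], axis=0)
weight = rng.normal(scale=0.1, size=(8, 7, 3, 3))  # untrained, illustrative
first_fused = conv3x3_relu(stacked, weight, np.zeros(8))
```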

[0075] Furthermore, visible light images, infrared anomaly confidence maps, and 2D depth maps generated from the projection of 3D point cloud data are input into a shared-parameter encoder network to extract three sets of intermediate semantic features. The shared-parameter encoder network ensures consistency in feature extraction from different modalities, resulting in higher-level, more abstract semantic features while maintaining their comparability. For example, a Transformer-based encoder can be used, where self-attention mechanisms independently learn features from each modality, but the initial embedding layers share parameters. The three sets of intermediate semantic features are interacted and fused through a cross-attention module to generate an intermediate fused feature map. The cross-attention module allows features from one modality to influence the processing of features from other modalities, dynamically weighting the importance of features from different sources, enhancing relevance, and suppressing noise. For example, a multi-head cross-attention mechanism can be employed, where the query, key, and value matrices come from different feature sets.
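A single-head version of the cross-attention interaction can be sketched as below, with queries taken from one modality and keys and values from the other two. The token layout (each modality flattened to N tokens of dimension d), the random projection matrices, and the residual addition are assumptions; the same pattern could be repeated with each modality acting as the query, and a multi-head variant would split the feature dimension across heads.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, context_feats, w_q, w_k, w_v):
    """Scaled dot-product attention from query tokens to context tokens."""
    q = query_feats @ w_q
    k = context_feats @ w_k
    v = context_feats @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)           # each query row sums to 1
    return attn @ v, attn

rng = np.random.default_rng(0)
n_tokens, dim = 6, 4
visible = rng.normal(size=(n_tokens, dim))
infrared = rng.normal(size=(n_tokens, dim))
depth = rng.normal(size=(n_tokens, dim))
w_q, w_k, w_v = (rng.normal(scale=0.5, size=(dim, dim)) for _ in range(3))

# Visible-light tokens attend to the infrared and depth tokens.
context = np.concatenate([infrared, depth], axis=0)
attended, attn = cross_attention(visible, context, w_q, w_k, w_v)
fused_tokens = visible + attended             # residual fusion (assumed)
```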

[0076] Next, preliminary binary segmentation of the crack regions is performed on the first fused feature map to obtain a first initial segmentation mask. This step aims to quickly locate potential crack regions based on the preliminary fusion result. For example, this can be achieved by applying a simple thresholding operation (for example, Otsu's method) to the output of a classification layer (for example, a Sigmoid activation layer) applied to the first fused feature map. Simultaneously, preliminary binary segmentation of the crack regions is performed on the intermediate fused feature map to obtain a second initial segmentation mask. For example, a fully connected layer or a 1x1 convolution can be used to predict the crack probability of each pixel on the feature map before thresholding.
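The thresholding step can be sketched with a histogram-based implementation of Otsu's method applied to a sigmoid probability map. The number of bins and the convention of returning the upper edge of the selected bin are implementation assumptions; the logits here stand in for the output of a per-pixel prediction layer.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def otsu_threshold(probabilities, bins=256):
    """Threshold in [0, 1] that maximizes the between-class variance."""
    hist, edges = np.histogram(probabilities, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(p)                 # weight of the class at or below each bin
    w1 = 1.0 - w0                     # weight of the class above each bin
    cum_mean = np.cumsum(p * centers)
    total_mean = cum_mean[-1]
    between = np.zeros(bins)
    valid = (w0 > 0) & (w1 > 0)
    between[valid] = ((total_mean * w0[valid] - cum_mean[valid]) ** 2
                      / (w0[valid] * w1[valid]))
    # Split just above the selected bin.
    return float(edges[np.argmax(between) + 1])

logits = np.full((4, 6), -2.0)        # background pixels
logits[1, 1:5] = 2.0                  # a short crack segment
prob = sigmoid(logits)
threshold = otsu_threshold(prob)
initial_mask = prob >= threshold
```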

[0077] A logical AND operation is performed between the first and second initial segmentation masks to obtain the crack core region mask. The purpose of this operation is to identify regions that are classified as cracks with high confidence by both initial segmentation results, thereby effectively reducing false positives and focusing on high-confidence crack regions. For example, a pixel-level logical AND operation can be performed on the two binary masks. Finally, using the crack core region mask as spatial attention weights, feature enhancement and background suppression are applied to the intermediate fusion feature map to generate the final fusion feature map. This step utilizes the high-confidence crack region mask to guide the refinement of the feature map, amplifying features within the identified crack core region and suppressing features in the background region, thus providing a more focused and robust feature representation for subsequent crack segmentation. For example, this can be achieved by performing element-wise multiplication between the intermediate fusion feature map and the crack core region mask, which effectively zeros or reduces features outside the core region.
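The mask intersection and the mask-guided weighting can be sketched as follows. The `background_weight` parameter is an assumption that covers both options mentioned above: a value of 0 zeroes features outside the core region, while a small positive value only attenuates them.

```python
import numpy as np

def core_region_mask(first_mask, second_mask):
    """Pixel-wise logical AND of two binary segmentation masks."""
    return np.logical_and(first_mask, second_mask)

def apply_spatial_attention(feature_map, core_mask, background_weight=0.1):
    """Keep features inside the core region and attenuate the background.

    feature_map: (C, H, W), core_mask: (H, W) boolean
    """
    weights = np.where(core_mask, 1.0, background_weight)
    return feature_map * weights[None, :, :]   # broadcast over channels

first_mask = np.array([[1, 1, 0], [0, 1, 0]], dtype=bool)
second_mask = np.array([[0, 1, 1], [0, 1, 0]], dtype=bool)
core = core_region_mask(first_mask, second_mask)
features = np.ones((2, 2, 3))
final_features = apply_spatial_attention(features, core)
```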

[0078] Through the above technical solution, this application effectively solves the problems of feature redundancy, background noise interference, and blurring of the crack core region that may result from directly fusing multi-source data. First, by extracting edge and texture feature maps, gradient feature maps, and height and surface normal vector feature maps from visible light images, infrared anomaly confidence maps, and 3D point cloud data respectively, and performing preliminary fusion, the complementary information of each modality can be fully utilized, providing multi-dimensional basic features for crack identification. Second, by extracting intermediate semantic features through a shared-parameter encoder network and using a cross-attention module for interaction and fusion, more abstract global semantic information can be captured, and feature importance can be dynamically weighted, thereby enhancing relevant features and suppressing irrelevant noise. More importantly, this application performs preliminary binary segmentation on the first fused feature map and the intermediate fused feature map respectively to obtain a first initial segmentation mask and a second initial segmentation mask, and performs a logical AND operation on the two to generate a crack core region mask. This mechanism can effectively filter out low-confidence false detection regions and accurately locate the core region of the crack. Finally, using the crack core region mask as the spatial attention weight, feature enhancement and background suppression are performed on the intermediate fusion feature map, so that the final fusion feature map is highly focused on the core region of the crack. This provides high-quality input with a high signal-to-noise ratio for subsequent crack instance segmentation, effectively reducing the risk of missed detection and improving the accuracy and reliability of the risk assessment of floor cracks.

[0079] In one embodiment of this application, a logical AND operation is performed between a first initial segmentation mask and a second initial segmentation mask to obtain a crack core region mask, including: Perform a logical AND operation between the first initial segmentation mask and the second initial segmentation mask to obtain the first core region mask; Using the first core region mask as the seed region, region growing is performed on the intermediate fusion feature map to obtain the grown region mask; Morphological optimization of the growth region mask is performed, including removing isolated regions with an area smaller than a preset area threshold and filling holes to obtain an optimized crack core region mask.

[0080] In this embodiment, the first initial segmentation mask and the second initial segmentation mask are preliminary crack region prediction results, each containing some false positives or incomplete regions. Performing a logical AND operation on the first and second initial segmentation masks aims to extract the common parts of the two preliminary segmentation results, thereby obtaining a more reliable and accurate crack core region. This effectively filters out false positives generated by a single segmentation method, generating the first core region mask. This logical AND operation compares the two masks pixel-by-pixel; if a pixel is marked as a crack in both masks (e.g., a pixel value of 1), it is also marked as a crack in the first core region mask.

[0081] Subsequently, using the first core region mask as the seed region, region growing is performed on the intermediate fusion feature map to obtain the grown region mask. Region growing is an image segmentation technique that starts with a set of predefined seed points and gradually expands the region by checking the feature similarity of neighboring pixels. Here, the first core region mask provides reliable crack seed points, while the intermediate fusion feature map provides rich feature information. The role of region growing is to utilize these seed points and feature information to aggregate pixels that are similar in features to the seed points and are spatially adjacent, thereby completely restoring the true shape of the crack and compensating for discontinuities or omissions in the crack region that may be caused by logical AND operations. For example, a feature similarity metric can be defined, and based on spatial proximity, starting from the boundary pixels of the seed region, its neighboring pixels are iteratively checked. If a neighboring pixel satisfies the similarity and proximity conditions, it is added to the current grown region, and the boundary is updated.

[0082] Finally, the growth region mask is morphologically optimized, including removing isolated regions with areas smaller than a preset area threshold and filling holes to obtain the optimized crack core region mask. Removing isolated regions eliminates noise points or small mis-segmented regions generated during region growth; these regions are small and do not belong to the actual crack. Filling holes compensates for any voids or discontinuities within the crack, making the crack region more complete and smooth, and more consistent with the actual crack morphology. Removing isolated regions can be achieved through connected component analysis, identifying all connected regions, calculating the area of each region, and removing regions with areas smaller than the preset area threshold from the mask. Filling holes can be achieved through morphological closing operations.
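
The morphological optimization step can be sketched as follows. This is an illustration under stated assumptions, not the implementation of this application: the mask is a binary grid stored as a list of rows, connectivity is 4-neighbourhood, and the closing uses a 3x3 square structuring element, so only small holes are filled. The function names are hypothetical.

```python
from collections import deque

NEIGHBORS_4 = ((1, 0), (-1, 0), (0, 1), (0, -1))


def remove_small_regions(mask, min_area):
    """Drop 4-connected foreground components smaller than min_area pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    cleaned = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first search collects one connected component.
            component, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy, dx in NEIGHBORS_4:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(component) >= min_area:
                for y, x in component:
                    cleaned[y][x] = 1
    return cleaned


def close_3x3(mask):
    """Morphological closing (dilation, then erosion) with a 3x3 square.

    The mask is zero-padded by one pixel so that image borders behave as
    they would on an unbounded image.
    """
    rows, cols = len(mask), len(mask[0])
    padded = ([[0] * (cols + 2)]
              + [[0] + list(row) + [0] for row in mask]
              + [[0] * (cols + 2)])

    def window(img, r, c):
        return [img[y][x]
                for y in range(max(0, r - 1), min(rows + 2, r + 2))
                for x in range(max(0, c - 1), min(cols + 2, c + 2))]

    dilated = [[int(any(window(padded, r, c))) for c in range(cols + 2)]
               for r in range(rows + 2)]
    # Only pixels of the original image are eroded, so each window is full.
    return [[int(all(window(dilated, r, c))) for c in range(1, cols + 1)]
            for r in range(1, rows + 1)]


grown_mask = [[0, 0, 0, 0, 0, 0, 0],
              [0, 1, 1, 1, 0, 0, 0],
              [0, 1, 0, 1, 0, 0, 1],
              [0, 1, 1, 1, 0, 0, 0],
              [0, 0, 0, 0, 0, 0, 0]]
optimized_mask = close_3x3(remove_small_regions(grown_mask, min_area=3))
```

In this example the isolated pixel on the right is removed by the area filter, and the one-pixel hole inside the ring is filled by the closing.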

[0083] Through the above technical solution, this application improves the accuracy and integrity of the crack core region mask by optimizing the generation process, and solves the problems of noise interference and region discontinuity that may be caused by direct logical AND operation. Specifically, the first initial segmentation mask and the second initial segmentation mask are logically ANDed to obtain the first core region mask, retaining only the commonly identified regions, thereby effectively filtering false alarms and background interference and ensuring the reliability of the initial core region. Next, using the first core region mask as the seed region, region growth is performed on the intermediate fusion feature map. This process dynamically expands the boundary using the rich semantic information of the feature map, avoiding the omission of small cracks and enhancing the integrity of the region. Finally, the grown region mask is morphologically optimized, including removing isolated regions with an area smaller than a preset area threshold to remove noise points, and filling holes to smooth the crack region, ultimately generating an optimized crack core region mask that is more accurate and robust.

[0084] In one embodiment of this application, based on the fused feature map, individual crack instance regions are identified and segmented, including: Based on the fused feature map, the crack confidence score of each pixel belonging to the crack category is predicted, and a crack confidence heatmap is generated. The confidence threshold is determined based on the statistical distribution of the crack confidence heatmap, and pixels with a crack confidence greater than the confidence threshold are identified as crack pixels. Spatial clustering is performed on the crack pixels, and the set of spatially connected pixels is used as the initial seed point set; For each initial seed point in the initial seed point set, a region growing operation is performed in the feature space corresponding to the calibrated fused feature map, based on the Euclidean distance of the feature vectors and the spatial proximity relationship between pixels; pixels with similar features and spatially adjacent pixels are aggregated into independent regions. During the region growing process, the boundary pixels of the current growing region are mapped onto the registered 3D point cloud data. If the absolute value of the Gaussian curvature of the point corresponding to the boundary pixel is greater than the preset curvature threshold, it is determined that there is a geometric boundary in that direction, and the growing process stops in that direction. Each independent region that eventually stops growing is the segmented crack instance region.

[0085] In this embodiment, predicting the crack confidence score for each pixel belonging to the crack category and generating a crack confidence heatmap involves using a deep learning model (a semantic segmentation network based on an attention mechanism) to perform pixel-level regression prediction on the fused feature map, thereby obtaining the probability or likelihood that each pixel belongs to the crack category. This model is trained on a large amount of labeled data and can learn the characteristic patterns of cracks. The heatmap visually displays the crack confidence score of each pixel; the brighter the color or the higher the value, the greater the probability that the pixel is a crack.

[0086] Determining the confidence threshold based on the statistical distribution of the crack confidence heatmap means that the threshold is not a fixed value, but rather adaptively adjusted according to the overall characteristics of the current heatmap. For example, a certain percentile (90% or 95%) can be selected as the threshold by analyzing the histogram distribution of the heatmap.

[0087] Identifying pixels with a crack confidence level greater than a confidence threshold as crack pixels refers to the binarization process of converting a continuous crack confidence heatmap into a discrete set of crack pixels. All pixels with a confidence level higher than the determined threshold are marked as crack pixels, while those below the threshold are treated as background. This operation effectively filters out low-confidence noise regions and initially outlines the contours of the cracks.
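
The adaptive thresholding and binarization described above can be sketched as follows, assuming the heatmap is a grid of confidence values stored as a list of rows. The percentile helper uses linear interpolation between order statistics, which is one common convention among several; the function names are hypothetical.

```python
def percentile(values, q):
    """q-th percentile (0 to 100), linearly interpolated between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("heatmap contains no pixels")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def binarize_by_percentile(heatmap, q=90.0):
    """Return (threshold, mask); pixels strictly above the threshold are cracks."""
    flat = [v for row in heatmap for v in row]
    threshold = percentile(flat, q)
    mask = [[1 if v > threshold else 0 for v in row] for row in heatmap]
    return threshold, mask


heatmap = [[0.1, 0.2, 0.3, 0.4, 0.5],
           [0.6, 0.7, 0.8, 0.9, 1.0]]
threshold, crack_mask = binarize_by_percentile(heatmap, q=80.0)
```

Because the threshold is derived from the heatmap itself, a heatmap with generally higher confidence values yields a higher threshold without any manual adjustment.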

[0088] Spatial clustering of crack pixels, using spatially connected pixel sets as initial seed sets, refers to combining physically connected or adjacent crack pixels into independent connected regions through connected component analysis (CCA). Since cracks are continuous geometric structures, spatial connectivity analysis can effectively group pixels belonging to the same crack into one category. Each such connected region is considered an initial seed set, representing the starting point of a potential crack instance.
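
A minimal sketch of this grouping step is shown below. It assumes a binary crack mask stored as a list of rows; the `connectivity` parameter (4 or 8) is an illustrative choice, since the text does not fix which neighbourhood is used at this stage.

```python
from collections import deque


def connected_seed_sets(mask, connectivity=4):
    """Group crack pixels (value 1) into connected components.

    Returns a list of components, each a sorted list of (row, col) tuples.
    """
    if connectivity == 4:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    elif connectivity == 8:
        offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                   if (dy, dx) != (0, 0)]
    else:
        raise ValueError("connectivity must be 4 or 8")

    rows, cols = len(mask), len(mask[0])
    visited = set()
    seed_sets = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or (r, c) in visited:
                continue
            # Breadth-first search from an unvisited crack pixel.
            component, queue = [], deque([(r, c)])
            visited.add((r, c))
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and (ny, nx) not in visited):
                        visited.add((ny, nx))
                        queue.append((ny, nx))
            seed_sets.append(sorted(component))
    return seed_sets


crack_mask = [[1, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1]]
seeds_4 = connected_seed_sets(crack_mask, connectivity=4)
seeds_8 = connected_seed_sets(crack_mask, connectivity=8)
```

With 4-connectivity the diagonal pixels form separate seed sets, while 8-connectivity merges them into one, which may suit thin diagonal cracks better.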

[0089] For each initial seed point in the initial seed point set, a region growing operation is performed within the feature space corresponding to the calibrated fused feature map, based on the Euclidean distance between feature vectors and the spatial proximity between pixels. Aggregating pixels with similar features and spatial adjacency into independent regions means starting from a set of seed points and gradually expanding the region by merging adjacent pixels with similar features. The feature space refers to the multi-dimensional feature vector corresponding to each pixel in the fused feature map. Euclidean distance is used to quantify the similarity between feature vectors; the smaller the distance, the more similar the features. Spatial proximity ensures that only physically adjacent pixels are merged. Iterative region growing can aggregate pixels belonging to the same crack instance that have consistent features and are spatially continuous, forming a complete independent region.

[0090] During region growing, the boundary pixels of the current growing region are mapped onto the registered 3D point cloud data. If the absolute value of the Gaussian curvature of the point corresponding to the boundary pixel is greater than a preset curvature threshold, a geometric boundary is determined to exist in that direction, and the growing process stops in that direction. This means that 3D geometric information serves as the stopping condition for region growing. Gaussian curvature is a geometric quantity describing the degree of local bending of a surface. The larger its absolute value, the more severe the bending of the surface at that point, corresponding to an edge or structural abrupt change in the object. Mapping the boundary pixels in the 2D image onto the 3D point cloud data allows us to obtain its true 3D geometric information. When the absolute value of the Gaussian curvature of the point corresponding to the boundary pixel exceeds the preset curvature threshold, it indicates that a structural edge of the floor exists in that direction. At this point, growth should stop in that direction to avoid growing cracked areas into non-cracked areas or confusing them with other structures, thus ensuring the accuracy and independence of the segmentation. Gaussian curvature can be calculated by fitting a local surface, and the curvature threshold can be determined by statistical analysis of the curvature distribution of typical floor cracks and non-cracked areas.
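
The curvature-based stopping test can be illustrated with the following sketch. It relies on a simplifying assumption that is not stated in this application: the registered point cloud has been resampled onto a regular height grid z(x, y), so that the Gaussian curvature can be evaluated with the height-field formula K = (z_xx * z_yy - z_xy^2) / (1 + z_x^2 + z_y^2)^2 using central differences, instead of fitting a local surface. The function names are hypothetical.

```python
def gaussian_curvature(height, r, c, spacing=1.0):
    """Gaussian curvature at interior node (r, c) of a height grid z(x, y).

    Rows index y, columns index x; derivatives use central differences.
    """
    z, h = height, spacing
    zx = (z[r][c + 1] - z[r][c - 1]) / (2 * h)
    zy = (z[r + 1][c] - z[r - 1][c]) / (2 * h)
    zxx = (z[r][c + 1] - 2 * z[r][c] + z[r][c - 1]) / h ** 2
    zyy = (z[r + 1][c] - 2 * z[r][c] + z[r - 1][c]) / h ** 2
    zxy = (z[r + 1][c + 1] - z[r + 1][c - 1]
           - z[r - 1][c + 1] + z[r - 1][c - 1]) / (4 * h ** 2)
    return (zxx * zyy - zxy ** 2) / (1 + zx ** 2 + zy ** 2) ** 2


def growth_blocked(height, r, c, curvature_threshold, spacing=1.0):
    """True if |K| at the boundary pixel exceeds the preset threshold."""
    return abs(gaussian_curvature(height, r, c, spacing)) > curvature_threshold


# Synthetic 5x5 surfaces centred on node (2, 2), for illustration only.
plane = [[0.1 * c for c in range(5)] for r in range(5)]
saddle = [[float((c - 2) * (r - 2)) for c in range(5)] for r in range(5)]

stop_on_plane = growth_blocked(plane, 2, 2, curvature_threshold=0.5)
stop_on_saddle = growth_blocked(saddle, 2, 2, curvature_threshold=0.5)
```

The absolute value matters because a saddle-shaped surface has negative Gaussian curvature; growth continues across the tilted plane and stops at the saddle.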

[0091] Each independent region that eventually stops growing is a segmented crack instance region. This means that after the series of steps mentioned above, including confidence prediction, threshold segmentation, spatial clustering, and 3D geometric constraint region growing, each independent and complete set of pixels is finally identified as an independent crack instance region. These regions are consistent in both feature space and geometric space and have clear boundaries with other regions.

[0092] Through the above technical solutions, this application can accurately identify and segment independent crack instance regions, effectively solving the problems of insufficient accuracy and blurred boundaries in traditional segmentation methods. Based on the fusion feature map, crack confidence is predicted and a heatmap is generated, providing a quantitative and objective basis for crack segmentation and reducing errors caused by subjective judgment. By using an adaptive confidence threshold, crack pixels can be screened more accurately, reducing missed detections and false positives. Furthermore, spatial clustering of crack pixels ensures that region growth starts from practically meaningful connected regions, avoiding growth from isolated noise points. During region growth, based on the Euclidean distance of feature vectors and the spatial proximity relationship between pixels, pixels with similar features and spatial continuity can be effectively aggregated into independent crack instances. Using the Gaussian curvature in the 3D point cloud data as the stopping condition for region growth ensures that the segmentation boundary accurately corresponds to the real geometric structure of the floor, effectively avoiding over-segmentation or under-segmentation, and ensuring that each crack instance region has independence and clear boundaries. This further improves the overall accuracy and reliability of floor crack risk assessment.

[0093] In one embodiment of this application, for each initial seed point in the initial seed point set, a region growing operation is performed in the feature space corresponding to the calibrated fused feature map, based on the Euclidean distance of the feature vectors and the spatial proximity relationship between pixels, including: Calculate the statistical distribution of crack confidence for all pixels in the crack confidence heatmap, and determine the feature distance threshold for judging feature similarity based on the preset quantiles; During the region growing process, for each pixel to be examined on the current growing boundary, calculate the Euclidean distance between it and the average feature vector of all seed points in the current growing region; If the Euclidean distance is less than the feature distance threshold, and the pixel to be examined is adjacent to any pixel in the current growth region in the image space, then the pixel to be examined will be incorporated into the current growth region, and the growth boundary will be updated. The above process is repeated iteratively until there are no pixels on the boundary of the current growth region that meet the merging condition, at which point the growth process of the current growth region terminates.

[0094] In this embodiment, the statistical distribution of crack confidence scores for all pixels in the crack confidence heatmap is calculated to comprehensively understand the overall distribution of pixel confidence scores in the crack confidence heatmap. A histogram of crack confidence scores for all pixels in the crack confidence heatmap can be calculated to visualize its distribution.

[0095] The feature distance threshold for judging feature similarity is determined based on preset quantiles. This step uses the statistical distribution to set the criterion for pixel similarity dynamically. Specifically, based on the statistical distribution of the crack confidence heatmap, a relatively high quantile (e.g., the 90th percentile) can be selected as the initial feature distance threshold to ensure rapid expansion in the early stages of growth. This quantile-based threshold setting method allows the threshold to adapt to the characteristics and noise levels of different images, improving the accuracy and robustness of similarity judgment.

[0096] During region growing, for each pixel to be examined on the current growing boundary, the Euclidean distance between it and the average feature vector of all seed points within the current growing region is calculated. This step aims to accurately quantify the feature similarity between the pixel to be examined and the entire growing region. Specifically, in addition to using Euclidean distance, other distance metrics such as Mahalanobis distance can also be used to calculate the similarity between feature vectors. By comparing with the average feature vector of the region, rather than just with a single seed point or boundary point, the influence of local noise can be effectively reduced, enhancing the stability of similarity judgment.

[0097] If the Euclidean distance is less than the feature distance threshold, and the pixel to be examined is spatially adjacent to any pixel in the current growth region, then the pixel to be examined is merged into the current growth region, and the growth boundary is updated. Based on the two key conditions of feature similarity and spatial connectivity, the rationality of region growth is improved. Specifically, 4-neighborhood connectivity can be used to determine the spatial adjacency between pixels; this dual constraint mechanism effectively prevents the erroneous merging of feature-similar but discontinuous regions, ensuring the geometric integrity and boundary clarity of the segmentation result.

[0098] The above process is repeated iteratively until no pixels satisfying the merging condition exist on the boundary of the current growth region. The growth process of the current growth region then terminates. This step ensures that the region growth process is sufficient and appropriate. Specifically, a maximum number of iterations can be set to prevent the algorithm from looping infinitely in some extreme cases. This embodiment, through an iterative mechanism, allows the region to gradually expand until it reaches its natural boundary, thereby avoiding overgrowth or omissions and ensuring complete segmentation of the crack instance region.
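
The iterative growing procedure can be sketched as follows. The sketch makes several assumptions that go beyond the text: the feature map stores one feature vector per pixel in nested lists, the reference vector is the mean over all pixels currently in the region and is refreshed once per pass, adjacency is 4-neighbourhood, and `max_iterations` is the safeguard mentioned above. The function name is hypothetical.

```python
import math

NEIGHBORS_4 = ((-1, 0), (1, 0), (0, -1), (0, 1))


def grow_region(features, seeds, distance_threshold, max_iterations=1000):
    """Grow a region from seed pixels; returns a set of (row, col) tuples."""
    region = set(seeds)
    if not region:
        return region
    rows, cols = len(features), len(features[0])
    dim = len(features[0][0])
    # Running per-channel sums let the region mean be updated cheaply.
    sums = [sum(features[r][c][k] for r, c in region) for k in range(dim)]

    for _ in range(max_iterations):
        mean = [s / len(region) for s in sums]
        # Candidates: pixels outside the region that are 4-adjacent to it.
        candidates = {(r + dr, c + dc)
                      for r, c in region for dr, dc in NEIGHBORS_4
                      if 0 <= r + dr < rows and 0 <= c + dc < cols} - region
        accepted = [p for p in candidates
                    if math.dist(features[p[0]][p[1]], mean) < distance_threshold]
        if not accepted:
            break  # no boundary pixel satisfies the merging condition
        for r, c in accepted:
            region.add((r, c))
            for k in range(dim):
                sums[k] += features[r][c][k]
    return region


# One row of pixels with a single feature channel, for illustration only.
features = [[[1.0], [1.1], [1.2], [5.0], [1.0]]]
region = grow_region(features, seeds=[(0, 0)], distance_threshold=0.3)
```

In this example growth stops at the pixel with value 5.0, and the last pixel is not merged even though its feature matches the region, because it is not spatially connected to it.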

[0099] Through the above technical solution, this application calculates the statistical distribution of crack confidence scores for all pixels in the crack confidence heatmap and determines the feature distance threshold for judging feature similarity based on preset quantiles. This makes the feature similarity judgment adaptive, allowing the threshold to follow image characteristics, effectively reducing noise interference and boundary blurring, thereby improving the accuracy of crack identification. Simultaneously, during region growing, the robustness of similarity judgment is enhanced by calculating the Euclidean distance between the pixel to be examined and the average feature vector of all seed points within the current growing region, and by performing inclusion judgment based on the spatial proximity relationship between pixels. This avoids erroneous merging of irrelevant regions and ensures clear segmentation boundaries. Furthermore, the iterative growing process continues until no pixels satisfying the inclusion conditions exist on the boundary, ensuring the integrity and efficiency of region growing. This effectively solves the problems of inaccurate feature similarity judgment and low efficiency in traditional region growing methods, making the segmentation results of floor cracks more accurate and reliable.

[0100] In one embodiment of this application, a machine vision-based method for assessing the risk of floor cracks further includes: In the initial stage of region growth, a first feature distance threshold is used for rapid expansion; When the area of the current growth region reaches the preset stable area threshold, the growth is switched to the second feature distance threshold, where the second feature distance threshold is less than the first feature distance threshold. The first feature distance threshold and the second feature distance threshold are both calculated based on the variance of the crack confidence distribution of all pixels in the current growth region in the crack confidence heatmap.

[0101] In this embodiment, a first feature distance threshold is used for rapid expansion in the initial stage of region growth. The initial stage of region growth refers to the initial phase when the region growth algorithm begins execution and expands outward from the seed point, with the aim of quickly covering the core area of the crack. This stage can be defined as the number of growth iterations being less than a preset value (e.g., the first few iterations), or as the number of pixels or the area of the current growth region being less than a preset proportion (e.g., 0.1% of the total image area). The first feature distance threshold is a lenient standard used to determine the similarity of the pixel under investigation with the pixels in the current growth region. Its function is to allow for the rapid absorption of pixels with lower similarity requirements in the initial stage, accelerating region expansion. This threshold can be determined based on a higher quantile of the crack confidence distribution of all pixels in the crack confidence heatmap.

[0102] Once the area of the current growth region reaches a preset stable area threshold, the algorithm switches to a second feature distance threshold for growth. The area of the current growth region refers to the number of pixels or the actual physical area occupied by the aggregated pixel set during the region growth process, serving as the basis for determining the region growth stage switch. This area can be directly calculated by counting the number of pixels within the current growth region. The preset stable area threshold is a pre-set area value. When the area of the current growth region reaches or exceeds this value, it signifies that the region growth has entered a stable stage from the initial stage, serving as a trigger condition for stage switching. This threshold can be determined through experiments and statistical analysis on a large number of floor crack images. Switching to the second feature distance threshold for growth means that once the region growth reaches the stable area threshold, the standard used to judge pixel similarity is adjusted from the first feature distance threshold to the second feature distance threshold. The second feature distance threshold is smaller than the first feature distance threshold, indicating a stricter similarity judgment standard. This ensures that in the later stages of region growth, only pixels highly similar to the core region features are included, thereby improving segmentation accuracy and boundary accuracy. This second feature distance threshold can be set based on the lower quantile of the crack confidence distribution.

[0103] Furthermore, both the first and second feature distance thresholds are calculated based on the variance of the crack confidence distribution of all pixels in the current growth region within the crack confidence heatmap. The crack confidence heatmap is an image where the value of each pixel represents the probability or confidence that the pixel belongs to the crack category; it can be obtained by a deep learning model (for example, U-Net) performing pixel-level classification prediction on the fused feature map. The variance of the crack confidence distribution of all pixels in the current growth region is a statistical measure of the dispersion of crack confidence values among all pixels in the current growth region, representing the uniformity of pixel features within the region. A larger variance indicates greater differences in pixel features within the region; a smaller variance indicates more consistent pixel features within the region. This variance can be statistically calculated after each region growth iteration. Through this calculation method, the threshold can adapt to regional characteristics rather than remaining fixed. For example, a function can be designed with the crack confidence distribution variance as input and the corresponding feature distance threshold as output.
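
One possible form of such a function is sketched below. The text does not specify the mapping, so everything here is an assumption made for illustration: the inverse relation base / (1 + sensitivity * variance), the two base values, and the `sensitivity` constant are all hypothetical. The sketch only preserves the stated properties that the second threshold is smaller than the first and that both depend on the confidence variance of the current region.

```python
from statistics import pvariance


def feature_distance_threshold(confidences, area, stable_area,
                               loose_base=1.0, strict_base=0.5,
                               sensitivity=10.0):
    """Hypothetical two-stage threshold driven by confidence variance.

    confidences: crack confidence values of pixels in the current region.
    area: current region size in pixels.
    stable_area: preset stable area threshold that triggers the switch.
    """
    variance = pvariance(confidences) if confidences else 0.0
    # First (lenient) stage before the switch, second (strict) stage after.
    base = loose_base if area < stable_area else strict_base
    # A larger variance shrinks the threshold, i.e. makes merging stricter.
    return base / (1.0 + sensitivity * variance)


uniform_region = [0.9, 0.9, 0.9]
mixed_region = [0.5, 0.9]

early = feature_distance_threshold(uniform_region, area=3, stable_area=10)
late = feature_distance_threshold(uniform_region, area=20, stable_area=10)
noisy = feature_distance_threshold(mixed_region, area=2, stable_area=10)
```

Recomputing the threshold after each growth iteration lets it follow the region as its statistics change; any monotonically decreasing function of the variance would serve the same purpose.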

[0104] Through the above technical solution, this application effectively solves the problem of balancing efficiency and accuracy in the region growth process. In the initial stage of region growth, a relatively lenient first feature distance threshold is used for rapid expansion, which can quickly capture the core area of the crack, improving the growth efficiency in the initial stage and avoiding slow growth caused by an overly strict threshold. Once the growth region reaches a preset stable area threshold, a more stringent second feature distance threshold is dynamically switched to ensure that only pixels with highly consistent crack features are included after the region size expands. This effectively avoids overexpansion and noise interference, greatly improving the accuracy of crack segmentation and boundary precision. Furthermore, both the first and second feature distance thresholds are calculated based on the crack confidence distribution variance of all pixels in the current growth region, allowing the thresholds to be adaptively adjusted according to the uniformity of features within the region. When the differences in features within the region are large (large variance), a stricter threshold can be used to avoid misjudgment; when the features within the region are highly consistent (small variance), the threshold can be appropriately relaxed to promote growth. This adaptive dynamic threshold switching mechanism not only optimizes the stability and adaptability of the entire region growth process, but also improves the accuracy of crack instance region segmentation while maintaining high efficiency.

[0105] In one embodiment of this application, a risk index is calculated based on mechanism category, multi-dimensional feature set, and environmental data, and mapped to a risk level, including: Based on the mechanism category of cracks, weights are assigned to multiple feature indicators selected from the multi-dimensional feature set to obtain a weighted feature set; Based on the time-series information of temperature and humidity in environmental data, the weighted feature set is corrected to obtain the corrected feature set. The modified feature set and the crack activity index obtained based on infrared thermal image time series analysis are jointly input into the risk calculation model to calculate the risk index of the crack instance area. Based on the preset risk level classification rules, the risk index is mapped to the corresponding risk level.

[0106] In this embodiment, considering the differences in influencing factors and key characteristics among different crack mechanisms, weights are assigned to highlight the characteristics most relevant to the current mechanism, reducing interference from irrelevant characteristics and thus improving the targeting and accuracy of risk assessment. Specifically, an expert-based approach is used, where weights are manually assigned to characteristic indicators (crack width, depth, propagation rate, temperature sensitivity, etc.) under different mechanism categories based on the knowledge and experience of domain experts. For example, for thermal expansion and contraction cracks, the weights of characteristics related to temperature changes can be set higher; while for load stress cracks, the weights of crack width and propagation rate can be set higher.

[0107] Secondly, based on the time-series information of temperature and humidity in the environmental data, the weighted feature set is corrected to obtain the corrected feature set. Environmental temperature and humidity are important external factors affecting the behavior of floor cracks. By using their time-series information to correct the weighted feature set, the dynamic response of cracks under specific environmental conditions can be characterized, making the assessment results closer to reality and avoiding misjudgments caused by environmental changes. Specifically, relevant features in the weighted feature set (such as crack width and propagation rate) are adjusted, linearly or non-linearly, according to the magnitude of changes in environmental temperature and humidity. For example, when the temperature rises, the crack width caused by thermal expansion and contraction may increase; in this case, the crack width feature can be positively corrected.
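
The weighting and correction steps can be sketched as follows. The feature names and the two mechanism categories follow the examples in the text, but all numeric weights, the coefficients `temp_coeff` and `humidity_coeff`, and the linear form of the correction are hypothetical values chosen for illustration; in practice they would come from domain experts. Feature values are assumed to be normalized beforehand.

```python
# Hypothetical expert-assigned weights per mechanism category.
MECHANISM_WEIGHTS = {
    "thermal": {"width": 0.2, "depth": 0.1,
                "propagation_rate": 0.2, "temperature_sensitivity": 0.5},
    "load_stress": {"width": 0.4, "depth": 0.1,
                    "propagation_rate": 0.4, "temperature_sensitivity": 0.1},
}


def weight_features(features, mechanism):
    """Multiply each feature indicator by its mechanism-specific weight."""
    weights = MECHANISM_WEIGHTS[mechanism]
    return {name: weights[name] * value for name, value in features.items()}


def correct_for_environment(weighted, temperature_series, humidity_series,
                            temp_coeff=0.01, humidity_coeff=0.002,
                            affected=("width", "propagation_rate")):
    """Linear correction driven by the change over the observation window.

    A temperature or humidity rise scales the affected features up, which
    corresponds to a positive correction.
    """
    delta_t = temperature_series[-1] - temperature_series[0]
    delta_h = humidity_series[-1] - humidity_series[0]
    factor = 1.0 + temp_coeff * delta_t + humidity_coeff * delta_h
    return {name: value * factor if name in affected else value
            for name, value in weighted.items()}


features = {"width": 2.0, "depth": 10.0,
            "propagation_rate": 0.5, "temperature_sensitivity": 1.0}
weighted = weight_features(features, "thermal")
corrected = correct_for_environment(weighted,
                                    temperature_series=[20.0, 25.0, 30.0],
                                    humidity_series=[50.0, 50.0, 50.0])
```

A non-linear correction would replace only the computation of `factor`; the rest of the flow would stay the same.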

[0108] Next, the corrected feature set and the crack activity index obtained from infrared thermal image time-series analysis are input into the risk calculation model to calculate the risk index of the crack instance area. The risk calculation model is the core component for comprehensively assessing crack risk. By inputting the corrected multi-dimensional feature set (including morphological texture features, three-dimensional geometric features, and physical field features, and corrected by mechanism category and environmental data) and the crack activity index (which characterizes the dynamic change trend of cracks, such as distinguishing passive thermal response from active heat source features through infrared thermal image time-series analysis and correcting for temperature anomaly confidence), the potential hazards of cracks can be comprehensively and accurately quantified, generating a unified risk index. The specific risk calculation model is a multi-factor weighted summation model, which takes the corrected feature set and crack activity index as inputs and performs weighted summation through preset weights (which can be adjusted according to mechanism category) to obtain the risk index.

[0109] Finally, according to the preset risk level classification rules, the risk index is mapped to the corresponding risk level. One possible classification rule is fixed threshold classification, which sets a series of risk index thresholds based on industry standards, engineering specifications, or expert experience, and classifies cracks whose risk indices fall within different ranges into different risk levels (low risk, medium risk, high risk, and extremely high risk).
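
The weighted summation model and the fixed threshold classification can be sketched together as follows. The four level names follow the text; the threshold values, the weights, and the assumption that the risk index lies between 0 and 1 are hypothetical and stand in for values that would be set from standards or expert experience.

```python
from bisect import bisect_right

# Hypothetical thresholds; each is the inclusive lower bound of the next level.
RISK_THRESHOLDS = (0.25, 0.5, 0.75)
RISK_LEVELS = ("low risk", "medium risk", "high risk", "extremely high risk")


def risk_index(corrected_features, activity_index,
               feature_weights, activity_weight):
    """Multi-factor weighted sum over corrected features and activity index."""
    total = sum(feature_weights[name] * value
                for name, value in corrected_features.items())
    return total + activity_weight * activity_index


def risk_level(index):
    """Map a risk index to a level using the fixed thresholds."""
    return RISK_LEVELS[bisect_right(RISK_THRESHOLDS, index)]


index = risk_index({"width": 0.5, "depth": 0.5}, activity_index=1.0,
                   feature_weights={"width": 0.5, "depth": 0.25},
                   activity_weight=0.25)
level = risk_level(index)
```

Here the index evaluates to 0.625, which falls in the range assigned to high risk under the assumed thresholds.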

[0110] Through the above technical solutions, this application effectively addresses the problems of insufficient correction for the impact of environmental data, lack of specificity in feature weight allocation leading to inadequate risk assessment accuracy, and difficulty in adapting to the dynamic changes of different crack mechanisms. By assigning weights to feature indicators based on crack mechanism categories, the weight allocation for cracks of different mechanism categories becomes more targeted, thus avoiding unreasonable weight allocation. Simultaneously, the weighted feature set is corrected based on the time-series information of temperature and humidity in the environmental data, enabling the feature set to dynamically adapt to external conditions and improving the accuracy and robustness of the correction. Furthermore, the corrected feature set and crack activity indicators obtained from time-series analysis of infrared thermal images are jointly input into the risk calculation model, integrating the dual information of corrected features and activity indicators, providing more comprehensive and reliable input data, thereby enhancing the reliability of the risk index calculation. Finally, the risk index is mapped to the corresponding risk level according to the preset risk level classification rules. This application, through the risk calculation process, improves the accuracy, specificity, and robustness of floor crack risk assessment, making the assessment results closer to actual crack behavior.

[0111] Corresponding to the machine vision-based floor crack risk assessment method in the above embodiment, Figure 2 is a structural block diagram of a machine vision-based floor crack risk assessment system provided in one embodiment of this application. For ease of explanation, only the parts relevant to the embodiment of this application are shown. Referring to Figure 2, the machine vision-based floor crack risk assessment system 20 includes: a data acquisition module 21, an image generation module 22, an image fusion module 23, an image segmentation module 24, a feature extraction module 25, a feature classification module 26, and a risk identification module 27.

[0112] The data acquisition module 21 is used to acquire image data of the target ground area and perform temporal and spatial registration to obtain registered image data, the image data including a visible light image, an infrared thermal image, and three-dimensional point cloud data. The image generation module 22 is used to generate an infrared anomaly confidence map based on the registered infrared thermal image. The image fusion module 23 is used to fuse the infrared anomaly confidence map, the visible light image, and the three-dimensional point cloud data to generate a fused feature map. The image segmentation module 24 is used to identify and segment independent crack instance regions based on the fused feature map through image segmentation processing. The feature extraction module 25 is used to extract features from the registered image data based on each crack instance region to obtain a multi-dimensional feature set; the multi-dimensional feature set includes morphological texture features, three-dimensional geometric features, and physical field features. The feature classification module 26 is used to input the multi-dimensional feature set into a preset classification model to obtain the mechanism category of the crack corresponding to the crack instance region. The risk identification module 27 is used to calculate a risk index based on the mechanism category, the multi-dimensional feature set, and environmental data, and to map the risk index to a risk level.
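
As a non-limiting illustration, the data flow between modules 21 to 27 might be wired together as in the following sketch. The class name, the callable signatures, and the dictionary keys are hypothetical assumptions; the application defines only the responsibilities of the modules, not a programming interface.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class CrackRiskPipeline:
    """Hypothetical container; each field stands in for one module of system 20."""
    acquire: Callable[[], Dict[str, Any]]            # data acquisition module 21
    make_confidence_map: Callable[[Any], Any]        # image generation module 22
    fuse: Callable[[Any, Any, Any], Any]             # image fusion module 23
    segment: Callable[[Any], List[Any]]              # image segmentation module 24
    extract: Callable[[Any, Dict[str, Any]], dict]   # feature extraction module 25
    classify: Callable[[dict], str]                  # feature classification module 26
    assess: Callable[[str, dict, dict], str]         # risk identification module 27

    def run(self, environment: dict) -> List[dict]:
        # Registered image data with assumed keys "visible", "infrared", "point_cloud".
        data = self.acquire()
        confidence = self.make_confidence_map(data["infrared"])
        fused = self.fuse(confidence, data["visible"], data["point_cloud"])
        results = []
        # Each crack instance region is classified and assessed independently.
        for region in self.segment(fused):
            features = self.extract(region, data)
            mechanism = self.classify(features)
            level = self.assess(mechanism, features, environment)
            results.append({"region": region, "mechanism": mechanism, "risk_level": level})
        return results
```

Passing the modules in as callables keeps the sketch independent of any particular registration, segmentation, or classification algorithm.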

[0113] Figure 3 is a schematic block diagram of an electronic device provided according to an embodiment of this application. As shown in Figure 3, the electronic device 300 in this embodiment may include one or more processors 301, one or more input devices 302, one or more output devices 303, and one or more memories 304. The processors 301, input devices 302, output devices 303, and memories 304 communicate with each other via a communication bus 305. The memories 304 store computer programs, including program instructions. The processors 301 execute the program instructions stored in the memories 304. Specifically, the processors 301 are configured to invoke the program instructions to perform the functions of the modules in the aforementioned device embodiments, for example the functions of the data acquisition module 21, image generation module 22, image fusion module 23, image segmentation module 24, feature extraction module 25, feature classification module 26, and risk identification module 27 shown in Figure 2.

[0114] It should be understood that, in the embodiments of this application, the processor 301 may be a central processing unit (CPU), or it may be other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. The general-purpose processor may be a microprocessor or any conventional processor.

[0115] Input device 302 may include a touchpad, a fingerprint sensor (for collecting the user's fingerprint information and fingerprint orientation information), a microphone, etc., and output device 303 may include a display (LCD, etc.), a speaker, etc.

[0116] The memory 304 may include read-only memory and random access memory, and provides instructions and data to the processor 301. A portion of the memory 304 may also include non-volatile random access memory. For example, the memory 304 may also store device type information.

[0117] In specific implementations, the processor 301, input device 302, and output device 303 described in the embodiments of this application can execute the implementation methods described in any embodiment of the machine vision-based floor crack risk assessment method provided in the embodiments of this application, or they can execute the implementation methods of the electronic devices described in the embodiments of this application, which will not be repeated here.

[0118] Figure 4 is a schematic diagram of the overall equipment of a machine vision-based floor crack risk assessment system provided in one embodiment of this application. As shown in Figure 4, the physical architecture includes: a multi-sensor acquisition unit, a mobile inspection platform, a control and processing center, and a display and interaction terminal. The components are connected via wired or wireless means.

[0119] A multi-sensor acquisition unit is integrated into the front end of the mobile inspection platform to simultaneously acquire multimodal raw data of the floor surface. This unit includes at least: a visible light camera to acquire visible light images of the floor surface to obtain information on the morphology, texture, and color of cracks; an infrared thermal imager to capture infrared thermal images of the floor surface and identify potential crack activity areas through temperature distribution anomalies; and a 3D laser scanner to acquire 3D point cloud data of the floor surface, quantifying the depth, width, and 3D geometry of cracks. A controllable ring LED light source provides stable, uniform, and adjustable illumination for the visible light camera, enabling the multi-sensor acquisition unit to obtain high-quality images under various ambient light conditions.

[0120] The data synchronization controller, as the core control module within the unit, is responsible for sending hardware synchronization trigger signals to each sensor, ensuring strict temporal and spatial alignment between visible light images, infrared thermal images, and 3D point cloud data. The mobile inspection platform carries and moves the aforementioned multi-sensor acquisition unit. It can be designed as a manually operated trolley or an automated guided vehicle, enabling systematic, grid-based traversal data acquisition over large ground areas. The control and processing center, the system's computational and control brain, is typically integrated within the mobile platform or connected via wireless communication. It receives raw data from the acquisition unit and executes the following sequentially: Data reception and preprocessing: decoding, buffering, and formatting the multiple incoming data streams.

[0121] Edge computing device: Deploys all or part of the algorithm software in the aforementioned method embodiments of this application to perform image registration, feature fusion, crack segmentation, mechanism classification and risk calculation in real time.

[0122] Generate a risk assessment report: Output a structured report containing information such as crack location, mechanism type, and risk level.
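
As a non-limiting illustration, a structured report of this kind might be serialized as in the following sketch. The application states only that the report contains information such as crack location, mechanism type, and risk level; the field names, the coordinate convention, and the choice of JSON are hypothetical assumptions.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Tuple


@dataclass
class CrackRecord:
    """One crack instance in the report; all field names are hypothetical."""
    crack_id: str
    location_m: Tuple[float, float]  # assumed (x, y) position on the floor, in metres
    mechanism: str                   # mechanism category from the classification model
    risk_level: str                  # level mapped from the risk index


def build_report(records: List[CrackRecord]) -> str:
    """Serialize the crack records into a JSON document."""
    return json.dumps({"cracks": [asdict(r) for r in records]}, indent=2)
```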

[0123] Display and interactive terminals, such as industrial-grade tablets or ruggedized laptops, are used for human-machine interaction. They display captured images, intermediate processing results, and system alerts in real time, allowing operators to view and export the final risk assessment report.

[0124] Through the collaborative work of the aforementioned hardware systems, the proposed method for assessing the risk of floor cracks can be implemented on-site, realizing a complete process from automated data acquisition to intelligent risk analysis.

[0125] In another embodiment of this application, a computer-readable storage medium is provided. The computer-readable storage medium stores a computer program that includes program instructions. When executed by a processor, the program instructions implement all or part of the processes in the methods described above; these processes can also be completed by a computer program instructing related hardware. The computer program can be stored in a computer-readable storage medium, and when executed by a processor, it can implement the steps of the various method embodiments described above. The computer program includes computer program code, which can be in the form of source code, object code, executable files, or certain intermediate forms. The computer-readable medium can include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a portable hard drive, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like.

[0126] The computer-readable storage medium can be an internal storage unit of the electronic device in any of the foregoing embodiments, such as a hard disk or memory of the electronic device. The computer-readable storage medium can also be an external storage device of the electronic device, such as a plug-in hard disk, smart media card (SMC), secure digital card (SD), flash card, etc., equipped on the electronic device. Furthermore, the computer-readable storage medium can include both internal and external storage units of the electronic device. The computer-readable storage medium is used to store computer programs and other programs and data required by the electronic device. The computer-readable storage medium can also be used to temporarily store data that has been output or will be output.

[0127] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the various examples have been generally described in terms of functionality in the foregoing description. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this application.

[0128] Those skilled in the art will clearly understand that, for convenience and brevity, the specific working processes of the electronic devices and units described above may be understood by reference to the corresponding processes in the foregoing method embodiments, and will not be repeated here.

[0129] In the several embodiments provided in this application, it should be understood that the disclosed electronic devices and methods can be implemented in other ways. For example, the device embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces or units, or it may be an electrical, mechanical, or other form of connection.

[0130] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of the embodiments of this application, depending on actual needs.

[0131] Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.

[0132] The above are merely specific embodiments of this application, but the scope of protection of this application is not limited thereto. Any person skilled in the art can easily conceive of various equivalent modifications or substitutions within the technical scope disclosed in this application, and these modifications or substitutions should all be covered within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.

Claims

1. A machine vision-based method for assessing the risk of floor cracks, characterized in that it comprises: acquiring image data of a target ground area and performing temporal and spatial registration to obtain registered image data, the image data including a visible light image, an infrared thermal image, and three-dimensional point cloud data; generating an infrared anomaly confidence map based on the registered infrared thermal image; fusing the infrared anomaly confidence map, the visible light image, and the three-dimensional point cloud data to generate a fused feature map; identifying and segmenting independent crack instance regions through image segmentation processing based on the fused feature map; performing feature extraction from the registered image data based on each crack instance region to obtain a multi-dimensional feature set; inputting the multi-dimensional feature set into a preset classification model to obtain the mechanism category of the crack corresponding to the crack instance region; and calculating a risk index based on the mechanism category, the multi-dimensional feature set, and environmental data, and mapping the risk index to a risk level.

2. The method for assessing the risk of floor cracks based on machine vision according to claim 1, characterized in that generating the infrared anomaly confidence map based on the registered infrared thermal image includes: performing multi-scale spatial analysis on the registered infrared thermal image to extract temperature gradient maps at different scales, and identifying candidate abnormal thermal regions based on a temperature difference threshold set according to the overall temperature statistical distribution of the infrared thermal image; establishing a background temperature distribution reference model based on the statistical characteristics of the pixels outside the candidate abnormal thermal regions in the infrared thermal image; calculating the statistical deviation between the temperature values of the candidate abnormal thermal regions and the background temperature distribution reference model, and using it as the confidence level of the temperature anomaly; and generating the infrared anomaly confidence map based on the confidence level of the temperature anomaly and the temperature gradient maps.
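
As a non-limiting illustration of the thresholding, background modelling, and deviation steps of claim 2, a simplified single-scale sketch is given below. It deliberately omits the multi-scale temperature gradient maps. The Gaussian background model, the factor `k`, and the use of the error function to map the deviation into [0, 1] are hypothetical assumptions, not details disclosed in this application.

```python
import math
from statistics import mean, pstdev


def anomaly_confidence_map(thermal, k=2.0):
    """Return a per-pixel temperature anomaly confidence in [0, 1].

    thermal: 2D list of temperatures (registered infrared thermal image).
    k: assumed multiplier of the global standard deviation used as threshold.
    """
    pixels = [t for row in thermal for t in row]
    # Threshold derived from the overall temperature statistics of the image.
    threshold = mean(pixels) + k * pstdev(pixels)
    # Background reference model built from non-candidate pixels only.
    background = [t for t in pixels if t <= threshold]
    mu = mean(background)
    sigma = pstdev(background) or 1e-6  # guard against a perfectly flat background

    def confidence(t):
        if t <= threshold:
            return 0.0
        z = (t - mu) / sigma                  # statistical deviation from the background
        return math.erf(z / math.sqrt(2.0))   # assumed Gaussian mapping to [0, 1]

    return [[confidence(t) for t in row] for row in thermal]
```

For example, a uniform 20 °C image containing a single 30 °C pixel yields a confidence close to 1 at that pixel and 0 elsewhere.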

3. The method for assessing the risk of floor cracks based on machine vision according to claim 2, characterized in that the method further includes: using the infrared anomaly confidence map as a spatial guide, mapping the candidate abnormal thermal regions in the infrared anomaly confidence map onto the three-dimensional point cloud data to locate the corresponding three-dimensional candidate point cloud clusters; fitting a local surface to each three-dimensional candidate point cloud cluster to calculate the surface normal vector of each point within the cluster, and calculating the normal vector divergence based on the directional distribution of the surface normal vectors; calculating the distribution of the absolute value of the local Gaussian curvature of the three-dimensional candidate point cloud cluster, and obtaining the average absolute Gaussian curvature of the cluster; setting a curvature anomaly threshold based on the overall curvature distribution of the registered three-dimensional point cloud data; determining that the geometric consistency verification is passed if the confidence level of the temperature anomaly corresponding to the candidate abnormal thermal region is greater than a preset temperature anomaly confidence threshold, the normal vector divergence is greater than a preset normal vector divergence threshold, and the average absolute Gaussian curvature is greater than the curvature anomaly threshold; and, based on the result of the geometric consistency verification, weighting and fusing the confidence levels of the temperature anomaly of the corresponding regions in the infrared anomaly confidence map to generate the final infrared anomaly confidence map.
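
As a non-limiting illustration, the three-condition geometric consistency verification of claim 3 and the subsequent weighting might look like the following sketch. The definition of normal vector divergence (one minus the length of the mean unit normal), all threshold values, and the boost and damping factors are hypothetical assumptions; surface fitting and curvature estimation are assumed to have been done beforehand.

```python
import math


def normal_divergence(normals):
    """Assumed measure: 0 when all unit normals agree, approaching 1 as they scatter."""
    n = len(normals)
    mx = sum(v[0] for v in normals) / n
    my = sum(v[1] for v in normals) / n
    mz = sum(v[2] for v in normals) / n
    return 1.0 - math.sqrt(mx * mx + my * my + mz * mz)


def passes_geometric_check(conf, divergence, mean_abs_curvature,
                           conf_min=0.6, divergence_min=0.1, curvature_min=0.05):
    """All three quantities must exceed their thresholds (values here are placeholders)."""
    return (conf > conf_min
            and divergence > divergence_min
            and mean_abs_curvature > curvature_min)


def fuse_confidence(conf, passed, boost=1.2, damp=0.5):
    """Assumed weighting: raise verified confidences, suppress unverified ones."""
    return min(1.0, conf * boost) if passed else conf * damp
```

A thermally anomalous region lying on a flat, smooth patch of the point cloud fails the check in this sketch, so its confidence is reduced rather than discarded.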

4. The method for assessing the risk of floor cracks based on machine vision according to claim 2, characterized in that, before the infrared anomaly confidence map is generated, the method further includes: obtaining registered infrared thermal image sequences of the target ground area at multiple different time points to form a time series dataset; for the candidate abnormal thermal regions, extracting from the time series dataset the curves of their temperature over time as temperature evolution curves; performing time-series feature analysis on the temperature evolution curves to distinguish between the passive thermal response caused by the ambient temperature difference and the active heat source characteristics caused by crack activity, and obtaining the result of the time-series feature analysis; and correcting the confidence level of the temperature anomaly corresponding to the candidate abnormal thermal region based on the result of the time-series feature analysis.

5. The method for assessing the risk of floor cracks based on machine vision according to claim 4, characterized in that performing the time-series feature analysis on the temperature evolution curve to distinguish between the passive thermal response caused by the ambient temperature difference and the active heat source characteristics caused by crack activity includes: modeling the temperature evolution of the candidate abnormal thermal region as a state-space model, the state of the state-space model including a passive thermal component and an active heat source component; using the corresponding ambient temperature data sequence in the time series dataset as the observation input of a Kalman filter to recursively estimate the state-space model, thereby decoupling the passive thermal component sequence and the active heat source component sequence; calculating the power of the active heat source component sequence and analyzing its variation over time to obtain its rate of change; and, if the power of the active heat source component is greater than a preset power threshold and its rate of change is positive, determining that the candidate abnormal thermal region has active heat source characteristics caused by crack activity.
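
As a non-limiting illustration of the decoupling and decision steps of claim 5, the following sketch uses a simplified scalar Kalman filter rather than a full two-state model. It assumes, hypothetically, that the region temperature equals the ambient temperature (passive part) plus an active component following a random walk plus noise, so the ambient sequence is treated as a known input. The noise variances `q` and `r`, the power threshold, and the half-window comparison used as the rate of change are also assumptions.

```python
from statistics import mean


def estimate_active_component(region_temps, ambient_temps, q=0.01, r=0.25):
    """Scalar Kalman filter tracking the active heat source component a_t.

    Assumed model: y_t = u_t + a_t + noise, with u_t the ambient temperature
    and a_t a random walk (process variance q, measurement variance r).
    """
    a, p = 0.0, 1.0  # state estimate and its variance
    estimates = []
    for y, u in zip(region_temps, ambient_temps):
        p += q                      # predict: random-walk variance growth
        k = p / (p + r)             # Kalman gain
        a += k * ((y - u) - a)      # update with the residual after removing the passive part
        p *= 1.0 - k
        estimates.append(a)
    return estimates


def has_active_heat_source(active, power_threshold=0.5):
    """Power above threshold in the later half and increasing versus the earlier half."""
    if len(active) < 2:
        return False
    half = len(active) // 2
    early = mean([x * x for x in active[:half]])
    late = mean([x * x for x in active[half:]])
    return late > power_threshold and (late - early) > 0
```

A region whose temperature simply follows the ambient temperature yields an active component of zero and is therefore not flagged in this sketch.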

6. The method for assessing the risk of floor cracks based on machine vision according to claim 5, characterized in that correcting the confidence level of the temperature anomaly corresponding to the candidate abnormal thermal region based on the result of the time-series feature analysis includes: if the candidate abnormal thermal region is determined to have active heat source characteristics, increasing the confidence level of the temperature anomaly according to the power of the active heat source component; and, if the candidate abnormal thermal region is determined to be a passive thermal response, suppressing the confidence level of the temperature anomaly based on the correlation between its temperature change and the ambient temperature.
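
As a non-limiting illustration, the two correction branches of claim 6 might be written as follows. The linear boost with factor `gain`, the use of the Pearson correlation coefficient as the correlation measure, and the suppression rule are hypothetical assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def corrected_confidence(conf, is_active, active_power,
                         region_temps, ambient_temps, gain=0.1):
    """Boost confidence for active sources; suppress it for ambient-driven responses."""
    if is_active:
        # Assumed rule: confidence grows with the active component power, capped at 1.
        return min(1.0, conf * (1.0 + gain * active_power))
    # Assumed rule: the more the region tracks ambient temperature, the lower the confidence.
    rho = min(1.0, max(0.0, pearson(region_temps, ambient_temps)))
    return conf * (1.0 - rho)
```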

7. The method for assessing the risk of floor cracks based on machine vision according to claim 1, characterized in that fusing the infrared anomaly confidence map, the visible light image, and the three-dimensional point cloud data to generate a fused feature map includes: extracting edge and texture feature maps from the registered visible light image, gradient feature maps from the infrared anomaly confidence map, and height and surface normal vector feature maps from the registered three-dimensional point cloud data; concatenating the edge and texture feature maps, the gradient feature maps, and the height and surface normal vector feature maps along the channel dimension and inputting the result into a shallow convolutional network for preliminary fusion to obtain a first fused feature map; inputting the visible light image, the infrared anomaly confidence map, and a two-dimensional depth map generated by projecting the three-dimensional point cloud data respectively into an encoder network with shared parameters to extract three sets of intermediate semantic features; interacting and fusing the three sets of intermediate semantic features through a cross-attention module to generate an intermediate fused feature map; performing preliminary binary segmentation of the crack region on the first fused feature map to obtain a first initial segmentation mask; performing preliminary binary segmentation of the crack region on the intermediate fused feature map to obtain a second initial segmentation mask; performing a logical AND operation on the first initial segmentation mask and the second initial segmentation mask to obtain a crack core region mask; and, using the crack core region mask as a spatial attention weight, performing feature enhancement and background suppression on the intermediate fused feature map to generate the final fused feature map.

8. The method for assessing the risk of floor cracks based on machine vision according to claim 7, characterized in that performing the logical AND operation on the first initial segmentation mask and the second initial segmentation mask to obtain the crack core region mask includes: performing a logical AND operation on the first initial segmentation mask and the second initial segmentation mask to obtain a first core region mask; using the first core region mask as the seed region, performing region growing on the intermediate fused feature map to obtain a grown region mask; and performing morphological optimization on the grown region mask, including removing isolated regions with an area smaller than a preset area threshold and filling holes, to obtain an optimized crack core region mask.
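
As a non-limiting illustration, the mask intersection, region growing, and morphological optimization steps of claim 8 might be sketched on small 2D lists as follows. The growing criterion (a fixed minimum response in a single-channel feature map), 4-connectivity, and all function and parameter names are hypothetical assumptions.

```python
from collections import deque

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # assumed 4-connectivity


def logical_and(mask_a, mask_b):
    """Keep only pixels marked as crack in both initial segmentation masks."""
    return [[bool(a and b) for a, b in zip(ra, rb)] for ra, rb in zip(mask_a, mask_b)]


def flood(start_cells, allowed, rows, cols):
    """Breadth-first expansion from start_cells over neighbours where allowed(r, c) holds."""
    seen = set(start_cells)
    queue = deque(start_cells)
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBORS:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in seen and allowed(nr, nc):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return seen


def grow_region(seed_mask, feature_map, min_response=0.5):
    """Grow the seed region into neighbours whose feature response is high enough."""
    rows, cols = len(seed_mask), len(seed_mask[0])
    seeds = [(r, c) for r in range(rows) for c in range(cols) if seed_mask[r][c]]
    grown = flood(seeds, lambda r, c: feature_map[r][c] >= min_response, rows, cols)
    return [[(r, c) in grown for c in range(cols)] for r in range(rows)]


def remove_small_regions(mask, min_area):
    """Drop connected components smaller than min_area pixels."""
    rows, cols = len(mask), len(mask[0])
    keep, visited = set(), set()
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and (r, c) not in visited:
                comp = flood([(r, c)], lambda y, x: bool(mask[y][x]), rows, cols)
                visited |= comp
                if len(comp) >= min_area:
                    keep |= comp
    return [[(r, c) in keep for c in range(cols)] for r in range(rows)]


def fill_holes(mask):
    """Fill background pixels that are not connected to the image border."""
    rows, cols = len(mask), len(mask[0])
    border = [(r, c) for r in range(rows) for c in range(cols)
              if (r in (0, rows - 1) or c in (0, cols - 1)) and not mask[r][c]]
    outside = flood(border, lambda y, x: not mask[y][x], rows, cols)
    return [[bool(mask[r][c]) or (r, c) not in outside for c in range(cols)]
            for r in range(rows)]
```

Applying `logical_and`, `grow_region`, `remove_small_regions`, and `fill_holes` in that order follows the sequence of steps recited in the claim.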

9. The method for assessing the risk of floor cracks based on machine vision according to claim 1, characterized in that calculating the risk index based on the mechanism category, the multi-dimensional feature set, and environmental data, and mapping the risk index to a risk level includes: assigning weights, based on the mechanism category of the crack, to multiple feature indicators selected from the multi-dimensional feature set to obtain a weighted feature set; correcting the weighted feature set based on the time-series information of temperature and humidity in the environmental data to obtain a corrected feature set; inputting the corrected feature set and a crack activity index obtained from time-series analysis of the infrared thermal image into a risk calculation model to calculate the risk index of the crack instance region; and mapping the risk index to the corresponding risk level according to preset risk level classification rules.

10. A machine vision-based floor crack risk assessment system, characterized in that it comprises: a data acquisition module, used to acquire image data of a target ground area and perform temporal and spatial registration to obtain registered image data, the image data including a visible light image, an infrared thermal image, and three-dimensional point cloud data; an image generation module, used to generate an infrared anomaly confidence map based on the registered infrared thermal image; an image fusion module, used to fuse the infrared anomaly confidence map, the visible light image, and the three-dimensional point cloud data to generate a fused feature map; an image segmentation module, used to identify and segment independent crack instance regions based on the fused feature map through image segmentation processing; a feature extraction module, used to extract features from the registered image data based on each crack instance region to obtain a multi-dimensional feature set; a feature classification module, used to input the multi-dimensional feature set into a preset classification model to obtain the mechanism category of the crack corresponding to the crack instance region; and a risk identification module, used to calculate a risk index based on the mechanism category, the multi-dimensional feature set, and environmental data, and to map the risk index to a risk level.