Bridge crack cause reasoning method and system based on knowledge graph

By using a knowledge graph-based method for reasoning about the causes of bridge cracks, and combining the virtual-real registration and graph calculation of UAV images and digital twin models, the problem of insufficient depth in tracing the source of defects in bridge crack detection has been solved, enabling three-dimensional quantitative assessment of cracks and scientific support for operation and maintenance decisions.

CN122244724A · Pending · Publication Date: 2026-06-19 · INTELLIGENT HUA TRANSPORTATION TECHNOLOGY (JIANGSU) CO LTD


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: INTELLIGENT HUA TRANSPORTATION TECHNOLOGY (JIANGSU) CO LTD
Filing Date: 2026-02-11
Publication Date: 2026-06-19


IPC: G06V20/17; G06V10/40; G06V10/26; G06V10/54; G06V10/75; G06V10/74; G06V10/10; G06N5/025; G06N5/04

AI Tagging

Application Domain

Character and pattern recognition; Inference methods


AI Technical Summary

Technical Problem

Existing bridge crack detection technologies do not trace defects to their source in sufficient depth and do not correlate multi-source geometric data. Diagnostic results are therefore isolated, the threat they pose to structural safety is difficult to quantify, and the results lack interpretability.

Method used

A knowledge graph-based approach is adopted, which uses virtual-real spatial semantic registration between UAV imagery and bridge digital twin model, combined with dual-path image analysis model to extract two-dimensional morphology and three-dimensional geometric features of cracks, to construct a bridge damage knowledge graph. A graph topology traversal algorithm is used to generate the topological path of crack formation and output the judgment result.

Benefits of technology

It enables the mapping from two-dimensional images to three-dimensional structures, quantitatively assesses the potential hazards of cracks, provides assessment data with quantitative indicators of structural safety, and improves the scientificity and credibility of operation and maintenance decisions.



Abstract

This invention relates to the field of image detection technology, specifically to a method and system for reasoning about the causes of bridge cracks based on a knowledge graph. The method includes acquiring UAV imagery and positioning attitude data of the target area of the bridge; retrieving the bridge's digital twin model; spatially aligning the UAV imagery to generate a bridge projection dataset; extracting the two-dimensional morphological features and three-dimensional geometric features of the cracks using a dual-path image analysis model; reading the bridge ontology database to generate spatial location indices and morphological category vectors, which are instantiated as structural component nodes and damage appearance nodes, respectively, to construct a bridge damage knowledge graph; generating a topological path for crack causes through graph path traversal and calculating mapping weight values; and outputting a judgment result when the confidence threshold is exceeded. This invention constructs an image detection system from front-end data acquisition to back-end judgment result output, enabling defect identification reports to simultaneously represent the geometric depth information and spatial semantic information of cracks in a structured form.


Description

Technical Field

[0001] This invention relates to the field of image detection technology, specifically to a method and system for reasoning about the causes of bridge cracks based on knowledge graphs.

Background Technology

[0002] In the operation and maintenance of large-scale transportation infrastructure (such as cross-river bridges), automated detection and causal analysis of structural surface defects is an important technological application area. Current technologies generally employ drones equipped with high-resolution imaging devices, combined with digital image processing algorithms and texture gradient analysis techniques for detection. The basic logic is to calculate the differences in grayscale distribution and texture discontinuities in the structural surface image. When the pixel gradient change rate or feature response intensity reaches a specific threshold, edge contour segmentation and crack location are automatically performed within the acquired two-dimensional projected image frame. This is currently the mainstream technical method for comprehensive surveying of bridge appearance defects.

[0003] However, existing technical solutions have significant shortcomings in the depth of defect tracing and in the correlation of multi-source geometric data. Specifically, the two-dimensional image analysis used to characterize defect status is confined to a single spatial dimension and lacks geometric topological information. Existing detection data is mostly recorded in discrete single-view pixel coordinate systems, and its planar position information cannot be dynamically mapped to three dimensions as the macroscopic structure deforms. This makes it difficult to establish a spatial vector relationship between crack morphology and the structural root causes that induce the cracks, which leads to misdirected defect treatment strategies. Furthermore, static calculations based only on pixel features at the moment of image acquisition are insufficient to reveal the three-dimensional structural causes of defects. In addition, diagnostic results are output in isolation, so the actual threat that visible defects pose to structural safety is difficult to quantify, and the final operation and maintenance data lacks interpretability.

[0004] To address this, a method and system for reasoning about the causes of bridge cracks based on knowledge graphs are proposed.

Summary of the Invention

[0005] The purpose of this invention is to provide a method and system for reasoning about the causes of bridge cracks based on knowledge graphs, so as to solve the problems mentioned in the background art.

[0006] To achieve the above objectives, the present invention provides the following technical solution: a method for reasoning about the causes of bridge cracks based on knowledge graphs, comprising: collecting UAV imagery and positioning attitude data of the bridge target area, retrieving the bridge digital twin model corresponding to the positioning attitude data, and constructing a virtual-real spatial semantic registration model in combination with the UAV imagery; spatially aligning the UAV imagery using the virtual-real spatial semantic registration model to generate a bridge projection dataset; inputting the bridge projection dataset into the dual-path image analysis model, driving the texture segmentation branch and the disparity perception branch to perform feature extraction and disparity transformation operations, extracting the two-dimensional morphological features and three-dimensional geometric features of the cracks, and outputting the defect semantic feature tensor through weighted concatenation; reading the bridge ontology library, parsing the defect semantic feature tensor to generate a spatial location index and a morphological category vector, instantiating them as structural component nodes and damage appearance nodes respectively, establishing semantic association edges, and constructing a bridge damage knowledge graph; performing graph path traversal on the damage appearance nodes in the bridge damage knowledge graph using a graph topology traversal algorithm to generate crack cause topology paths; and calculating the mapping weight value of each crack cause topology path and outputting the judgment result when it exceeds the confidence threshold.

[0007] Preferably, the specific generation process of the bridge projection dataset includes: calculating the initial extrinsic parameters of the UAV camera using positioning attitude data to establish an initial mapping matrix; extracting visual texture feature points from the UAV image and simultaneously retrieving the geometric structural projection points of the bridge digital twin model from the corresponding viewpoint, generating a set of corresponding feature points through feature matching operations; substituting the set of corresponding feature points into the bundle adjustment objective function, performing iterative convergence calculation on the initial mapping matrix, calculating the pose transformation parameters, and constructing a virtual-real space semantic registration model; driving the virtual-real space semantic registration model, reading the structural component attribute labels carried by the bridge digital twin model, inversely projecting the structural component attribute labels onto the two-dimensional pixel coordinate system of the UAV image, and outputting the bridge projection dataset through pixel-level semantic alignment operations.

[0008] Preferably, the specific process of iteratively converging the initial mapping matrix includes: extracting structural edge lines from the UAV image stream and retrieving the three-dimensional structural contour lines in the bridge digital twin model; constructing a reprojection distance constraint term and calculating the normal distance between the three-dimensional structural contour lines projected onto the two-dimensional pixel plane and the structural edge lines; simultaneously parsing the pixel-level semantic segmentation probability map of the UAV image, constructing a semantic consistency penalty term, and calculating the overlap loss between the entity projection region of the bridge digital twin model and the non-bridge category in the semantic segmentation probability map; introducing the reprojection distance constraint term and the semantic consistency penalty term into the pose optimization solver, jointly performing nonlinear least squares optimization with the feature point reprojection error, and outputting pose transformation parameters.

[0009] Preferably, the specific generation process of the defect semantic feature tensor includes: a dual-path image analysis model comprising a texture segmentation branch and a disparity perception branch; decomposing the bridge projection dataset into a set of reference keyframes and a set of adjacent multi-viewpoint sequence frames; inputting the set of reference keyframes into the texture segmentation branch, performing multi-scale dilated convolution operations to capture long-distance dependencies between pixels and extracting two-dimensional morphological features of the cracks; inputting the set of adjacent multi-viewpoint sequence frames into the disparity perception branch, constructing a cost volume reflecting photometric consistency under different depth assumptions based on a planar scanning algorithm, performing probability regression operations on the cost volume using a regularized convolutional network to generate three-dimensional geometric features of the cracks; and concatenating the two-dimensional morphological features of the cracks with the three-dimensional geometric features of the cracks through feature concatenation to output the defect semantic feature tensor.
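
The probability regression step of the disparity perception branch can be illustrated for a single pixel. The following is a minimal sketch, assuming that a lower cost means better photometric consistency and that the regression is a softmax over depth hypotheses followed by an expected-depth computation; the function name `regress_depth` and the sample values are hypothetical and do not come from the patent text.

```python
import math

def regress_depth(costs, depths):
    """Turn per-hypothesis matching costs into a depth estimate for one pixel.

    Assumption: lower cost = better photometric consistency, so the
    probability of each depth hypothesis is softmax(-cost).
    """
    lowest = min(costs)
    # Subtract the lowest cost before exponentiating for numerical stability.
    weights = [math.exp(-(c - lowest)) for c in costs]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Expected depth under the probability distribution (soft argmin).
    depth = sum(p * d for p, d in zip(probs, depths))
    return depth, probs

# Three depth hypotheses (metres) with the best match at 6 m.
depth, probs = regress_depth([5.0, 0.0, 5.0], [5.0, 6.0, 7.0])
print(round(depth, 3), [round(p, 3) for p in probs])
```

In a full pipeline the costs would come from the regularized cost volume for every pixel; here they are supplied by hand to show only the regression.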

[0010] Preferably, the specific construction process of the bridge damage knowledge graph includes: using a multi-task feature decoder to perform channel separation and dimensionality reduction parsing on the defect semantic feature tensor, decoupling and outputting spatial location index and morphological category vector; reading the entity mapping protocol defined in the bridge ontology library, using the spatial location index to retrieve the component unique identifier and local stress attributes of the corresponding grid unit in the bridge digital twin model, generating structural component nodes; using the morphological category vector to invert the physical values of crack orientation angle and crack width, generating damage appearance nodes; calling the structural mechanics topological constraint rules in the bridge ontology library, calculating the mechanical logic fit between the local stress attributes of the structural component nodes and the orientation angle of the damage appearance nodes, generating semantic association edges containing causation probability weights; connecting the semantic association edges to the structural component nodes and the damage appearance nodes, outputting the bridge damage knowledge graph.
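
The node and edge construction described above can be sketched with plain data classes. This is an illustrative sketch only: the class names, the `tensile_dir_deg` stress attribute, and the `mechanical_fit` rule (cracks tend to open perpendicular to the principal tensile stress direction, so the fit is taken as |sin| of the angle difference) are assumptions introduced here, not the rules stored in the bridge ontology library.

```python
import math
from dataclasses import dataclass

@dataclass
class ComponentNode:
    component_id: str
    tensile_dir_deg: float  # hypothetical local stress attribute

@dataclass
class DamageNode:
    crack_id: str
    orientation_deg: float  # crack orientation angle
    width_mm: float         # crack width

def mechanical_fit(component, damage):
    """Hypothetical fit score in [0, 1]: 1 when the crack runs perpendicular
    to the principal tensile direction, 0 when it runs parallel to it."""
    delta = math.radians(damage.orientation_deg - component.tensile_dir_deg)
    return abs(math.sin(delta))

def build_graph(pairs):
    """Build a graph with one weighted edge per (component, damage) pair."""
    graph = {"nodes": {}, "edges": []}
    for component, damage in pairs:
        graph["nodes"][component.component_id] = component
        graph["nodes"][damage.crack_id] = damage
        weight = mechanical_fit(component, damage)  # causation probability weight
        graph["edges"].append((damage.crack_id, component.component_id, weight))
    return graph

graph = build_graph([(ComponentNode("girder-web-01", 0.0),
                      DamageNode("crack-001", 90.0, 0.3))])
print(graph["edges"])
```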

[0011] Preferably, the specific generation process of the crack formation topology path includes: encoding the attribute features of the damage appearance nodes based on the bridge damage knowledge graph to generate an initial latent state embedding vector; collecting the stress attribute feature vectors of adjacent structural component nodes with the damage appearance nodes as aggregation centers using a graph topology traversal algorithm; calculating feature interaction weights using a self-attention mechanism, and updating the initial latent state embedding vector using the feature interaction weights; and traversing and searching the connected subgraphs based on the updated latent state embedding vectors, extracting the connected subgraph sequence as the crack formation topology path.
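
The attention-based update of a damage appearance node can be sketched as follows. This is a simplified illustration, assuming scaled dot-product attention without learned projection matrices, node vectors of equal dimension, and a residual update; none of these details are specified in the text, and the subsequent subgraph search is not shown.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_update(damage_vec, neighbor_vecs):
    """Update a damage-node embedding from its adjacent component nodes."""
    scale = math.sqrt(len(damage_vec))
    scores = [sum(q * k for q, k in zip(damage_vec, n)) / scale
              for n in neighbor_vecs]
    weights = softmax(scores)  # feature interaction weights
    aggregated = [sum(w * n[i] for w, n in zip(weights, neighbor_vecs))
                  for i in range(len(damage_vec))]
    # Assumed residual update: keep the original state and add the aggregate.
    updated = [h + a for h, a in zip(damage_vec, aggregated)]
    return updated, weights

updated, weights = attention_update([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(updated, weights)
```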

[0012] Preferably, the specific process for generating the determination result includes: mapping the crack cause topology path to a preset fault mode vector space, performing a similarity measurement operation with a standard fault template vector to generate a mapping weight value; performing a numerical comparison between the mapping weight value and a preset confidence threshold, and extracting the corresponding fault category label and cause description text when the threshold trigger condition is met; performing semantic encapsulation on the fault category label and cause description text, and outputting the determination result.
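
The similarity measurement and threshold comparison can be sketched as below. The text does not name the similarity measure, so cosine similarity is used here as an assumption; the template vectors, labels, description strings, and the 0.8 threshold are all hypothetical placeholders.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def judge(path_vec, templates, threshold=0.8):
    """Return the best-matching fault label if its weight exceeds the threshold."""
    best = None
    for label, (template_vec, description) in templates.items():
        weight = cosine_similarity(path_vec, template_vec)  # mapping weight value
        if best is None or weight > best["weight"]:
            best = {"label": label, "cause": description, "weight": weight}
    if best is not None and best["weight"] > threshold:
        return best
    return None  # confidence threshold not met, no judgment is output

# Hypothetical fault templates in a three-dimensional fault mode vector space.
templates = {
    "flexural": ([1.0, 0.0, 0.0], "bending-induced tensile cracking"),
    "shear": ([0.0, 1.0, 0.0], "diagonal shear cracking"),
}
print(judge([0.9, 0.1, 0.0], templates))
print(judge([1.0, 1.0, 1.0], templates))
```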

[0013] A knowledge graph-based system for reasoning about the causes of bridge cracks includes: a spatial semantic registration module, which collects UAV images and positioning attitude data of the bridge target area, retrieves the bridge digital twin model corresponding to the positioning attitude data, constructs a virtual-real spatial semantic registration model in combination with the UAV images, and generates a bridge projection dataset through spatial alignment; a geometry perception module, which inputs the bridge projection dataset into the dual-path image analysis model, drives the texture segmentation branch and the disparity perception branch to perform feature extraction and disparity transformation calculations, extracts the two-dimensional morphological features and three-dimensional geometric features of the cracks, and outputs the defect semantic feature tensor through weighted stitching; a knowledge graph construction module, which reads the bridge ontology library, parses the defect semantic feature tensor to generate a spatial location index and a morphological category vector, instantiates them as structural component nodes and damage appearance nodes respectively, and establishes semantic association edges to construct a bridge damage knowledge graph; and a cause determination module, which performs graph path traversal on the damage appearance nodes in the bridge damage knowledge graph using a graph topology traversal algorithm to generate crack cause topology paths, calculates the mapping weight value of each crack cause topology path, and outputs the determination result when the weight value exceeds the confidence threshold.

[0014] Compared with the prior art, the beneficial effects of the present invention are as follows: 1. By collecting the positioning and attitude data of unmanned aerial vehicles (UAVs) and constructing a virtual-real spatial semantic registration model, pixel-level crack features in the 2D image stream are mapped to specific structural units in the 3D digital twin model, giving isolated visual crack data clear 3D spatial coordinates and structural component semantics. Based on this deep virtual-real alignment, the pre-set component stress attribute weights in the digital twin model can be directly invoked to perform structural-level quantitative calculations of the potential hazards of cracks. This shift from pure visual recognition to structural semantic binding enables the operation and maintenance system to directly assess the actual threat level to the overall safety of the bridge based on the importance of the components to which the crack is attached, thereby outputting assessment data with quantitative indicators of structural safety.

[0015] 2. A dual-path image analysis model incorporating a disparity perception branch extracts the two-dimensional texture of cracks and, by analyzing disparity variations across multi-viewpoint frame sequences, simultaneously captures their three-dimensional geometric features. Weighted stitching then produces a defect semantic feature tensor containing depth information. Because a physical geometric dimension verification mechanism is introduced during the feature extraction stage, the resulting feature data includes not only surface morphological information but also three-dimensional spatial information reflecting crack depth, which validates the crack's physical properties. This multimodal data fusion mechanism provides high-confidence input with geometric consistency verification for subsequent causal calculations, improving the reliability of crack physical property identification and avoiding evaluation bias caused by a single data dimension.

[0016] 3. By parsing the semantic feature tensor of defects and constructing a bridge damage attribution knowledge graph, abstract data features are instantiated into structural component nodes and damage appearance nodes with physical meaning, and semantic associations are established based on mechanical ontology rules. Through path traversal of the graph neural network computation model, explicit crack cause topological paths are generated, clearly demonstrating the logical transmission relationship between "crack morphology—structural location—stress mode". This graph-based computational mechanism directly outputs judgment results containing clear causes, connecting originally discrete detection conclusions into a diagnostic report with causal logic. This ensures that each output maintenance suggestion has traceable logical support, providing a transparent and reliable basis for scientific maintenance decisions.

[0017] 4. By deeply coupling three core dimensions (virtual-real spatial semantic registration, dual-stream geometric perception, and knowledge graph computation), a closed-loop diagnostic system covering the entire chain from front-end data collection to back-end decision-making is constructed. With the digital twin model as a hub connecting the physical and digital worlds, discrete UAV visual data is transformed into structured information with geometric depth and spatial semantics. With the knowledge graph as a logical engine, that structured information is elevated to diagnostic knowledge with causal relationships. The system thus moves from discovering single visible defects to a comprehensive understanding of structural safety, outputting not only the existence of cracks but also their spatial location sensitivity, the authenticity of their physical geometry, and the logical basis of their mechanical causes. This provides a comprehensive structural health assessment report with both quantitative data support and a complete logical explanation, enhancing the scientific rigor of operational decisions.

Attached Figure Description

[0018] Figure 1 is a flowchart of the knowledge graph-based method for reasoning about the causes of bridge cracks proposed in an embodiment of this invention; Figure 2 is a flowchart of the geometric deformation and manifold space construction process proposed in an embodiment of this invention; Figure 3 is a flowchart of the neural texture representation and image enhancement generation process proposed in an embodiment of this invention; Figure 4 is a structural diagram of the knowledge graph-based system for reasoning about the causes of bridge cracks proposed in an embodiment of this invention.

Detailed Implementation

[0019] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0020] Referring to Figures 1-3, the present invention provides a method for reasoning about the causes of bridge cracks based on knowledge graphs, the specific steps of which are as follows: collecting UAV imagery and positioning attitude data of the bridge target area, retrieving the bridge digital twin model corresponding to the positioning attitude data, and constructing a virtual-real spatial semantic registration model in combination with the UAV imagery; spatially aligning the UAV imagery using the virtual-real spatial semantic registration model to generate a bridge projection dataset; inputting the bridge projection dataset into the dual-path image analysis model, driving the texture segmentation branch and the disparity perception branch to perform feature extraction and disparity transformation operations, extracting the two-dimensional morphological features and three-dimensional geometric features of the cracks, and outputting the defect semantic feature tensor through weighted stitching; reading the bridge ontology library, parsing the defect semantic feature tensor to generate a spatial location index and a morphological category vector, instantiating them as structural component nodes and damage appearance nodes respectively, establishing semantic association edges, and constructing a bridge damage knowledge graph; traversing the damage appearance nodes in the bridge damage knowledge graph using the graph neural network computation model to generate crack cause topological paths; and calculating the mapping weight value of each crack cause topological path and outputting the judgment result when it exceeds the confidence threshold.

[0021] The technical solution of the present invention will be further described in detail below with reference to specific embodiments.

Example 1

[0022] This application discloses a method for reasoning about the causes of bridge cracks based on knowledge graphs. Referring to Figure 1, the specific steps proposed in this invention include:
S1. Acquiring UAV images and positioning attitude data of the bridge target area, retrieving the bridge digital twin model corresponding to the positioning attitude data, and constructing a virtual-real spatial semantic registration model in combination with the UAV images; spatially aligning the UAV images through the virtual-real spatial semantic registration model to generate a bridge projection dataset;
S2. Inputting the bridge projection dataset into a dual-path image analysis model, driving the texture segmentation branch and the disparity perception branch to perform feature extraction and disparity transformation operations, extracting the two-dimensional morphological features and three-dimensional geometric features of cracks, and outputting a defect semantic feature tensor through weighted stitching;
S3. Reading the bridge ontology library, parsing the defect semantic feature tensor, generating a spatial location index and a morphological category vector, instantiating them as structural component nodes and damage appearance nodes respectively, and establishing semantic association edges to construct a bridge damage knowledge graph;
S4. Performing graph path traversal on the damage appearance nodes in the bridge damage knowledge graph through a graph neural network computing model to generate crack cause topological paths; calculating the mapping weight value of each crack cause topological path, and outputting the judgment result when it exceeds the confidence threshold.

[0023] Further, step S1 above collects UAV imagery and positioning attitude data of the bridge target area, retrieves the bridge digital twin model corresponding to the positioning attitude data, constructs a virtual-real spatial semantic registration model in combination with the UAV imagery, and spatially aligns the UAV imagery using that model to generate a bridge projection dataset. Referring to Figure 2, the specific implementation process includes: the initial extrinsic parameters of the UAV camera are calculated from the positioning attitude data to establish an initial mapping matrix. Visual texture feature points are extracted from the UAV images, and the geometric structure projection points of the bridge digital twin model at the corresponding viewpoint are retrieved. A set of corresponding feature points is generated through feature matching. The set of corresponding feature points is substituted into the bundle adjustment objective function to iteratively converge the initial mapping matrix, solve the pose transformation parameters, and construct the virtual-real spatial semantic registration model. The virtual-real spatial semantic registration model is then driven to read the structural component attribute labels carried by the bridge digital twin model and to inversely project them onto the two-dimensional pixel coordinate system of the UAV images. The bridge projection dataset is output through pixel-level semantic alignment operations.

[0024] Specifically, the set of corresponding feature points is generated as follows: this embodiment first uses a high-precision unmanned aerial vehicle (UAV) platform to perform automated data acquisition on the bridge target area, obtaining UAV image streams containing rich texture information together with centimeter-level positioning attitude data. This process relies on an industrial-grade hexacopter flight platform equipped with a full-frame aerial survey camera. The flight altitude is set so that the ground sampling distance meets the requirements for identifying minute defects; for example, the acquisition distance is set to 5 to 8 meters from the structural surface to obtain a spatial resolution better than 1 millimeter per pixel. Subsequently, a digital twin model of the bridge matching the spatial range of the current positioning attitude data is retrieved. This model is built to the LOD400 standard and includes geometric structure and component attributes.

[0025] Using the 3D coordinates of the camera center recorded by the UAV's onboard real-time kinematic (RTK) differential positioning system, combined with the Euler angles recorded by the gimbal, the initial extrinsic parameters of the camera in the world coordinate system are calculated, establishing an initial mapping matrix from the world coordinate system to the camera coordinate system. This matrix, constructed using the standard ZYX rotation order, is a 4x4 homogeneous transformation matrix containing rotation and translation parameters. In the feature extraction stage, a semantic segmentation network based on the DeepLabV3+ architecture is used to analyze the UAV imagery. This network uses ResNet-101 as the backbone to extract high-level semantics, captures multi-scale context through an atrous spatial pyramid pooling (ASPP) module, generates pixel-level semantic masks, and removes background areas such as sky, water, and vegetation. Scale-invariant feature transform (SIFT) feature points are extracted only within the structural mask region. Simultaneously, the geometric mesh of the bridge digital twin model is projected onto the virtual image plane using the initial pose, and the geometric structure projection points within the visible range are extracted. Through descriptor matching, a set of corresponding feature points is established between the visual texture feature points of the UAV imagery and the geometric structure projection points of the digital twin model.
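
The initial mapping matrix can be sketched as follows. This is a minimal illustration, assuming the gimbal Euler angles (in radians) describe the camera-to-world rotation composed as Rz(yaw) · Ry(pitch) · Rx(roll); the axis and sign conventions of a real gimbal, and any camera-to-gimbal mounting offset, are not given in the text and would need to be checked. The function names are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def initial_world_to_camera(center, yaw, pitch, roll):
    """Build the 4x4 world-to-camera matrix from the camera centre (world
    coordinates) and ZYX Euler angles of the camera-to-world rotation."""
    cz, sz = math.cos(yaw), math.sin(yaw)
    cy, sy = math.cos(pitch), math.sin(pitch)
    cx, sx = math.cos(roll), math.sin(roll)
    rz = [[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]]
    ry = [[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]]
    rx = [[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]]
    r_cw = matmul(matmul(rz, ry), rx)  # camera-to-world rotation
    # The inverse of a rotation is its transpose (world-to-camera rotation).
    r_wc = [[r_cw[j][i] for j in range(3)] for i in range(3)]
    # Translation t = -R_wc * C, so the camera centre maps to the origin.
    t = [-sum(r_wc[i][k] * center[k] for k in range(3)) for i in range(3)]
    return [r_wc[i] + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def transform(matrix, point):
    """Apply a 4x4 homogeneous transform to a 3D point."""
    h = list(point) + [1.0]
    return [sum(matrix[i][k] * h[k] for k in range(4)) for i in range(3)]

T = initial_world_to_camera([10.0, 20.0, 30.0], 0.3, -0.2, 0.1)
print(transform(T, [10.0, 20.0, 30.0]))  # the camera centre maps to (about) the origin
```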

[0026] In this embodiment, after the set of corresponding feature points is generated, a local topological constraint verification mechanism is applied to address mismatches caused by the highly repetitive concrete texture of the bridge surface: the angular deviation between the vectors formed by each feature point and its neighboring points is computed, mismatched points are eliminated, and the filtered set of corresponding feature points is output. Specifically, for each matching pair (a point in the UAV image and its counterpart in the digital twin projection), an eight-neighborhood is constructed centered on that point. The set of vectors from the center point to each of its neighboring points on the image plane is calculated, as is the set of vectors from the corresponding projection point to its neighboring projection points. The angle difference matrix between the two vector sets is then calculated, and the mean of the angle differences between all corresponding vectors is taken as the average angular deviation. A topological consistency threshold of 15 degrees is set; the choice of this threshold is positively correlated with the surface geometric curvature of the bridge component being measured. In this embodiment, for near-planar areas such as the web and bottom plate of the box girder, whose projection distortion is relatively small at small field-of-view angles, a tolerance of 15 degrees is selected to strictly eliminate mismatches caused by texture repetition. When detecting curved components such as bridge piers, however, the threshold can be increased according to the component's designed radius of curvature (e.g., to 25 degrees) to accommodate the deviations in perspective projection angle caused by the continuously changing surface normal. If the average angular deviation of a feature point exceeds the threshold (15 degrees in this embodiment), the local geometry surrounding that feature point is judged to have undergone non-rigid distortion or incorrect mapping. The point is marked as an outlier and removed from the set, so that only feature points satisfying the topological constraint are retained for the subsequent bundle adjustment calculation. This improves the geometric purity of the input data, suppresses mismatch noise caused by repeated textures, and reduces reprojection errors.
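
The topological consistency filter can be sketched as follows. This is an illustrative sketch, assuming the "eight-neighborhood" is taken as the eight nearest other matched points in the image plane and that angles are unsigned differences in degrees; the function names are hypothetical. Note that with a mean-based statistic a gross mismatch also raises the deviation of points that count it among their neighbors, so a practical implementation might iterate the filter.

```python
import math

def angle_between(u, v):
    """Unsigned angle in degrees between two 2D vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    cross = u[0] * v[1] - u[1] * v[0]
    return abs(math.degrees(math.atan2(cross, dot)))

def filter_matches(img_pts, proj_pts, k=8, max_dev_deg=15.0):
    """Return indices of matches whose average angular deviation over the
    k nearest neighbouring matches stays within the threshold."""
    keep = []
    n = len(img_pts)
    for i in range(n):
        neighbours = sorted((j for j in range(n) if j != i),
                            key=lambda j: math.dist(img_pts[i], img_pts[j]))[:k]
        deviations = []
        for j in neighbours:
            u = (img_pts[j][0] - img_pts[i][0], img_pts[j][1] - img_pts[i][1])
            v = (proj_pts[j][0] - proj_pts[i][0], proj_pts[j][1] - proj_pts[i][1])
            deviations.append(angle_between(u, v))
        if deviations and sum(deviations) / len(deviations) <= max_dev_deg:
            keep.append(i)
    return keep

# A 4x4 grid of image points and a consistently shifted projection.
img = [(x * 10, y * 10) for y in range(4) for x in range(4)]
proj = [(x + 3, y - 2) for x, y in img]
print(filter_matches(img, proj))  # every match is topologically consistent
```

For curved components, `max_dev_deg` could be raised (the text gives 25 degrees as an example).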

[0027] Extract structural edge lines from the UAV image stream and retrieve the 3D structural contour lines in the bridge digital twin model; construct a reprojection distance constraint term and calculate the normal distance between the 3D structural contour lines projected onto the 2D pixel plane and the structural edge lines; simultaneously analyze the pixel-level semantic segmentation probability map of the UAV image, construct a semantic consistency penalty term, and calculate the overlap loss between the entity projection region of the bridge digital twin model and the non-bridge category in the semantic segmentation probability map; introduce the reprojection distance constraint term and the semantic consistency penalty term into the pose optimization solver, and perform nonlinear least squares optimization in conjunction with the feature point reprojection error to output the pose transformation parameters.

[0028] Specifically, the process of constructing the virtual-real space semantic registration model and optimizing the pose is as follows: The core of this step is to construct the bundle adjustment objective function, which consists of three parts: a feature point reprojection error term, a reprojection distance constraint term, and a semantic consistency penalty term. For the reprojection distance constraint term, an edge detection algorithm is used to extract significant structural edge lines from the UAV imagery. Simultaneously, the corresponding 3D structural contour lines are retrieved from the digital twin model. The normal distance from the sampling points on the 3D contour lines, after pose transformation and projection onto the 2D plane, to the image structural edge lines is calculated. The sum of the squares of these distances constitutes the line feature constraint. For the semantic consistency penalty term, the percentage of overlapping pixels between the "bridge entity" category projection region in the digital twin model and the "non-bridge" category region (such as the background) in the semantic segmentation probability map of the UAV imagery is calculated as the overlap loss. The feature point reprojection error is obtained by calculating the Euclidean distance between the coordinates of the 3D feature points in the bridge digital twin model projected onto the 2D image plane under the current pose parameters and the coordinates of the corresponding feature points in the UAV imagery. The aforementioned constraints are introduced into the Levenberg-Marquardt optimization solver, and a gradient magnitude normalization strategy is used to set the weight coefficients: the feature point reprojection error is weighted at 1.0, the reprojection distance constraint at 0.5, and the semantic consistency penalty at 10.0 (to balance the dimensional differences). 
Nonlinear least squares optimization is then performed jointly, continuously correcting the camera's rotation matrix and translation vector during iteration. The algorithm stops when the error falls below 10^-6 or the maximum number of iterations (100) is reached, and the optimal pose transformation parameters are output.
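
The three-part objective described in this paragraph can be written compactly as below. This is a sketch only. The symbols are notation assumed here, not taken from the application: \pi is the camera projection, X_i and x_i are matched 3D and 2D feature points, P_j are sampled contour points, \ell_j is the matched image edge line, d_\perp is the point-to-line normal distance, and L_sem is the overlap loss. Squaring the feature point term follows the usual least-squares convention and is also an assumption.

```latex
E(R, t) = w_p \sum_i \left\lVert \pi(R X_i + t) - x_i \right\rVert^2
        + w_l \sum_j d_\perp\!\left( \pi(R P_j + t),\, \ell_j \right)^2
        + w_s \, L_{\mathrm{sem}}(R, t),
\qquad w_p = 1.0,\quad w_l = 0.5,\quad w_s = 10.0
```

The large weight on the semantic term reflects that the overlap loss is a ratio between 0 and 1, whereas the other two terms are measured in squared pixels.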

[0029] By utilizing the geometric constraints of the structural edge lines and the three-dimensional contour lines, as well as the global consistency constraints of the semantic segmentation probability map, a multimodal joint optimization objective function is constructed. This dual constraint mechanism effectively eliminates mismatched points and enforces consistency between the UAV viewpoint and the twin model viewpoint. As a result, cracks are mapped more accurately onto the digital twin, which more realistically reflects the actual threat that structural defects pose to specific stress-bearing parts and improves the credibility of quantitative assessment.

[0030] After obtaining high-precision pose transformation parameters, the virtual-real spatial semantic registration model is driven to perform inverse projection operations, reading the rich structural component attribute labels stored in the bridge digital twin model, including unique component codes, component types (such as box girder web, bottom plate, and piers), material properties, and design stress states. Using the optimized pose parameters, a ray casting algorithm is employed to accurately map these 3D attribute labels back to the 2D pixel coordinate system of the UAV imagery. Each pixel in the image is indexed to determine the corresponding 3D component affiliation, thereby generating a semantic label map and component attribute map that are strictly aligned with the original RGB imagery. The final output bridge projection dataset not only contains the original high-resolution visual imagery but also carries spatial location information and structural attribute information pixel by pixel. For example, the crack pixel with coordinates (x, y) in the image is directly associated with the specific "3rd span, 2nd segment web" component.

[0031] By combining texture feature points and geometric projection points for joint bundle adjustment, pixel-level alignment between UAV imagery and digital twin models is effectively promoted. By inversely projecting the structural component attribute labels carried in the digital twin model onto the 2D imagery, automated semantic information transfer is achieved, eliminating subjective errors. Furthermore, a strong correlation is established between crack pixels and specific bridge components, laying a solid data foundation for solving the problem of isolated diagnostic results and improving the accuracy of node attributes during subsequent knowledge graph construction.

[0032] Furthermore, the bridge projection dataset is input into a dual-path image analysis model. Feature extraction and disparity transformation operations are performed by driving the texture segmentation branch and the disparity perception branch to extract the two-dimensional morphological features and three-dimensional geometric features of the cracks. These are then weighted and stitched together to output a defect semantic feature tensor; this corresponds to step S2 above. The specific implementation process includes: The dual-path image analysis model includes a texture segmentation branch and a disparity perception branch. The bridge projection dataset is deconstructed into a set of reference keyframes and a set of adjacent multi-viewpoint sequence frames. The set of reference keyframes is input into the texture segmentation branch, where multi-scale dilated convolution operations are performed to capture long-distance dependencies between pixels and extract the two-dimensional morphological features of the cracks. The set of adjacent multi-viewpoint sequence frames is input into the disparity perception branch, where a cost volume reflecting photometric consistency under different depth assumptions is constructed based on a planar scanning algorithm. A regularized convolutional network is used to perform probabilistic regression on the cost volume to generate the three-dimensional geometric features of the cracks. The two-dimensional morphological features of the cracks and the three-dimensional geometric features of the cracks are concatenated through feature concatenation to output a defect semantic feature tensor.

[0033] This embodiment uses the bridge projection dataset as input data for a dual-path image analysis model. This model employs a unique dual-branch architecture designed to simultaneously capture the surface texture details and deep spatial geometric information of cracks. First, the input bridge projection dataset is deconstructed along the temporal dimension, dividing it into a reference keyframe set and a set of adjacent multi-viewpoint sequence frames. The reference keyframe set is fed into the texture segmentation branch, while the adjacent multi-viewpoint sequence frame set is fed into the disparity perception branch. The two branches operate in parallel; the texture segmentation branch extracts high-dimensional features through deep convolution, while the disparity perception branch calculates depth information using multi-viewpoint geometric principles.

[0034] Specifically, the texture segmentation branch and feature extraction process are as follows: This branch is based on an improved DeepLabV3+ network architecture, using ResNet-101 as the backbone feature extraction network. The input data consists of pre-processed 512x512 pixel reference keyframe image patches. To address the issue of elongated and sparsely distributed crack morphology, the network introduces a dilated spatial pyramid pooling module. This module uses dilated convolutional layers with dilation rates of 6, 12, and 18 in parallel, expanding the receptive field without reducing feature map resolution, thus enabling the capture of long-range dependencies between pixels. After layer-by-layer processing by the encoder and decoder, the output is a two-dimensional crack morphology feature containing 256 feature channels.
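
As a small worked example of why dilation enlarges the receptive field without shrinking the feature map, the sketch below computes the span covered by a dilated kernel for the three rates named above. The 3x3 kernel size, the padding choice (padding equal to the dilation rate), and the 32-pixel feature map size are assumptions made for illustration; the text does not state them.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

def output_size(input_size, kernel_size, dilation, padding, stride=1):
    """Standard convolution output-size formula with dilation."""
    span = effective_kernel_size(kernel_size, dilation)
    return (input_size + 2 * padding - span) // stride + 1

FEATURE_MAP_SIZE = 32  # hypothetical encoder output size, not given in the text

for rate in (6, 12, 18):
    span = effective_kernel_size(3, rate)
    size = output_size(FEATURE_MAP_SIZE, 3, rate, padding=rate)
    print(f"rate={rate}: span={span} pixels, output size={size}")
```

Under these assumptions the three branches cover 13, 25, and 37 pixels respectively while the output stays at 32 pixels per side.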

[0035] Specifically, the disparity perception branch and the generation process of the three-dimensional geometric features are as follows: This branch uses the set of adjacent multi-viewpoint sequence frames to construct a cost volume based on a planar scanning algorithm. It sets a depth assumption range and matches left-view features with right-view features at different disparity levels, constructing a 4D cost volume whose four dimensions are the maximum disparity, one-quarter of the height, one-quarter of the width, and twice the number of channels. Subsequently, a regularization network composed of 3D convolutional layers aggregates the cost volume and learns contextual consistency. Finally, a probabilistic regression operation is performed on the cost volume using the differentiable soft argmax function to calculate the spatial depth value of each pixel relative to the camera. This depth value reflects the microscopic geometric unevenness of the crack area relative to the surrounding smooth concrete surface, thus generating a three-dimensional geometric feature of the crack that can distinguish real physical cracks from planar two-dimensional stains (such as oil stains and watermarks).
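
The soft argmax regression mentioned above can be illustrated for a single pixel as follows. This is a minimal sketch under one stated assumption: a lower cost means a better match, so costs are negated before the softmax. The function names and the sample cost vector are hypothetical.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of scores."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def soft_argmax_disparity(costs):
    """Expected disparity: sum over d of d * softmax(-cost)[d]."""
    probs = softmax([-c for c in costs])
    return sum(d * p for d, p in enumerate(probs))

# Matching costs for one pixel over seven disparity levels, lowest at level 3.
costs = [9.0, 4.0, 1.0, 0.0, 1.0, 4.0, 9.0]
disparity = soft_argmax_disparity(costs)
print(round(disparity, 6))
```

Because the result is a probability-weighted average rather than a hard index, it is differentiable and can take sub-level values, which is what allows fine depth differences at a crack to be regressed.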

[0036] In this embodiment, after constructing the cost volume, an information entropy confidence filtering layer is introduced to calculate the entropy of the cost volume's probability distribution along the disparity dimension. Anisotropic diffusion smoothing is performed on weak texture regions with high entropy values to suppress noise at depth discontinuities. Since the bridge surface has large areas of weak texture, direct regression easily produces step-like artifacts, so entropy masking is performed on the constructed four-dimensional cost volume. First, the cost volume is processed by a normalized exponential function (softmax) along the disparity dimension to obtain the probability distribution. Using the Shannon entropy principle, the disparity information entropy of each pixel is calculated by multiplying the probability value of each disparity level by its base-2 logarithm, summing the products, and negating the sum. The entropy threshold is set to 1.8 bits. When the entropy value of a pixel is greater than 1.8 bits, the depth judgment of that point is ambiguous (exhibiting multi-peak or uniform distribution characteristics), and the anisotropic diffusion operator is activated. Centered on that pixel, weighted average smoothing is performed using the cost vectors of low-entropy pixels (i.e., high-confidence pixels) within its five-by-five neighborhood, with the reciprocal of the Euclidean distance as the weight, and the result replaces the cost vector of that point. The entropy threshold is set based on 25% of the maximum theoretical information entropy of the disparity search space (i.e., one-quarter of the base-2 logarithm of the preset maximum disparity level; in this embodiment, the maximum disparity level is set to 192). 
When the lighting conditions of the detection scene are poor or the concrete surface produces specular reflection that degrades texture features, the probability distribution of the cost volume tends to flatten, and the calculated entropy value will approach the maximum theoretical entropy. At this time, by setting this relative proportion threshold, low-confidence areas that cause depth calculation divergence due to insufficient ambient light can be adaptively filtered out without manually resetting fixed values for different ambient brightness. This process instantly corrects holes and noise in the cost volume, enabling the subsequent regularization network to focus on effective geometric edges, eliminating depth estimation artifacts in weak texture areas, and improving the measurement accuracy of crack depth.
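
The entropy test in this paragraph can be sketched per pixel as below. The function names and the two sample distributions are assumptions for illustration. Evaluating the stated rule (one-quarter of the base-2 logarithm of 192) gives roughly 1.9 bits, while the embodiment quotes 1.8 bits, so the sketch keeps 1.8 as the default and exposes the formula separately.

```python
import math

def disparity_entropy_bits(probs):
    """Shannon entropy in bits: negated sum of p * log2(p)."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def entropy_threshold_bits(max_disparity, ratio=0.25):
    """Relative threshold: a fraction of the maximum possible entropy."""
    return ratio * math.log2(max_disparity)

def needs_smoothing(probs, threshold_bits=1.8):
    """True when the depth estimate at this pixel is considered ambiguous."""
    return disparity_entropy_bits(probs) > threshold_bits

MAX_DISPARITY = 192
peaked = [0.97, 0.01, 0.01, 0.01] + [0.0] * (MAX_DISPARITY - 4)
uniform = [1.0 / MAX_DISPARITY] * MAX_DISPARITY

print(round(entropy_threshold_bits(MAX_DISPARITY), 3))  # about 1.9 bits from the formula
print(needs_smoothing(peaked), needs_smoothing(uniform))
```

A sharply peaked distribution stays well below the threshold and is left untouched, whereas a flat distribution (for example under poor lighting) reaches the maximum entropy and is flagged for neighbourhood smoothing.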

[0037] Specifically, the feature fusion, model training, and loss function are as follows: The texture segmentation branch outputs a two-dimensional morphological feature map defined by its height, width, and number of segmentation feature channels (e.g., 128 channels), while the disparity perception branch outputs a three-dimensional geometric feature map with the corresponding height, width, and number of depth feature channels (e.g., 64 channels). Before concatenation, bilinear interpolation is used to resize the two sets of feature maps to the same reference spatial resolution. The two feature maps are then concatenated along the channel dimension into an intermediate joint feature tensor. Finally, this intermediate joint feature tensor is input into a 1x1 convolutional layer. Using the learnable weight parameters obtained from training this convolutional layer, the feature information of different channels is linearly weighted and combined, and the number of channels is simultaneously compressed to a preset dimension (e.g., 256 dimensions), thereby outputting a fused defect semantic feature tensor. During the model training phase, a dedicated training dataset containing input and ground-truth pairs is constructed. The ground truth labels for crack segmentation are manually annotated fine binary masks, while the depth ground truth labels for the crack's three-dimensional geometric features are obtained automatically by projecting a high-precision 3D mesh of the bridge digital twin model onto the image plane to generate a depth map. During this process, a depth buffer algorithm in the graphics rendering pipeline automatically handles the occlusion relationships between 3D components, retaining only the depth values of the nearest visible foreground surface as valid ground truth values, thus ensuring a rigorous pixel-level correspondence between the generated depth map and the 2D image. An end-to-end joint optimization strategy is employed. 
The loss function consists of two parts: a focal loss for the segmentation task, designed to address the severe imbalance between crack and background pixel ratios, with the focusing parameter set to 2 and the class balance factor set to 0.25; and a smooth L1 loss for the disparity regression task, used to constrain the accuracy of depth prediction. The overall optimization objective function is defined as: the total loss equals the focal loss plus the product of the smooth L1 loss and a loss balance factor, where the loss balance factor is initially set to 1.0 and linearly decays to 0.1 with increasing training iterations. The optimizer is AdamW, with an initial learning rate of 10^-4 that is adjusted using a cosine annealing strategy. To accelerate model convergence and improve generalization ability, a pre-training plus fine-tuning strategy is adopted: first, the backbone networks of the texture segmentation branch and the disparity perception branch are pre-trained on a publicly available general scene dataset (such as Cityscapes) to initialize the weight parameters; then, using the bridge-specific training dataset constructed above, the bottom feature extraction layers are frozen, and only the high-level semantic head and the regression head are fine-tuned, with all of their parameters updated, to adapt to the specific texture distribution of the bridge surface.
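
The training objective can be illustrated per pixel as follows. This sketch rests on explicit assumptions: the two losses are combined additively, with the decaying balance factor weighting the depth term; the focal loss follows the common binary form with the 0.25 factor applied to the crack class; and the smooth L1 transition point `beta=1.0` is a conventional default not stated in the text. All function names are hypothetical.

```python
import math

def focal_loss(p, target, gamma=2.0, alpha=0.25):
    """Binary focal loss for one pixel; p is the predicted crack probability."""
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def smooth_l1(pred, target, beta=1.0):
    """Quadratic near zero, linear for larger errors."""
    diff = abs(pred - target)
    return 0.5 * diff * diff / beta if diff < beta else diff - 0.5 * beta

def balance_factor(step, total_steps, start=1.0, end=0.1):
    """Linear decay of the depth-loss weight from start to end."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * frac

def total_loss(p, seg_target, depth_pred, depth_target, step, total_steps):
    """Assumed combination: focal loss plus weighted smooth L1 loss."""
    weight = balance_factor(step, total_steps)
    return focal_loss(p, seg_target) + weight * smooth_l1(depth_pred, depth_target)

early = total_loss(0.9, 1, 2.5, 2.0, step=0, total_steps=100)
late = total_loss(0.9, 1, 2.5, 2.0, step=100, total_steps=100)
print(early > late)
```

With the same predictions, the depth term contributes ten times less at the end of training than at the start, which shifts the emphasis toward segmentation as training progresses.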

[0038] The texture segmentation branch uses multi-scale dilated convolution to capture long-range dependencies and extract the two-dimensional planar morphology of cracks. Meanwhile, the disparity perception branch recovers the depth and three-dimensional geometric information of cracks from multi-view sequences through planar scanning and cost volume regression. This design elevates crack recognition from planar images to three dimensions. Compared to relying solely on texture judgment, the three-dimensional geometric features incorporating depth information can more accurately reflect the crack's opening depth and volume. Furthermore, the weighted and concatenated output tensor provides high-dimensional features containing rich physical attributes for the subsequent knowledge graph, improving the quantitative accuracy and scientific rigor of causal reasoning.

[0039] Further, the bridge ontology database is read, the defect semantic feature tensor is parsed, and spatial location indexes and morphological category vectors are generated. These are instantiated as structural component nodes and damage appearance nodes, respectively, and semantic association edges are established to construct a bridge damage knowledge graph; corresponding to step S3 above; the specific implementation process includes: A multi-task feature decoder is used to perform channel separation and dimensionality reduction parsing on the semantic feature tensor of defects, decoupling the output spatial location index and morphological category vector. The entity mapping protocol defined in the bridge ontology library is read, and the spatial location index is used to retrieve the unique component identifier and local stress attributes of the corresponding grid unit in the bridge digital twin model, generating structural component nodes. The morphological category vector is used to invert the physical values of the crack's orientation angle and crack width, generating damage appearance nodes. The structural mechanics topological constraint rules in the bridge ontology library are called to calculate the mechanical logical fit between the local stress attributes of the structural component nodes and the orientation angle of the damage appearance nodes, generating semantic association edges containing causation probability weights. These semantic association edges are connected to the structural component nodes and the damage appearance nodes, outputting a bridge damage knowledge graph.

[0040] This embodiment, after obtaining the defect semantic feature tensor, reads a pre-built bridge ontology library. This ontology library is stored in a resource description framework format and hosted in a graph database. The entity types defined in the ontology library include not only structural components and damage appearances, but also building materials (such as concrete strength grade), environmental loads (such as wind speed and temperature field), and maintenance records. Furthermore, it pre-defines the attribute fields of structural component entities and damage appearance entities and their mapping relationships with the grid cells of the digital twin model. A key-value pair correspondence method from the defect semantic feature tensor to knowledge graph node attributes is specified through an entity mapping protocol. Specifically, the spatial location index is used to index the corresponding grid cell in the pixel-level semantic label graph and obtain the component's unique identifier and local stress attributes to instantiate it as a structural component node. The morphological category vector is used to indicate the crack's morphological category and, combined with the depth estimation results, inverts the crack's direction angle and crack width physical values to instantiate it as a damage appearance node. Subsequently, a multi-task feature decoder is used to perform deep analysis on the input defect semantic feature tensor, decoupling it into two sets of key data: the spatial location index and the morphological category vector. Based on the mapping protocol defined in the ontology library, these two sets of data are instantiated as "structural component nodes" and "damage appearance nodes" in the knowledge graph, respectively.

[0041] Specifically, the multi-task feature decoder and tensor decoupling process are as follows: The multi-task feature decoder is designed with a dual-head output architecture. The input is a 256-dimensional defect semantic feature tensor generated in the preceding steps. The first branch is the location regression head, consisting of two fully connected layers. It outputs a normalized spatial location index through a sigmoid activation function, corresponding to the original image coordinate system. The second branch is the attribute classification head, which outputs a morphological category vector through a softmax function. This vector contains the probability distributions of longitudinal cracks, transverse cracks, reticular cracks, and diagonal cracks. During training, the decoder is optimized using a multi-task loss function, combining the mean squared error loss for location regression and the cross-entropy loss for attribute classification. The weight coefficients are set to 1.0 and 0.5, respectively, to balance localization accuracy and classification accuracy.

[0042] Specifically, the node instantiation and physical parameter inversion process is as follows: For the generation of structural component nodes, the normalized spatial location index output by the multi-task feature decoder is multiplied by the width and height of the original image (an inverse normalization operation) to recover the pixel coordinates of the crack center. This pixel coordinate is then used as a query key for an index lookup in the pixel-level semantic label map, directly reading the unique identifier, component type (e.g., "3rd span - web - segment 5"), and design local stress attributes (e.g., "shear-dominant zone") stored at that coordinate location. For the generation of damage appearance nodes, the maximum probability term in the morphological category vector is identified, and the coordinates of all pixels in the crack region are extracted from the defect semantic feature tensor to construct a pixel set. If the morphological category vector indicates a longitudinal, transverse, or diagonal crack, principal component analysis is performed on the pixel set, and the direction of the first principal component is used as the orientation angle of the crack skeleton line. If the morphological category vector indicates a reticular crack, the crack is determined to have no single orientation, and its morphological category attribute is recorded directly without numerical calculation of the orientation angle. Simultaneously, based on camera parameters and depth information, the pixel width is converted into the physical crack width (unit: millimeters). This conversion follows the principle of similar triangles: the camera's physical pixel size (unit: millimeters / pixel) and focal length (unit: millimeters) are obtained, and the depth value of the crack pixel (i.e., the distance from the crack surface to the camera's optical center) is obtained from the disparity perception branch. 
The crack width measured in pixels is converted to a width on the sensor using the physical pixel size, multiplied by the depth value, and then divided by the focal length, yielding the actual physical width of the crack in three-dimensional space. For example, if a crack is detected at the web position with a 45-degree orientation and a width of 0.3 millimeters, a damage appearance node containing these attribute values is generated. For complex cracks that cannot be classified into a single type (such as T-shaped or intersecting cracks), the morphological category vector output by the decoder exhibits a multi-peak distribution. In that case, the crack is decomposed into multiple sub-segments, each of a single morphology, a damage appearance node is generated for each sub-segment, and spatial adjacency edges are established between them.
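
The three numerical steps in this paragraph (inverse normalization, orientation by principal component analysis, and width conversion by similar triangles) can be sketched as below. The function names and the camera values in the example are hypothetical. The sketch assumes the pixel width is first scaled by the physical pixel size so that the result is in millimeters, and it reports the orientation in image coordinates, where the y axis points downward.

```python
import math

def denormalize(index_xy, width, height):
    """Recover pixel coordinates from a normalized (0 to 1) location index."""
    return (index_xy[0] * width, index_xy[1] * height)

def principal_orientation_deg(pixels):
    """Angle of the first principal component of a 2D pixel set, in [0, 180)."""
    n = len(pixels)
    mx = sum(x for x, _ in pixels) / n
    my = sum(y for _, y in pixels) / n
    sxx = sum((x - mx) ** 2 for x, _ in pixels) / n
    syy = sum((y - my) ** 2 for _, y in pixels) / n
    sxy = sum((x - mx) * (y - my) for x, y in pixels) / n
    # Closed-form major-axis angle of a 2x2 covariance matrix.
    angle = 0.5 * math.degrees(math.atan2(2.0 * sxy, sxx - syy))
    return angle % 180.0

def physical_width_mm(pixel_width, pixel_size_mm, depth_mm, focal_mm):
    """Similar triangles: object size = sensor size * depth / focal length."""
    return pixel_width * pixel_size_mm * depth_mm / focal_mm

center = denormalize((0.5, 0.25), 4000, 3000)
angle = principal_orientation_deg([(0, 0), (1, 1), (2, 2), (3, 3)])
# Hypothetical camera: 0.004 mm pixels, 24 mm lens, crack 1800 mm away, 1 pixel wide.
width = physical_width_mm(1.0, 0.004, 1800.0, 24.0)
print(center, angle, round(width, 3))
```

With these illustrative values the sketch reproduces a 45-degree, 0.3-millimeter crack like the one in the example above.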

[0043] Specifically, the process of constructing semantically related edges and calculating the mechanical logic fit is as follows: The process iterates through all generated structural component nodes and damage appearance node pairs, invoking structural mechanics topological constraint rules from the ontology library. For example, the rule library already defines prior knowledge such as "shear members (webs) are prone to diagonal cracks." Next, it calculates the cosine similarity between the "local stress attributes" (such as shear stress direction) of the structural component and the orientation angle of the damage appearance node. Specifically, the calculation process involves: first, converting the crack orientation angle values in the damage appearance node into unit direction vectors in a planar coordinate system; and second, extracting the theoretical crack-prone direction vectors (physically perpendicular to the design principal tensile stress direction) corresponding to the structural component nodes from the bridge digital twin model. These two vectors are then placed in the same coordinate system, and the absolute value of the cosine of the angle between the crack unit direction vector and the theoretical crack-prone direction vector is calculated. This absolute value directly reflects the geometric parallelism between the crack direction and the theoretically predicted direction. It is used as the mechanical logic fit degree. When the mechanical logic fit degree exceeds the preset threshold of 0.8, a semantic association edge is established. At the same time, the mechanical logic fit degree is normalized and used as the causal probability weight of the semantic association edge. In this way, isolated disease data is transformed into a bridge damage knowledge graph with causal implications. 
For example, a triple <structural node: web, association edge: mechanical fit (weight 0.9), damage node: diagonal crack> is generated, thus completing the construction of the knowledge graph.
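
The mechanical logic fit and the edge rule can be sketched as follows. The names `mechanical_fit` and `association_edge`, the angle-based inputs, and the tuple layout of the edge are assumptions for illustration. Because the absolute cosine already lies between 0 and 1, the sketch uses it directly as the edge weight, which is one possible reading of the normalization step.

```python
import math

def unit_vector(angle_deg):
    """Unit direction vector in the plane for a given angle."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a))

def mechanical_fit(crack_angle_deg, principal_tensile_angle_deg):
    """Absolute cosine between the crack direction and the crack-prone direction."""
    crack = unit_vector(crack_angle_deg)
    # The theoretical crack-prone direction is perpendicular to the
    # design principal tensile stress direction.
    prone = unit_vector(principal_tensile_angle_deg + 90.0)
    return abs(crack[0] * prone[0] + crack[1] * prone[1])

def association_edge(component, damage, fit, threshold=0.8):
    """Create a weighted edge only when the fit exceeds the threshold."""
    if fit > threshold:
        return (component, {"relation": "mechanical fit", "weight": round(fit, 3)}, damage)
    return None

aligned = mechanical_fit(45.0, 135.0)      # crack runs along the crack-prone direction
misaligned = mechanical_fit(0.0, 135.0)    # crack is 45 degrees off
print(association_edge("web", "diagonal crack", aligned))
print(association_edge("web", "longitudinal crack", misaligned))
```

Taking the absolute value makes the measure independent of which way the direction vectors point, so a crack at 45 degrees and one at 225 degrees are treated identically.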

[0044] By utilizing the spatial and morphological vectors decoupled from the multi-task decoder, the specific location and shape of the cracks were clarified. Furthermore, by invoking mechanical protocols from the ontology library, the logical compatibility between the structural stress attributes and the crack trajectory was calculated, ensuring that each semantically related edge contains a causal probability weight. This construction method internalizes the structural engineer's expert knowledge into graph rules, enabling the identification of critical cracks that truly conform to stress patterns. This further enhances the logical rigor and persuasiveness of assessments targeting structural safety threats.

[0045] Furthermore, a graph neural network computational model is used to traverse the damage appearance nodes in the bridge damage knowledge graph to generate a crack cause topological path; the mapping weight value of the crack cause topological path is calculated, and when it exceeds the confidence threshold, the judgment result is output; this corresponds to step S4 above; see Figure 3. The specific implementation process includes: Based on the bridge damage knowledge graph, the attribute features of the damage appearance nodes are encoded to generate an initial latent state embedding vector. A graph topology traversal algorithm is used, with the damage appearance nodes as aggregation centers, to collect the stress attribute feature vectors of adjacent structural component nodes. A self-attention mechanism is used to calculate feature interaction weights, which are then used to weight and update the initial latent state embedding vector. Based on the updated latent state embedding vector, a traversal search of the connected subgraph is performed, and the sequence of connected subgraphs is extracted as the topological path for crack formation.

[0046] The crack formation topology is mapped to a pre-defined fault mode vector space, and a similarity measurement operation is performed with the standard fault template vector to generate a mapping weight value. The mapping weight value is compared with a pre-defined confidence threshold. When the threshold trigger condition is met, the corresponding fault category label and cause description text are extracted. The fault category label and cause description text are semantically encapsulated, and the judgment result is output.

[0047] This embodiment uses a bridge damage knowledge graph as input to drive a graph neural network computational model to execute a graph topology traversal algorithm. Using damage appearance nodes as aggregation centers, it actively collects the stress attribute feature vectors of structural component nodes connected to them via semantically related edges. Through iterative propagation of a multi-layer graph neural network, it calculates feature interaction weights using a self-attention mechanism, thereby weighting and updating the latent state embedding vectors of damage appearance nodes. This ensures that the latent state embedding vectors not only contain their own geometric morphology information but also incorporate the mechanical context information of the surrounding environment. Based on the updated latent state embedding vectors, it traverses and searches the bridge damage knowledge graph for highly responsive connected subgraphs, extracting the most explanatory connected subgraph sequence as the crack formation topology path. Subsequently, this path is mapped to a pre-defined high-dimensional fault mode vector space, and a similarity measurement operation is performed with the standard fault template vector. When the calculated mapping weight value exceeds a preset confidence threshold, a judgment logic is triggered, outputting the final judgment result containing fault category labels and causal description text.

[0048] Specifically, the process of the graph neural network computation model and feature aggregation mechanism is as follows: The graph neural network computational model employs a graph attention network architecture, designed to handle the non-Euclidean structure of graph data. The input layer receives the initial feature vector of each node, with the following specific data encoding settings: for continuous numerical physical attributes such as crack width and orientation angle, a multilayer perceptron is used to map them into high-dimensional continuous vectors; for discrete categorical attributes such as component type and stress area identification, an embedding layer is used to transform them into dense vectors. Subsequently, the continuous vector and dense vector are concatenated along the channel dimension, and then aligned dimensionally through a linear transformation layer, ultimately generating a unified 64-dimensional initial feature vector for that node, encoding information including crack width, orientation, component type, and stress area identification. In the feature aggregation stage, the model calculates the attention coefficient between each damage-appearing node and its neighboring structural component nodes. The calculation process first performs a linear transformation and concatenates the features of the current node and its neighbors, processes them through the LeakyReLU activation function, and then normalizes them using the Softmax function to obtain normalized attention weights reflecting the strength of the association between nodes. The model employs a multi-head attention mechanism, with eight independent attention heads computing in parallel. Each attention head processes input features in an independent feature subspace, meaning the initial feature vector's dimensions are evenly distributed across all attention heads (e.g., dividing the 64-dimensional features into eight groups of eight dimensions each). 
After each attention head independently computes its context vector, the per-head vectors are concatenated to restore the original dimension. A linear transformation matrix then fuses the information from all heads, and an ELU non-linear activation function outputs the updated latent state embedding vector. During model training, the cross-entropy loss of a node classification task is used as the optimization objective, and the network weights are adjusted through backpropagation so that the model can distinguish damage nodes with different causes.
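The aggregation step described above can be illustrated with a minimal NumPy sketch for a single damage appearance node. It is not the trained model of this embodiment: the weight matrices `W`, `a`, and `W_out` are random placeholders, the LeakyReLU slope of 0.2 is an assumed value, and the function name `gat_update` is hypothetical.

```python
import numpy as np

def leaky_relu(x, slope=0.2):          # slope is an assumed value
    return np.where(x > 0, x, slope * x)

def elu(x):
    return np.where(x > 0, x, np.exp(np.minimum(x, 0.0)) - 1.0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gat_update(h_center, h_neighbors, W, a, W_out, num_heads=8):
    """One multi-head attention update for a single damage appearance node.

    h_center: (64,) initial feature vector of the damage appearance node
    h_neighbors: (n, 64) feature vectors of adjacent structural component nodes
    W: (64, 64) shared linear transformation; a: (num_heads, 2 * d_head)
    W_out: (64, 64) matrix that fuses the concatenated heads
    """
    d_head = W.shape[1] // num_heads                       # 64 / 8 = 8
    z_c = (h_center @ W).reshape(num_heads, d_head)        # (8, 8)
    z_n = (h_neighbors @ W).reshape(-1, num_heads, d_head) # (n, 8, 8)
    contexts, weights = [], []
    for k in range(num_heads):
        # Concatenate the transformed centre and neighbour features per pair.
        pair = np.concatenate(
            [np.tile(z_c[k], (len(z_n), 1)), z_n[:, k, :]], axis=1)  # (n, 16)
        alpha = softmax(leaky_relu(pair @ a[k]))           # normalized weights
        contexts.append(alpha @ z_n[:, k, :])              # (8,) context vector
        weights.append(alpha)
    context = np.concatenate(contexts)                     # back to 64 dims
    return elu(context @ W_out), np.stack(weights)

# Illustrative call with random placeholder data (three neighbouring nodes).
rng = np.random.default_rng(0)
h_c = rng.normal(size=64)
h_n = rng.normal(size=(3, 64))
W = rng.normal(scale=0.1, size=(64, 64))
a = rng.normal(scale=0.1, size=(8, 16))
W_out = rng.normal(scale=0.1, size=(64, 64))
updated, attn = gat_update(h_c, h_n, W, a, W_out)
```

Here `attn` holds one row of normalized attention weights per head, and `updated` is the 64-dimensional embedding after head fusion and ELU activation.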

[0049] In this embodiment, a feature residual preservation mechanism is introduced when updating the latent state embedding vector: the initial feature vector before aggregation and the aggregated context vector are fused by a weighted linear combination. This prevents the unique features of the damage appearance node from being lost through excessive aggregation of neighborhood information during multi-layer propagation in the graph neural network computational model, i.e., the over-smoothing problem. Specifically, the initial feature vector of the damage appearance node before aggregation is temporarily stored in memory. After the self-attention mechanism calculates the context vector containing neighborhood stress information, a balance coefficient is set, for example 0.2. The initial feature vector is multiplied by this balance coefficient, and the context vector is multiplied by the complement of the balance coefficient (1 minus 0.2, i.e., 0.8). The two weighted vectors are then added element-wise. Layer normalization is subsequently applied to the sum: the mean and variance of all elements in the vector are calculated, and the vector is standardized and scaled to obtain the final latent state embedding vector, which contains rich contextual associations while retaining the node's own physical properties. The value of the balance coefficient is positively correlated with the average node degree of the bridge damage knowledge graph. In the locally sparse graph constructed in this embodiment, each damage appearance node is associated with only a small number of structural component nodes, so the risk of over-smoothing is low. Therefore, a smaller balance coefficient (i.e., 0.2) is used to maximize the introduction of environmental stress context information. 
However, in high-density connection scenarios such as complex hub nodes (e.g., anchorage zones), this coefficient should be appropriately increased (e.g., adjusted to 0.5) to prevent excessive neighborhood noise information from masking the physical property characteristics of the node itself (e.g., crack width). The above process solves the feature homogenization problem in deep graph calculation, enabling the subtle differences in the causes of similar cracks to be distinguished, thus improving the calculation accuracy.
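The residual fusion and layer normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `residual_fuse`, the stabilizing constant `eps`, and the scale and shift parameters `gamma` and `beta` (left at their identity values) are not specified in this embodiment.

```python
import numpy as np

def residual_fuse(h_init, context, balance=0.2, gamma=1.0, beta=0.0, eps=1e-5):
    """Blend the stored initial features with the aggregated context vector,
    then apply layer normalization over all elements of the result."""
    # balance * initial features + (1 - balance) * neighbourhood context
    fused = balance * h_init + (1.0 - balance) * context
    mean, var = fused.mean(), fused.var()
    normalized = (fused - mean) / np.sqrt(var + eps)
    return gamma * normalized + beta       # assumed identity scale and shift

# Illustrative call with random placeholder vectors.
rng = np.random.default_rng(1)
h = rng.normal(size=64)
c = rng.normal(size=64)
sparse_out = residual_fuse(h, c, balance=0.2)   # locally sparse graph
dense_out = residual_fuse(h, c, balance=0.5)    # densely connected hub node
```

A larger `balance` keeps more of the node's own features, which is why the text raises it from 0.2 to 0.5 for densely connected regions such as anchorage zones.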

[0050] Specifically, the process of generating the crack formation topology path and mapping it into the vector space is as follows. Based on the updated latent state embedding vector, a greedy search algorithm is executed. First, the damage appearance node currently being diagnosed is set as the starting node, and a set of visited nodes is initialized to prevent the path from looping back on itself. In each search step, all nodes adjacent to the current node are traversed, and the adjacent node with the largest attention weight is selected as the next-hop target and added to the set of visited nodes. The search terminates when any of the following stopping criteria is met: 1) the current node has no further adjacent nodes; 2) the next-hop target is already in the set of visited nodes; 3) the path length reaches the preset maximum number of hops (e.g., five hops). The extracted node sequence is the connected subgraph sequence, i.e., the crack formation topology path. For example, the path may point from "box girder web node" to "shear-dominant zone attribute", and then to "diagonal crack node". The crack formation topology path is fed into a fully connected layer acting as a readout function, which maps the variable-length path sequence into a fixed-length 128-dimensional fault mode vector. At the same time, a preset standard fault template vector library is retrieved. This library contains standard vector representations of typical fault modes such as "bending cracking," "shear cracking," and "protective layer rust expansion." To build the library, a large amount of manually annotated historical fault case data is collected and cleaned, with more than 2000 historical bridge fault cases processed in total, ensuring that the number of valid samples for each fault sub-type (such as web diagonal cracks and bottom slab transverse cracks) is no less than fifty. 
The specific pre-training generation process is as follows: historical topological path samples of various known fault causes are collected and mapped to a set of feature vectors. For each fault mode (such as shear cracking), the arithmetic mean of all sample feature vectors under that mode is calculated, and this value is standardized to serve as the standard template vector representing that fault mode. The cosine similarity between the currently generated fault mode vector and each standard template vector in the library is calculated. This similarity is the mapping weight value, quantifying the degree of matching between the current fault features and known fault modes.
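The greedy search, readout, and template matching described above can be sketched as follows. Several details are assumptions rather than part of this embodiment: the graph is held in plain dictionaries, the readout mean-pools the node embeddings along the path before the fully connected layer (the text does not state how variable-length paths are pooled), "standardized" is read as L2 normalization, and all numeric values are random placeholders.

```python
import numpy as np

def greedy_path(start, neighbors, attention, max_hops=5):
    """Follow the highest-attention neighbour until a stopping criterion is met."""
    path, visited, current = [start], {start}, start
    while len(path) - 1 < max_hops:            # criterion 3: hop limit
        candidates = neighbors.get(current, [])
        if not candidates:                     # criterion 1: no adjacent node
            break
        nxt = max(candidates, key=lambda v: attention[(current, v)])
        if nxt in visited:                     # criterion 2: already visited
            break
        path.append(nxt)
        visited.add(nxt)
        current = nxt
    return path

def readout(path_embeddings, W_r):
    """Assumed readout: mean-pool the path, then one fully connected layer."""
    return np.mean(path_embeddings, axis=0) @ W_r

def build_template(sample_vectors):
    """Arithmetic mean of all sample vectors of one fault mode, L2-normalized."""
    mean = np.mean(sample_vectors, axis=0)
    return mean / np.linalg.norm(mean)

def cosine_similarity(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical three-node graph with illustrative attention weights.
neighbors = {"diagonal_crack": ["web_seg5"],
             "web_seg5": ["shear_zone", "diagonal_crack"],
             "shear_zone": []}
attention = {("diagonal_crack", "web_seg5"): 0.9,
             ("web_seg5", "shear_zone"): 0.7,
             ("web_seg5", "diagonal_crack"): 0.3}
path = greedy_path("diagonal_crack", neighbors, attention)

rng = np.random.default_rng(2)
embeddings = {node: rng.normal(size=64) for node in neighbors}
W_r = rng.normal(scale=0.1, size=(64, 128))
fault_vec = readout(np.stack([embeddings[n] for n in path]), W_r)

templates = {mode: build_template(rng.normal(size=(50, 128)))
             for mode in ("bending cracking", "shear cracking")}
mapping_weights = {mode: cosine_similarity(fault_vec, t)
                   for mode, t in templates.items()}
```

Each entry of `mapping_weights` is the mapping weight value for one fault mode; with random placeholder templates the values carry no diagnostic meaning.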

[0051] By aggregating the stress attributes of adjacent structural components around the damage apparent node and using a self-attention mechanism to calculate feature interaction weights, the system automatically focuses on the structural factors that contribute the most to crack formation. The generated connected subgraph sequence shows the transmission chain between "component stress - crack morphology - potential causes" in an explicit path form, enabling managers to trace the root cause of cracks.

[0052] Specifically, the threshold determination and result output process is as follows. A default confidence threshold of 0.85 is set and then graded according to the structural importance level of the bridge component: for critical load-bearing components such as the web and tension zone, the high confidence threshold of 0.85 is used to ensure the accuracy of diagnostic results and to avoid false alarms that lead to unnecessary road closures for maintenance; for non-load-bearing decorative components, the threshold can be increased to 0.9; conversely, in emergency monitoring of old and dangerous bridges, where the principle of 'maximum safety redundancy' applies and no potential risk should be missed, the threshold can be lowered to 0.75, achieving a balance between recall and precision that meets the current operation and maintenance goals. When the mapping weight value of a fault mode exceeds the threshold, a successful match is determined. For example, if the weight value of the "shear cracking" mode is 0.92, the output logic is triggered: the corresponding fault category label "structural shear damage" is extracted, and the associated causal description text is retrieved, such as "excessive principal tensile stress leads to diagonal cracks in the web". If the weights of all fault modes are below the threshold, a message indicating "cause unknown" or "requires manual review" is output, and the fault mode vector corresponding to the current crack cause topology path is marked as an unknown anomaly sample and stored in the pending database. Engineers are then prompted to perform cluster analysis and manual annotation on these samples to generate new standard fault template vectors. 
Finally, the fault category labels, cause description text, and visualization data of the reasoning path are semantically encapsulated to generate a judgment result containing a complete "damage-cause-recommendation" chain, directly assisting maintenance personnel in making reinforcement decisions.
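The graded threshold logic above can be sketched as a small decision function. The threshold values (0.85, 0.9, 0.75) and the example texts come from the description; the scenario keys, the function name `decide`, the result dictionary layout, and the 0.41 and 0.80 weights in the usage example are hypothetical.

```python
# Thresholds from the description; the dictionary keys are hypothetical names.
THRESHOLDS = {"critical": 0.85, "decorative": 0.90, "emergency": 0.75}

def decide(mapping_weights, descriptions, scenario="critical"):
    """Return the matched fault label and cause, or a manual-review result."""
    threshold = THRESHOLDS[scenario]
    mode, weight = max(mapping_weights.items(), key=lambda kv: kv[1])
    if weight > threshold:
        label, cause = descriptions[mode]
        return {"status": "matched", "label": label,
                "cause": cause, "weight": weight}
    # No fault mode exceeds the threshold: queue the sample for engineers.
    return {"status": "requires manual review", "label": None,
            "cause": "cause unknown", "weight": weight}

descriptions = {
    "shear cracking": (
        "structural shear damage",
        "excessive principal tensile stress leads to diagonal cracks in the web",
    ),
}

# The 0.92 case from the text, plus an illustrative non-matching mode.
result = decide({"shear cracking": 0.92, "bending cracking": 0.41}, descriptions)

# An illustrative 0.80 weight: rejected for critical components, accepted
# under the lowered emergency-monitoring threshold.
borderline = {"shear cracking": 0.80}
strict = decide(borderline, descriptions, scenario="critical")
relaxed = decide(borderline, descriptions, scenario="emergency")
```

The same mapping weight can therefore produce a match in one scenario and a manual-review request in another, which is the recall and precision trade-off the paragraph describes.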

[0053] By setting a confidence threshold, the high reliability of the output results is ensured. Furthermore, the complex graph reasoning process is transformed into decision support information that is easy for operations and maintenance personnel to understand. The risk level is quantified and the source of the risk is explained, thereby improving the interpretability of operations and maintenance data.

[0054] This invention provides a knowledge graph-based method for reasoning about the causes of bridge cracks. By constructing a closed-loop system for analyzing the causes of bridge cracks based on a knowledge graph, it effectively solves the problems of isolated diagnostic results, difficulty in quantifying and assessing structural safety threats, and lack of interpretability. First, by utilizing the spatial semantic registration of UAV imagery and digital twin models, it breaks through the limitations of relying solely on image data and endows cracks with structural spatial attributes. Second, the dual-path image analysis model simultaneously extracts two-dimensional texture and three-dimensional geometric features, enabling a more comprehensive quantification of the actual physical damage caused by cracks to the structure. Furthermore, by constructing a knowledge graph containing structural component nodes and damage appearance nodes, and using a graph neural network computing model for path traversal, discrete crack features are transformed into topological paths with causal logic, achieving a quantitative calculation of the degree of structural safety threat. It also provides a complete chain of evidence, improving the interpretability of the final maintenance data and providing a scientific basis for bridge maintenance decisions.

Example 2

[0055] This embodiment applies a knowledge graph-based bridge crack cause reasoning system to the health monitoring of large-scale transportation infrastructure such as prestressed concrete continuous box girder bridges (see Figure 4). The specific implementation process is as follows. In practical operation, the spatial semantic registration module serves as the data entry point and preprocessing center. This module receives the 45-megapixel high-resolution raw image stream transmitted from the UAV flight platform via a wireless communication interface, along with synchronously recorded RTK centimeter-level positioning attitude data. The indexing engine integrated within the module uses the positioning data to retrieve the pre-stored LOD400 level bridge digital twin model in real time, automatically matching the 3D mesh data within the view frustum range. This module not only performs coordinate transformation but, more importantly, drives the virtual-real spatial semantic registration model. Through the reprojection distance constraint and semantic consistency optimization algorithm described in Example 1, it corrects the initial pose drift of the UAV sensor, accurately mapping the component codes, material properties, and design stress states carried by the digital twin model to the 2D image pixels, generating a bridge projection dataset with pixel-level semantic annotations.

[0056] Furthermore, the geometry perception module receives the aforementioned bridge projection dataset and, as the core of the intelligent perception engine, deploys a dual-path image analysis model. When processing data from the web region of a specific box girder, this module activates the texture segmentation branch and the disparity perception branch in parallel. The texture branch focuses on extracting two-dimensional morphological features such as crack edges and textures from reference keyframes, clearly identifying micro-cracks as narrow as 1 millimeter. Simultaneously, the disparity branch uses the geometric disparity information of adjacent sequence frames to calculate the depth and concavity features of the crack region, effectively eliminating false detections such as oil stains and water stains, which resemble cracks in texture but have smooth surfaces. Through feature concatenation, the module fuses the two-dimensional morphological features with the three-dimensional geometric features, outputting a high-dimensional defect semantic feature tensor containing information on crack location, orientation, width, and depth.

[0057] The knowledge graph construction module includes a pre-built parser and a bridge ontology library. When the system detects a 45-degree, 0.4 mm wide crack in the web region of the third span from the received defect semantic feature tensor, it instantiates the spatial location index as a "3rd Span - Web - Segment 5" structural component node and the crack feature as a "Diagonal Crack" damage appearance node. Next, based on the structural mechanics rules in the ontology library, it retrieves the stress attributes of the web (such as shear-dominant regions) and calculates the mechanical logic fit with the crack direction. When conditions are met, a semantic association edge containing causal probability weights is established between the two types of nodes, thus transforming the discrete defect detection results into a graph structure with causal logic.
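The node instantiation and edge creation described above can be sketched with simple data classes. This is a hypothetical illustration only: the rule table `EXPECTED_ANGLE_DEG`, the cosine-squared scoring in `mechanical_fit`, and the `min_fit` cutoff are assumptions standing in for the structural mechanics rules of the ontology library, which are not specified here.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class ComponentNode:
    component_id: str        # e.g. "3rd Span - Web - Segment 5"
    stress_attribute: str    # e.g. "shear-dominant"

@dataclass(frozen=True)
class DamageNode:
    category: str            # e.g. "Diagonal Crack"
    angle_deg: float
    width_mm: float

# Hypothetical rule table: the crack direction assumed typical for each
# stress attribute. Real ontology rules are expected to be richer than this.
EXPECTED_ANGLE_DEG = {"shear-dominant": 45.0, "flexure-dominant": 90.0}

def mechanical_fit(component, damage):
    """Assumed score in [0, 1]: 1.0 when the crack angle equals the expected one."""
    expected = EXPECTED_ANGLE_DEG[component.stress_attribute]
    return math.cos(math.radians(damage.angle_deg - expected)) ** 2

def build_graph(component, damage, min_fit=0.5):
    """Create both nodes and add a weighted edge only when the fit is high enough."""
    fit = mechanical_fit(component, damage)
    edges = [(component, damage, fit)] if fit >= min_fit else []
    return {"nodes": [component, damage], "edges": edges}

# The 45-degree, 0.4 mm web crack from the example above.
web = ComponentNode("3rd Span - Web - Segment 5", "shear-dominant")
crack = DamageNode("Diagonal Crack", angle_deg=45.0, width_mm=0.4)
graph = build_graph(web, crack)
```

The third element of each edge tuple plays the role of the causal probability weight carried by the semantic association edge.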

[0058] The cause determination module performs automated diagnosis based on the constructed graph. This module drives a graph neural network computing model, aggregating the stress characteristics of surrounding structural components with the "diagonal crack" node as the center. In actual operation, the model generates a crack cause topology path through graph traversal and maps it to a preset fault mode vector space. The system calculates that the similarity weight between this path vector and the standard "structural shear damage" fault template is 0.94, which exceeds the confidence threshold of 0.85. The module then triggers the determination logic, extracts the corresponding fault category label and the causal description text "excessive principal tensile stress leads to diagonal cracks in the web," and combines them with reinforcement suggestions to generate the final intelligent diagnostic report. Through the collaborative work of the modules, this system achieves end-to-end automated output from raw image data to professional structural diagnostic conclusions, improving the scientific rigor of bridge operation and maintenance decisions and ensuring that the final defect identification report is structured information with geometric depth and spatial semantics.

[0059] Although embodiments of the invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims and their equivalents.

Claims

1. A method for reasoning about the causes of bridge cracks based on knowledge graphs, characterized in that it comprises: collecting UAV imagery and positioning attitude data of the bridge target area, retrieving the bridge digital twin model corresponding to the positioning attitude data, and constructing a virtual-real spatial semantic registration model by combining the UAV imagery; spatially aligning the UAV imagery using the virtual-real spatial semantic registration model to generate a bridge projection dataset; inputting the bridge projection dataset into the dual-path image analysis model, driving the texture segmentation branch and the disparity perception branch to perform feature extraction and disparity transformation operations, extracting the two-dimensional morphological features and three-dimensional geometric features of the crack, and outputting the defect semantic feature tensor through weighted concatenation; reading the bridge ontology library, parsing the defect semantic feature tensor, generating a spatial location index and a morphological category vector, instantiating them as structural component nodes and damage appearance nodes respectively, establishing semantic association edges, and constructing a bridge damage knowledge graph; performing graph path traversal on the damage appearance nodes in the bridge damage knowledge graph using the graph topology traversal algorithm to generate crack cause topology paths; and calculating the mapping weight value of the crack cause topology paths, and outputting the judgment result when the mapping weight value exceeds the confidence threshold.

2. The method for reasoning about the causes of bridge cracks based on knowledge graphs according to claim 1, characterized in that, The specific generation process of the bridge projection dataset includes: calculating the initial extrinsic parameters of the UAV camera using positioning attitude data to establish an initial mapping matrix; extracting visual texture feature points from the UAV images and simultaneously retrieving the geometric structural projection points of the bridge digital twin model from the corresponding viewpoint, generating a set of corresponding feature points through feature matching operations; substituting the set of corresponding feature points into the bundle adjustment objective function, performing iterative convergence calculations on the initial mapping matrix, calculating pose transformation parameters, and constructing a virtual-real space semantic registration model; driving the virtual-real space semantic registration model, reading the structural component attribute labels carried by the bridge digital twin model, inversely projecting the structural component attribute labels onto the two-dimensional pixel coordinate system of the UAV images, and outputting the bridge projection dataset through pixel-level semantic alignment operations.

3. The method for reasoning about the causes of bridge cracks based on knowledge graphs according to claim 2, characterized in that, The specific process of iteratively converging the initial mapping matrix includes: extracting structural edge lines from the UAV image stream and retrieving the 3D structural contour lines in the bridge digital twin model; constructing a reprojection distance constraint term and calculating the normal distance between the 3D structural contour lines projected onto the 2D pixel plane and the structural edge lines; simultaneously parsing the pixel-level semantic segmentation probability map of the UAV image, constructing a semantic consistency penalty term, and calculating the overlap loss between the entity projection region of the bridge digital twin model and the non-bridge category in the semantic segmentation probability map; introducing the reprojection distance constraint term and the semantic consistency penalty term into the pose optimization solver, and jointly performing nonlinear least squares optimization with the feature point reprojection error to output pose transformation parameters.

4. The method for reasoning about the causes of bridge cracks based on knowledge graphs according to claim 1, characterized in that the dual-path image analysis model comprises a texture segmentation branch and a disparity perception branch, and the specific generation process of the defect semantic feature tensor includes: decomposing the bridge projection dataset into a set of reference keyframes and a set of adjacent multi-viewpoint sequence frames; inputting the set of reference keyframes into the texture segmentation branch, performing multi-scale dilated convolution operations to capture long-distance dependencies between pixels, and extracting two-dimensional morphological features of the cracks; inputting the set of adjacent multi-viewpoint sequence frames into the disparity perception branch, constructing a cost volume reflecting photometric consistency under different depth assumptions based on a plane-sweep algorithm, and performing probability regression operations on the cost volume using a regularized convolutional network to generate three-dimensional geometric features of the cracks; and concatenating the two-dimensional morphological features of the cracks with the three-dimensional geometric features of the cracks to output the defect semantic feature tensor.

5. The method for reasoning about the causes of bridge cracks based on knowledge graphs according to claim 1, characterized in that, The specific construction process of the bridge damage knowledge graph includes: using a multi-task feature decoder to perform channel separation and dimensionality reduction parsing on the defect semantic feature tensor, decoupling and outputting spatial location index and morphological category vector; reading the entity mapping protocol defined in the bridge ontology library, using the spatial location index to retrieve the component unique identifier and local stress attributes of the corresponding grid unit in the bridge digital twin model, generating structural component nodes; using the morphological category vector to invert the physical values of crack direction angle and crack width, generating damage appearance nodes; calling the structural mechanics topological constraint rules in the bridge ontology library to calculate the mechanical logical fit between the local stress attributes of the structural component nodes and the direction angle of the damage appearance nodes, generating semantic association edges containing causation probability weights; connecting the semantic association edges to the structural component nodes and the damage appearance nodes, outputting the bridge damage knowledge graph.

6. The method for reasoning about the causes of bridge cracks based on knowledge graphs according to claim 1, characterized in that, The specific generation process of the crack formation topology path includes: encoding the attribute features of the damage appearance nodes based on the bridge damage knowledge graph to generate an initial latent state embedding vector; collecting the stress attribute feature vectors of adjacent structural component nodes with the damage appearance nodes as aggregation centers using a graph topology traversal algorithm; calculating feature interaction weights using a self-attention mechanism, and updating the initial latent state embedding vector using the feature interaction weights; and traversing and searching the connected subgraphs based on the updated latent state embedding vectors, extracting the connected subgraph sequence as the crack formation topology path.

7. The method for reasoning about the causes of bridge cracks based on knowledge graphs according to claim 1, characterized in that, The specific process for generating the determination result includes: mapping the crack cause topology path to a preset fault mode vector space, performing a similarity measurement operation with the standard fault template vector to generate a mapping weight value; performing a numerical comparison between the mapping weight value and a preset confidence threshold, and extracting the corresponding fault category label and cause description text when the threshold trigger condition is met; performing semantic encapsulation on the fault category label and cause description text, and outputting the determination result.

8. A knowledge graph-based reasoning system for the causes of bridge cracks, characterized in that it comprises: a spatial semantic registration module, which collects UAV images and positioning attitude data of the bridge target area, retrieves the bridge digital twin model corresponding to the positioning attitude data, constructs a virtual-real spatial semantic registration model by combining the UAV images, and generates a bridge projection dataset through spatial alignment; a geometric perception module, which inputs the bridge projection dataset into the dual-path image analysis model, performs feature extraction and disparity transformation calculation by driving the texture segmentation branch and the disparity perception branch, thereby extracting the two-dimensional morphological features and three-dimensional geometric features of the cracks, and outputs the defect semantic feature tensor through weighted concatenation; a knowledge graph construction module, which reads the bridge ontology library, parses the defect semantic feature tensor, generates a spatial location index and a morphological category vector, instantiates them as structural component nodes and damage appearance nodes respectively, and establishes semantic association edges to construct a bridge damage knowledge graph; and a cause determination module, which performs graph path traversal on the damage appearance nodes in the bridge damage knowledge graph using a graph topology traversal algorithm to generate crack cause topology paths, calculates the mapping weight value of the crack cause topology paths, and outputs the determination result when the weight value exceeds the confidence threshold.