Building structure damage identification and evaluation system fusing multi-point strain sensing and AI

By integrating multi-point strain sensing with AI in a building structure damage identification system, the invention addresses two shortcomings of existing technologies: the lack of a continuous physical field distribution across the entire domain, and the lack of correlation between physical mechanisms and apparent phenomena. This enables accurate identification and predictive assessment of early damage and provides a quantitative basis for preventive maintenance.

CN122241578A · Pending · Publication Date: 2026-06-19 · HUNAN HONGSHANG DETECTION TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
HUNAN HONGSHANG DETECTION TECH CO LTD
Filing Date
2026-03-18
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing building structure monitoring methods struggle, in complex damage scenarios, to obtain a continuous physical field distribution across the entire domain and to correlate physical mechanisms with apparent phenomena. As a result, early or internal damage is poorly identified, and detection is susceptible to environmental interference.

Method used

The building structure damage identification system integrates multi-point strain perception and AI. Its data synchronization and alignment, dynamic strain feature extraction, physical feature field construction, cross-modal data injection, coupled feature mining, and quantitative damage identification modules generate a damage-sensitive feature map and perform semantic segmentation on it. A long short-term memory network is then used for predictive assessment.

Benefits of technology

It enables the effective capture of minute damage caused by early stress concentration, reduces the probability of missed and false detections, provides a quantitative basis for preventive maintenance decisions, and can accurately identify and predict damage development trends.

✦ Generated by Eureka AI based on patent content.

Abstract

This invention belongs to the technical field of computer vision and structural health monitoring, and relates to a building structure damage identification and assessment system integrating multi-point strain perception and AI. The system includes: a data synchronization and alignment module, which acquires multiple types of signals and images from nodes and generates an initial spatiotemporal dataset; a dynamic strain feature extraction module, which extracts multi-dimensional dynamic strain feature vectors reflecting structural stiffness characteristics; a physical feature field construction module, which interpolates to construct a continuous physical feature field grid; a cross-modal data injection module, which generates a composite physical visual feature tensor; a coupled feature mining module, which mines associated features to generate a damage-sensitive feature map; and a damage quantitative identification module, which outputs the final identification result. This invention solves the problem in existing technologies where physical state monitoring and apparent visual detection are independent, leading to a lack of direct correlation between the internal mechanical state of the structure and surface damage characteristics, thus affecting the comprehensiveness and accuracy of damage identification and assessment.

Description

Technical Field

[0001] This invention belongs to the technical field of computer vision and structural health monitoring, and relates to a building structure damage identification and assessment system that integrates multi-point strain perception and AI.

Background Technology

[0002] In the field of civil engineering, continuous health monitoring of large building structures is a crucial means of ensuring service safety and preventing accidents. Typically, the health status of a building structure is determined by both its internal physical and mechanical properties and its external appearance. Accurately and comprehensively identifying and assessing structural damage, especially early or internal damage, during monitoring is a key focus in the current field of structural health monitoring.

[0003] Existing technical solutions are mainly divided into two categories: one is the sensor-based physical monitoring method, which collects dynamic response data by deploying sensors such as strain gauges at pre-set sensor deployment nodes on the structure, and then analyzes the changes in mechanical parameters; the other is the vision-based detection method, which uses image acquisition equipment to obtain images of the structural surface and identifies apparent defects such as cracks and spalling through image processing or deep learning algorithms.

[0004] Both of the aforementioned methods have certain technical limitations in complex damage scenarios. Sensor-based monitoring methods, due to the discrete nature of measurement points, struggle to provide a continuous physical field distribution across the entire area, easily overlooking local damage between sensor sampling points, and lacking an intuitive correspondence between physical parameters and the geometric shape of the damage. Visual inspection methods, on the other hand, typically only capture visible surface changes, limiting their effectiveness in detecting early-stage damage caused by internal stress concentration that has not yet formed macroscopic cracks, and are easily affected by lighting and surface contaminants. Because the data analysis processes of these two technical approaches are relatively independent, it is difficult to establish an effective correlation between physical mechanisms and apparent phenomena.

Summary of the Invention

[0005] To address the aforementioned problems, this invention provides a building structure damage identification and assessment system that integrates multi-point strain sensing and AI.

[0006] A building structure damage identification and assessment system integrating multi-point strain sensing and AI includes the following modules:

The data synchronization and alignment module acquires the original strain sequence signals with a predetermined sampling frequency collected by strain sensors distributed at the preset sensor deployment nodes of the building structure, the real-time temperature data collected by temperature sensors, and the real-time surface images acquired by the vision acquisition terminal. It establishes a mapping relationship between the image pixel coordinates and the three-dimensional spatial coordinates of the strain sensors, and performs data alignment based on the timestamp to generate an initial spatiotemporal dataset.

The dynamic strain feature extraction module performs sliding window framing and time-frequency domain analysis on the original strain sequence signal with a predetermined sampling frequency in the initial spatiotemporal dataset to extract a multi-dimensional dynamic strain feature vector that reflects the local stiffness and energy characteristics of the structure.

The physical feature field construction module uses three-dimensional spatial coordinates as constraints to perform spatial interpolation on multi-dimensional dynamic strain feature vectors and construct a continuous physical feature field grid with the same resolution as the real-time surface image.

The cross-modal data injection module extracts the color channel data of the real-time surface image, stacks and normalizes it with the physical feature channels contained in the continuous physical feature field grid in the depth dimension, and generates a composite physical visual feature tensor.

The coupled feature mining module inputs the composite physical visual feature tensor into a deep feature extraction network to mine the correlation features between physical properties and visual texture, and generates a damage-sensitive feature map that reflects the location and shape of the damage.

The damage quantitative identification module performs semantic segmentation on the damage-sensitive feature map to extract damage geometric parameters, compares the damage geometric parameters with a preset structural health threshold matrix, and outputs the structural damage identification results.

[0007] In a further aspect of the present invention, the data synchronization and alignment module is used to perform the following steps: perform intrinsic parameter calibration and distortion correction on the visual acquisition terminal; use the scale-invariant feature transform algorithm to extract feature matching point pairs between the corrected real-time surface image and the rendered view of the building information model; based on the feature matching point pairs, use the PnP algorithm to solve the camera extrinsic parameters and construct the projective transformation matrix; and use the projective transformation matrix to project the three-dimensional spatial coordinates in the building information model onto the pixel plane of the real-time surface image, thus achieving registration between physical space and pixel space.

[0008] In a further aspect of the present invention, the dynamic strain feature extraction module is used to perform the following operations: acquire real-time temperature data from the sensor deployment points and perform temperature compensation correction on the signals within each sampling frame; calculate the arithmetic mean of the corrected signal within each sampling frame to obtain the static strain component; extract the dominant frequency of each sampling frame using the Fast Fourier Transform; calculate the energy components of each frequency band by wavelet packet decomposition and sum them to obtain the root mean square value of dynamic strain; calculate the rate of change of the dominant frequency over time to obtain the frequency offset feature; and vectorize and concatenate the static strain component, dominant frequency, root mean square value of dynamic strain, and frequency offset feature to generate a multi-dimensional dynamic strain feature vector.

[0009] In a further aspect of the present invention, the physical feature field construction module is used to perform the following operations: separate the multi-dimensional dynamic strain feature vector into independent feature component datasets; use the inverse of the projective transformation matrix to construct a ray pointing from each pixel into physical space, and solve the intersection of the ray with the geometric surface of the building information model by a ray tracing algorithm to determine the unique three-dimensional spatial coordinates of each pixel; retrieve structural topology data from the building information model to determine whether the pixel coordinates and the sensor deployment points belong to the same structural component; under the structural topology constraints, use the radial basis function interpolation algorithm to calculate the interpolated feature value at the unique three-dimensional spatial coordinates of the pixel, with the three-dimensional spatial coordinates of the sensors in the same component as control points; and generate the static strain field map, the dominant frequency distribution field map, and the energy dissipation field map separately, then integrate the three in the depth direction to generate a continuous physical feature field grid.

[0010] In a further aspect of the present invention, the cross-modal data injection module is used to perform the following steps: decompose the real-time surface image to obtain the data of the three color channels (red, green, and blue); use the static strain field map, the dominant frequency distribution field map, and the energy dissipation field map as three physical feature channels; stack the red, green, and blue color channel data with the three physical feature channels to form a six-channel dataset; and apply min-max normalization to the six-channel data at the pixel level, channel by channel, to generate a composite physical visual feature tensor.
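The stacking and normalization steps above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names, the nested-list representation of each channel, the handling of a constant channel, and all numeric values are assumptions made for illustration.

```python
from typing import List

Channel = List[List[float]]  # one H x W plane, stored as nested lists


def min_max_normalize(channel: Channel) -> Channel:
    """Scale one channel to [0, 1] using its own minimum and maximum."""
    flat = [v for row in channel for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # Constant channel: map every pixel to 0.0 (an assumed convention).
        return [[0.0 for _ in row] for row in channel]
    return [[(v - lo) / span for v in row] for row in channel]


def build_composite_tensor(rgb: List[Channel], physical: List[Channel]) -> List[Channel]:
    """Stack 3 color planes and 3 physical planes, then normalize each plane."""
    if len(rgb) != 3 or len(physical) != 3:
        raise ValueError("expected three color and three physical channels")
    return [min_max_normalize(ch) for ch in rgb + physical]


# Toy 2 x 2 example with made-up values.
rgb = [[[0.0, 255.0], [128.0, 64.0]] for _ in range(3)]
physical = [
    [[200.0, 210.0], [190.0, 220.0]],  # static strain field (microstrain)
    [[12.0, 12.0], [12.0, 12.0]],      # dominant frequency field (Hz)
    [[0.5, 1.5], [1.0, 2.5]],          # energy dissipation field
]
tensor = build_composite_tensor(rgb, physical)
print(len(tensor), tensor[3])
```

Normalizing each channel separately keeps microstrain, hertz, and 8-bit color values on a comparable scale before they enter the feature extraction network.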

[0011] In a further embodiment of the present invention, the coupled feature mining module is used to perform the following steps: encode the composite physical visual feature tensor with multi-channel convolutional kernels and extract intermediate feature maps containing texture and physical gradients; introduce a cross-attention mechanism that maps features from the color channel data to a query matrix and features from the physical feature channels to a key matrix and a value matrix; calculate the dot product of the query matrix and the key matrix to obtain relevance weights; use the relevance weights to perform a weighted summation of the value matrix, identifying regions of interest that are both physically abnormal and visually sensitive; and upsample and project the regions of interest to generate a damage-sensitive feature map.
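The query, key, and value steps above can be sketched in plain Python. The source only states that a dot product yields relevance weights and that these weight the value matrix. The softmax, the division by the square root of the feature dimension, the omission of learned projection layers, and the toy matrices are all assumptions borrowed from common attention formulations.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax (assumed; the source does not name it)."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query: Matrix, key: Matrix, value: Matrix) -> Matrix:
    """query: N x d visual features; key: M x d and value: M x dv physical features."""
    d = len(query[0])
    out = []
    for q in query:
        # Dot product of the query row with every key row gives relevance scores.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in key]
        weights = softmax(scores)
        # Weighted summation of the value rows.
        out.append([
            sum(w * v[j] for w, v in zip(weights, value))
            for j in range(len(value[0]))
        ])
    return out


# Toy example: each visual location attends mostly to its matching physical location.
query = [[1.0, 0.0], [0.0, 1.0]]
key = [[10.0, 0.0], [0.0, 10.0]]
value = [[1.0, 0.0], [0.0, 1.0]]
out = cross_attention(query, key, value)
print(out)
```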

[0012] In a further aspect of the present invention, the damage quantitative identification module is used to perform the following operations: binarize the damage-sensitive feature map and extract connected components; use a skeletonization algorithm to extract the skeleton lines of the connected components, and accumulate the Euclidean physical distance between the center points of adjacent pixels on the skeleton lines to determine the crack length; count the total number of pixels in the connected components and multiply it by the physical area represented by a single pixel to determine the damaged area; trace back to the continuous physical feature field grid and calculate the mean value of the physical field within the connected component to determine the stress response intensity; and, based on the positions of the crack length, damaged area, and stress response intensity in the structural health threshold matrix, determine the physical type and severity level of the damage and generate the structural damage identification results in combination with the three-dimensional spatial coordinates.
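The three quantities described above reduce to simple arithmetic once the skeleton and the connected component are available. The sketch below assumes that the skeleton pixels are already ordered along the crack and that every pixel covers a square of the same physical size. Neither assumption is stated in the source, and the function names and sample values are hypothetical.

```python
import math
from typing import List, Tuple

Pixel = Tuple[int, int]  # (row, col)


def crack_length(skeleton: List[Pixel], pixel_size_m: float) -> float:
    """Accumulate physical distances between consecutive skeleton pixel centers."""
    return sum(
        math.hypot(r2 - r1, c2 - c1) * pixel_size_m
        for (r1, c1), (r2, c2) in zip(skeleton, skeleton[1:])
    )


def damaged_area(pixel_count: int, pixel_size_m: float) -> float:
    """Pixel count times the physical area of one (assumed square) pixel."""
    return pixel_count * pixel_size_m ** 2


def stress_response_intensity(field_values: List[float]) -> float:
    """Mean of the physical field sampled inside the connected component."""
    return sum(field_values) / len(field_values)


# Made-up example: a 4-pixel skeleton at 2 mm per pixel.
skeleton = [(0, 0), (0, 1), (1, 2), (2, 3)]
length_m = crack_length(skeleton, pixel_size_m=0.002)
area_m2 = damaged_area(pixel_count=150, pixel_size_m=0.002)
intensity = stress_response_intensity([210.0, 230.0, 220.0])
print(length_m, area_m2, intensity)
```

Diagonal steps contribute the square root of two times the pixel size, which is why the source accumulates Euclidean distances rather than counting skeleton pixels.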

[0013] A further embodiment of the present invention includes a time-series evolution prediction module, which is used to perform the following steps: obtain structural damage identification results from multiple consecutive sampling periods and construct a time-series dataset of damage geometric parameters; input the time-series dataset into a long short-term memory network to predict the damage geometric parameters at future time steps; and generate a predictive assessment report containing damage evolution trends based on the prediction results.

[0014] In a further aspect of the present invention, generating the predictive assessment report includes: calculating the slope of the change in the damage geometric parameters over time; querying the structural limit state database based on the current physical type of damage to obtain the critical threshold; and, using the slope of change and the critical threshold, calculating by linear extrapolation the critical time point at which the damage expands to the critical state, and writing it into the predictive assessment report.
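The linear extrapolation above can be sketched as follows. The source does not say how the slope is estimated, so the least-squares fit here is an assumption, as are the function names, the units, and the sample history. At least two distinct time points are required.

```python
from typing import List, Optional


def least_squares_slope(times: List[float], values: List[float]) -> float:
    """Slope of the best-fit line through (time, value) pairs (assumed estimator)."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den


def critical_time(times: List[float], values: List[float],
                  threshold: float) -> Optional[float]:
    """Extrapolate from the latest observation to the critical threshold."""
    if values[-1] >= threshold:
        return times[-1]  # already at or beyond the critical state
    slope = least_squares_slope(times, values)
    if slope <= 0:
        return None  # damage is not growing, so no crossing is predicted
    return times[-1] + (threshold - values[-1]) / slope


# Made-up history: crack length in mm observed every 30 days.
days = [0.0, 30.0, 60.0, 90.0]
crack_mm = [10.0, 12.0, 14.0, 16.0]
t_crit = critical_time(days, crack_mm, threshold=25.0)
print(t_crit)
```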

[0015] A further aspect of this invention involves: performing local mesh correction on the building information model, based on the three-dimensional spatial coordinates and damage geometric parameters in the structural damage identification results, to generate a finite element calculation model; applying a preset load condition and solving the finite element calculation model; and simulating the stress redistribution state of the damaged area and its surroundings, with the simulation results used as a basis for risk assessment and written into the predictive assessment report.

[0016] In summary, the present invention has the following beneficial technical effects: 1. This invention transforms discrete strain sensor data into a continuous physical feature field grid and stacks it with image color channels using tensors, achieving pixel-level alignment of physical properties and visual information during the data input stage. Compared to the traditional method of processing physical parameters and visual images separately and then summarizing the results, this cross-modal data injection method allows the model to directly learn the coupling relationship between physical stress distribution and apparent texture features in spatial coordinates, providing a multi-dimensional input foundation for damage identification.

[0017] 2. By introducing preprocessing steps such as camera calibration, distortion correction, and real-time temperature compensation, this invention effectively reduces the impact of spurious strain noise and optical perspective distortion caused by environmental temperature differences on data quality. Simultaneously, by utilizing shared convolutional kernels to extract common features of physical field gradients and visual texture gradients, and combining this with a cross-attention mechanism to enhance the weights of regions with abnormal physical properties and sensitive visual features, the system can effectively capture minute damage features caused by early stress concentration, reducing the probability of missed or false detections caused by single visual detection or single physical monitoring.

[0018] 3. This invention not only extracts the geometric parameters of damage through semantic segmentation and skeletonization algorithms, but also traces back the physical characteristic field to obtain the stress response intensity, achieving an objective measurement of the degree of damage. By introducing a long short-term memory network to model historical monitoring data, the system can analyze the evolution of damage parameters and predict future trends. Combining finite element model correction and stress redistribution simulation, this invention extends the assessment scope from current static diagnosis to future dynamic prognosis, providing a quantitative decision-making basis for preventive maintenance of building structures.

Description of the Drawings

[0019] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the accompanying drawings used in the description of the embodiments or the prior art will be briefly introduced below. The drawings are used to provide a further understanding of the present invention.

[0020] Figure 1 is a schematic diagram of the framework in the embodiments of this application.

[0021] Figure 2 is a flowchart illustrating an embodiment of this application.

Detailed Implementation

[0022] A preferred embodiment of the present invention is described below with reference to Figures 1 and 2.

[0023] Referring to Figures 1 and 2, this invention proposes a building structure damage identification and assessment system that integrates multi-point strain sensing and AI, comprising the following modules:

The data synchronization and alignment module acquires the original strain sequence signals with a predetermined sampling frequency collected by strain sensors distributed at the preset sensor deployment nodes of the building structure, the real-time temperature data collected by temperature sensors, and the real-time surface images acquired by the vision acquisition terminal. It establishes a mapping relationship between the image pixel coordinates and the three-dimensional spatial coordinates of the strain sensors, and performs data alignment based on the timestamp to generate an initial spatiotemporal dataset.

The dynamic strain feature extraction module performs sliding window framing and time-frequency domain analysis on the original strain sequence signal with a predetermined sampling frequency in the initial spatiotemporal dataset to extract a multi-dimensional dynamic strain feature vector that reflects the local stiffness and energy characteristics of the structure.

The physical feature field construction module uses three-dimensional spatial coordinates as constraints to perform spatial interpolation on multi-dimensional dynamic strain feature vectors and construct a continuous physical feature field grid with the same resolution as the real-time surface image.

The cross-modal data injection module extracts the color channel data of the real-time surface image, stacks and normalizes it with the physical feature channels contained in the continuous physical feature field grid in the depth dimension, and generates a composite physical visual feature tensor.

The coupled feature mining module inputs the composite physical visual feature tensor into a deep feature extraction network to mine the correlation features between physical properties and visual texture, and generates a damage-sensitive feature map that reflects the location and shape of the damage.

The damage quantitative identification module performs semantic segmentation on the damage-sensitive feature map to extract damage geometric parameters, compares the damage geometric parameters with a preset structural health threshold matrix, and outputs the structural damage identification results.

[0024] In one embodiment of the present invention, the data synchronization and alignment module is used to perform the following steps: The intrinsic parameters of the visual acquisition terminal are calibrated and distortion is corrected. The scale-invariant feature transform algorithm is used to extract feature matching point pairs between the corrected real-time surface image and the rendered view of the building information model. Based on the feature matching point pairs, the PnP algorithm is used to solve the camera extrinsic parameters and construct the projective transformation matrix. The projective transformation matrix is used to project the three-dimensional spatial coordinates in the building information model onto the pixel plane of the real-time surface image to achieve registration between physical space and pixel space.

[0025] Specifically, a central data processing server constructs an initial dataset aligned in both the time and space dimensions. The central data processing server sends synchronization trigger commands via industrial Ethernet to the data acquisition units of the multiple strain sensors and associated temperature sensors deployed on key stress nodes of the building structure. The commands are time-synchronized using the Network Time Protocol (NTP) so that all data acquisition units start acquisition at the same moment. Each data acquisition unit acquires raw strain sequence signals at a predetermined sampling frequency and transmits the digital signal, including timestamps and strain values, back to the central data processing server in real time via the TCP/IP protocol.

[0026] Meanwhile, the central data processing server loads the unique identifier of each strain sensor and its pre-calibrated three-dimensional spatial coordinate information from the locally stored sensor spatial coordinate configuration file, and binds the coordinate information with the original strain sequence signal of the predetermined sampling frequency returned by the corresponding sensor to form a multi-channel original strain data stream with spatial labels and stores it in the dynamic data cache area.

[0027] Simultaneously, the central data processing server sends acquisition commands to visual acquisition terminals deployed in fixed locations or mounted on drones, initiating the acquisition of real-time surface images of the building structure. The visual acquisition terminals described in this invention include, but are not limited to, industrial cameras, surveillance cameras, infrared thermal imagers, or image acquisition devices mounted on drones. The visual acquisition terminals transmit timestamped image frames to the central data processing server via the Real Time Streaming Protocol (RTSP). After receiving the real-time surface images, the server first performs distortion correction preprocessing on the images using a preset camera intrinsic parameter matrix and distortion coefficients. Subsequently, it calls the image processing module and uses the Scale Invariant Feature Transform (SIFT) algorithm to extract feature points from the images. At the same time, a corresponding 3D view is rendered from the pre-loaded Building Information Model (BIM) based on the approximate pose of the visual acquisition terminal, and the SIFT feature points of that view are extracted. Matching the two sets of feature points yields a set of corresponding point pairs, each linking two-dimensional pixel coordinates to three-dimensional world coordinates. Based on these point pairs, the PnP algorithm is used to solve for the camera extrinsic parameters and construct a projective transformation matrix. This matrix determines the projective transformation relationship between the pixel coordinates of the real-time surface image and the three-dimensional spatial coordinates.

[0028] Finally, the central data processing server uses the timestamp of the received real-time surface image as the reference time point and traverses each spatially labeled raw strain data stream in the dynamic data buffer. For each data stream, the nearest neighbor sampling method is used to find and extract the strain data point closest to the reference time point's timestamp. The real-time surface image at the reference time point, all extracted strain data points, real-time temperature data, and their corresponding three-dimensional spatial coordinate information are combined into a data record and stored in the database, thereby generating a structured initial spatiotemporal dataset, providing a data foundation for subsequent feature extraction.

[0029] Establishing the projective transformation relationship between image pixel coordinates and 3D spatial coordinates involves calculating the mapping of 3D world coordinate points to 2D image plane coordinate points. This mapping relationship can be achieved using a 3×4 projective transformation matrix, the mathematical expression of which is as follows:

$$\begin{bmatrix} u' \\ v' \\ w' \end{bmatrix} = P \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$

[0030] where $[u', v', w']^{T}$ are the homogeneous coordinates on the image plane, and the actual image pixel coordinates are $(u'/w', v'/w')$; $[X, Y, Z, 1]^{T}$ is the representation of the three-dimensional spatial coordinates in a homogeneous coordinate system; and $P$ is the projective transformation matrix, formed by combining the camera intrinsic parameter matrix $K$ with the extrinsic parameter matrix $[R \mid t]$, that is, $P = K[R \mid t]$. The matrix $[R \mid t]$ is obtained by feature matching and the PnP algorithm.
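The projection described above can be sketched in Python using only the standard library. The rotation `R`, the translation `t`, the numeric intrinsics in `K`, and the test point are illustrative assumptions. In the described system the extrinsic parameters would come from the PnP step rather than being hard-coded.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def projection_matrix(K: Matrix, R: Matrix, t: List[float]) -> Matrix:
    """Build the 3x4 matrix P = K [R | t]."""
    Rt = [R[i] + [t[i]] for i in range(3)]
    return matmul(K, Rt)


def project(P: Matrix, point: Tuple[float, float, float]) -> Tuple[float, float]:
    """Map a 3D point to pixel coordinates by dividing out the homogeneous scale."""
    X = list(point) + [1.0]
    u_h, v_h, w_h = (sum(p * x for p, x in zip(row, X)) for row in P)
    if w_h == 0:
        raise ValueError("point lies on the camera's principal plane")
    return u_h / w_h, v_h / w_h


# Assumed intrinsics for a 1920 x 1080 image: focal length 1000 px, centered principal point.
K = [[1000.0, 0.0, 960.0], [0.0, 1000.0, 540.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # camera aligned with world axes
t = [0.0, 0.0, 0.0]
P = projection_matrix(K, R, t)
u, v = project(P, (1.0, 0.5, 10.0))
print(u, v)
```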

[0031] It should be noted that the raw strain sequence signal at the predetermined sampling frequency refers to the digital sequence converted from the raw electrical signal directly measured by the strain sensor, without filtering or downsampling; its unit is usually microstrain (με). The sampling frequency is set according to the structural dynamic characteristics of the monitored object, typically within the range of 100 Hz to 1000 Hz, to ensure that the dynamic response of the structure under external loads can be captured and to avoid signal aliasing. The three-dimensional spatial coordinate information is the sensor's physical location (x, y, z), in meters, defined in a global coordinate system or the building information model coordinate system. This information is accurately measured during the sensor installation phase using surveying equipment such as a total station and is pre-set in the system.

[0032] Building Information Modeling (BIM) is a digital 3D model that includes building geometry, physical information, and functional information. It serves as a benchmark for achieving precise registration between pixel coordinates and physical world coordinates. The initial spatiotemporal dataset is a structured data collection, with its basic unit being a data frame. Each data frame contains a uniform timestamp, a real-time surface image acquired at that timestamp, and a list of multiple sensor readings. Each element in the list contains the sensor ID, the strain value at that timestamp, and the sensor's 3D spatial coordinates.

[0033] For example, suppose that at time T0, 2023-10-27 10:00:00.000, the central data processing server sends a synchronous acquisition command to the acquisition units of two strain sensors, S-01 and S-02, installed on the main beam of a bridge. The configuration file shows that the three-dimensional spatial coordinates of S-01 are (50.2, 15.0, 30.5) meters and those of S-02 are (60.8, 15.0, 30.5) meters. Simultaneously, a visual acquisition terminal located on the bridge tower captures a 1920×1080 pixel real-time surface image, also timestamped at T0. SIFT feature matching is performed between this image and the corresponding view rendered from the BIM model, yielding over 100 matching point pairs, from which the projective transformation matrix P is calculated. The data alignment phase then begins. Using T0 as the reference, the server searches the data stream returned from S-01 and selects the record with the closest timestamp, {timestamp:10:00:00.002, strain:210.5 με}. Similarly, it selects the record with the closest timestamp in the data stream from S-02, {timestamp:10:00:00.001, strain:-88.2 με}. Finally, a new record is generated in the initial spatiotemporal dataset. This record contains the timestamp T0, the associated 1920×1080 pixel image data, and a strain information list [{sensor_id:'S-01', coordinates:(50.2, 15.0, 30.5), strain:210.5}, {sensor_id:'S-02', coordinates:(60.8, 15.0, 30.5), strain:-88.2}]. The record is stored in the database, completing a single synchronous acquisition and alignment cycle.
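The nearest-neighbor alignment in this example can be sketched as follows. The sensor IDs, coordinates, and the two selected readings come from the example above. The neighboring samples, the function names, and the record layout are assumptions added for illustration, and the image payload is left out.

```python
from bisect import bisect_left
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

Sample = Tuple[datetime, float]  # (timestamp, strain in microstrain)


def nearest_sample(stream: List[Sample], ref: datetime) -> Sample:
    """Return the sample closest in time to ref; stream must be sorted and non-empty."""
    times = [ts for ts, _ in stream]
    i = bisect_left(times, ref)
    candidates = stream[max(i - 1, 0): i + 1]  # neighbors on either side of ref
    return min(candidates, key=lambda s: abs(s[0] - ref))


def build_record(ref: datetime, streams: Dict[str, List[Sample]],
                 coordinates: Dict[str, Tuple[float, float, float]]) -> dict:
    """Assemble one data frame of the initial spatiotemporal dataset (image omitted)."""
    readings = []
    for sensor_id, stream in streams.items():
        _, strain = nearest_sample(stream, ref)
        readings.append({"sensor_id": sensor_id,
                         "coordinates": coordinates[sensor_id],
                         "strain": strain})
    return {"timestamp": ref, "strain_list": readings}


T0 = datetime(2023, 10, 27, 10, 0, 0)
ms = timedelta(milliseconds=1)
streams = {
    # Only the middle sample of each stream appears in the source; the others are made up.
    "S-01": [(T0 - 8 * ms, 209.9), (T0 + 2 * ms, 210.5), (T0 + 12 * ms, 211.0)],
    "S-02": [(T0 - 9 * ms, -88.0), (T0 + 1 * ms, -88.2), (T0 + 11 * ms, -88.5)],
}
coordinates = {"S-01": (50.2, 15.0, 30.5), "S-02": (60.8, 15.0, 30.5)}
record = build_record(T0, streams, coordinates)
print(record["strain_list"])
```

A binary search keeps each lookup logarithmic in the buffer length, which matters when every image frame triggers a lookup in every sensor stream.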

[0034] In one embodiment of the present invention, the dynamic strain feature extraction module is used to perform the following steps: Real-time temperature data from sensor deployment points is acquired, and temperature compensation correction is performed on the signals within each sampling frame. The arithmetic mean of the corrected signals within each sampling frame is calculated to obtain the static strain component. The dominant frequency of each sampling frame is extracted using Fast Fourier Transform. The energy components of each frequency band are calculated and summed using wavelet packet decomposition to obtain the root mean square value of dynamic strain. The rate of change of the dominant frequency over time is calculated to obtain the frequency offset feature. The static strain component, dominant frequency, root mean square value of dynamic strain, and frequency offset feature are vectorized and concatenated to generate a multi-dimensional dynamic strain feature vector.

[0035] Specifically, the central data processing server extracts key features characterizing the structural dynamic response from the high-frequency time-series signals contained in the initial spatiotemporal dataset. The central data processing server first performs sliding window framing processing on the raw strain sequence signals of each strain sensor at a predetermined sampling frequency from the initial spatiotemporal dataset. The server sets a fixed-length data window and a window step size, and moves the window along the time axis, dividing the continuous raw strain sequence signals at the predetermined sampling frequency into a series of overlapping sampling frames. For each sampling frame, the server initiates multiple feature extraction subtasks in parallel.
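The sliding window framing can be sketched in a few lines. The window length and step are given in samples, and dropping a trailing incomplete frame is an assumed convention that the source does not specify.

```python
from typing import List


def frame_signal(signal: List[float], window: int, step: int) -> List[List[float]]:
    """Split a signal into fixed-length frames; frames overlap when step < window."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    # A trailing partial frame is discarded (assumed behavior).
    return [signal[s:s + window] for s in range(0, len(signal) - window + 1, step)]


# Ten samples, window of 4, step of 2: consecutive frames share two samples.
frames = frame_signal([float(i) for i in range(10)], window=4, step=2)
print(frames)
```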

[0036] The first subtask uses synchronously acquired temperature sensor data to perform temperature compensation on the strain signal according to the preset linear expansion coefficient, eliminating spurious strain caused by environmental temperature difference; then it calculates the arithmetic mean of all strain data points within the sampling frame, and uses this result as the static strain component characterizing the quasi-static load level of the structure during this time period.
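A minimal sketch of the first subtask follows. The source names a preset linear expansion coefficient but gives no formula, so the linear correction below (measured strain minus coefficient times temperature deviation from a reference) is an assumed, commonly used form. The coefficient, reference temperature, and readings are made-up values.

```python
from typing import List


def compensate_temperature(strain: List[float], temperature_c: List[float],
                           alpha_ue_per_c: float, reference_c: float) -> List[float]:
    """Remove thermally induced apparent strain (assumed linear model)."""
    return [e - alpha_ue_per_c * (t - reference_c)
            for e, t in zip(strain, temperature_c)]


def static_strain_component(frame: List[float]) -> float:
    """Arithmetic mean of the corrected samples in one sampling frame."""
    return sum(frame) / len(frame)


# Hypothetical frame: strain in microstrain, temperature in degrees Celsius.
raw = [212.0, 214.0, 216.0]
temps = [21.0, 22.0, 23.0]
corrected = compensate_temperature(raw, temps, alpha_ue_per_c=12.0, reference_c=20.0)
static = static_strain_component(corrected)
print(corrected, static)
```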

[0037] The second subtask preprocesses the strain data within the sampled frame using a Hamming window function to reduce spectral leakage, and then performs a Fast Fourier Transform to convert the time-domain signal into a frequency-domain spectrum. The corresponding frequency value is determined by searching for the spectral line with the largest amplitude in the frequency-domain spectrum; this value is the signal's dominant oscillation frequency.
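The second subtask can be illustrated with NumPy as below. The sketch removes the frame mean before windowing and skips the zero-frequency bin when searching for the peak; both steps are assumptions added so that the static offset does not mask the oscillation peak.

```python
import numpy as np

def dominant_frequency(frame, sampling_rate_hz):
    """Return the frequency (Hz) of the largest spectral line in one sampling frame."""
    frame = np.asarray(frame, dtype=float)
    # Hamming window reduces spectral leakage; the mean is removed first (assumption).
    windowed = (frame - frame.mean()) * np.hamming(frame.size)
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(frame.size, d=1.0 / sampling_rate_hz)
    peak = int(np.argmax(spectrum[1:])) + 1  # skip the DC bin
    return float(freqs[peak])

fs = 500.0
t = np.arange(1024) / fs
frame = 205.8 + 20.0 * np.sin(2.0 * np.pi * 5.2 * t)  # synthetic test frame
f_dom = dominant_frequency(frame, fs)
```

With a 1024-point frame at 500 Hz the spectral line spacing is about 0.49 Hz, so the returned value is the bin nearest to the true oscillation frequency rather than the exact frequency.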

[0038] The third subtask first subtracts the previously calculated static strain component from the original signal of the current sampling frame to obtain a zero-mean dynamic strain signal. Next, the server invokes a wavelet packet decomposition algorithm, using preset wavelet basis functions to perform multi-level decomposition of the dynamic strain signal into multiple orthogonal frequency bands. The energy component of each frequency band is obtained as the sum of squares of the wavelet packet coefficients within that band. The energy components of all frequency bands are then summed, divided by the number of sampling points in the frame, and the square root is taken to obtain the root mean square value of the dynamic strain, which reflects the energy intensity of the structural vibration.

[0039] The fourth subtask is responsible for extracting frequency offset features. It compares the dominant frequency calculated in the current sampling frame with the dominant frequency of the previous sampling frame stored in the buffer, calculates the ratio of the difference between the two to the time step, and obtains the rate of change of the dominant frequency.

[0040] Finally, the central data processing server organizes and splices the static strain components, dominant frequency, root mean square value of dynamic strain, and frequency offset characteristics calculated for each sampling frame into a one-dimensional numerical array in a predefined order, thereby generating a multi-dimensional dynamic strain feature vector. This vector is attached to the corresponding timestamp and sensor identifier for use in the next stage of spatial field construction.
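The fourth subtask and the final concatenation can be sketched as follows. The function names are illustrative, and the root mean square value passed in the usage example is a placeholder rather than a figure from this embodiment.

```python
import numpy as np

def frequency_offset(current_hz, previous_hz, frame_step_s):
    """Rate of change of the dominant frequency between two consecutive frames (Hz/s)."""
    return (current_hz - previous_hz) / frame_step_s

def build_feature_vector(static_ue, dominant_hz, rms_ue, offset_hz_per_s):
    """Concatenate the four features in a fixed, predefined order."""
    return np.array([static_ue, dominant_hz, rms_ue, offset_hz_per_s], dtype=float)

frame_step_s = 512 / 500.0                      # 512-sample step at 500 Hz
offset = frequency_offset(5.2, 5.1, frame_step_s)
vector = build_feature_vector(205.8, 5.2, 16.0, offset)  # 16.0 is a placeholder RMS value
```

Keeping the order fixed means that position 0 is always static strain, position 1 the dominant frequency, and so on, which is what allows the later stage to split the vectors back into per-feature datasets.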

[0041] In this embodiment, the root mean square value of dynamic strain is calculated by integrating the energy of each frequency band after wavelet packet decomposition. Assume that after $L$-level wavelet packet decomposition the dynamic strain signal is decomposed into $2^{L}$ frequency bands, and that the set of coefficients of the $j$-th frequency band is $\{c_{j,k}\}$, where $k = 1, 2, \dots, K_j$. The root mean square value of dynamic strain $\varepsilon_{\mathrm{rms}}$ is then calculated as:

$$\varepsilon_{\mathrm{rms}} = \sqrt{\frac{1}{N}\sum_{j=1}^{2^{L}}\sum_{k=1}^{K_j} c_{j,k}^{2}}$$

[0042] where $N$ is the total number of sampling points in the current sampling frame, $K_j$ is the number of coefficients within the $j$-th frequency band, and $c_{j,k}$ is the $k$-th wavelet packet coefficient in that frequency band. The formula essentially calculates the total energy of the signal based on Parseval's theorem and normalizes it to the root mean square value at a single sampling point.
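The energy-based root mean square calculation can be illustrated with a self-contained sketch. To avoid an external wavelet library, the sketch swaps in an orthonormal Haar wavelet packet for the db-series wavelet named in this embodiment. The Parseval relation it relies on holds for any orthonormal basis, but the per-band energies will differ from those of db4.

```python
import numpy as np

def haar_wavelet_packet(signal, levels):
    """Full Haar wavelet packet tree; returns the 2**levels leaf coefficient arrays.

    Leaves are in natural (tree) order, not sorted by frequency.
    """
    nodes = [np.asarray(signal, dtype=float)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            even, odd = node[0::2], node[1::2]
            next_nodes.append((even + odd) / np.sqrt(2.0))  # approximation branch
            next_nodes.append((even - odd) / np.sqrt(2.0))  # detail branch
        nodes = next_nodes
    return nodes

def dynamic_strain_rms(frame, levels=4):
    """Return (RMS of the zero-mean frame, energy of each frequency band)."""
    frame = np.asarray(frame, dtype=float)
    if frame.size % (2 ** levels) != 0:
        raise ValueError("frame length must be divisible by 2**levels")
    dynamic = frame - frame.mean()                 # remove the static strain component
    bands = haar_wavelet_packet(dynamic, levels)
    band_energy = np.array([np.sum(b ** 2) for b in bands])
    # Total energy normalized by the number of sampling points, then square root.
    return float(np.sqrt(band_energy.sum() / frame.size)), band_energy

t = np.arange(1024) / 500.0
frame = 205.8 + 16.0 * np.sin(2.0 * np.pi * 5.2 * t)  # synthetic test frame
rms, energies = dynamic_strain_rms(frame, levels=4)
```

Because the transform is orthonormal, `rms` agrees with the time-domain standard deviation of the frame, which is a convenient check on any implementation of this step.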

[0043] The frequency offset feature $\Delta f$ is calculated as:

$$\Delta f = \frac{f_{t} - f_{t-1}}{\Delta t}$$

[0044] It should be noted that $f_{t}$ is the dominant frequency of the current sampling frame, $f_{t-1}$ is the dominant frequency of the previous sampling frame, and $\Delta t$ is the time interval between the center points of the two sampling frames. The window length for sliding window framing is typically set to cover several cycles of the structure's lowest natural frequency, for example, corresponding to a data length of 2 to 5 seconds. The overlap rate is generally set between 50% and 75% to ensure a smooth transition of time-domain characteristics. The unit of the static strain component is microstrain (με). The dominant frequency is the vibration frequency at which the structure responds most significantly within the current time period, measured in Hertz (Hz). Its variation reflects changes in structural stiffness.

[0045] The root mean square value of dynamic strain, measured in microstrain με, is an important indicator of the magnitude of structural vibration energy. The wavelet basis functions for wavelet packet decomposition are selected based on signal characteristics; for example, the db series wavelets can be used for impact response signals, and the number of decomposition levels is typically set to 3 to 5. Frequency offset characteristics characterize the rate of change of the structural dynamic properties, measured in Hz / s; a non-zero value may indicate that damage is occurring or developing. The multidimensional dynamic strain feature vector is a numerical vector that incorporates multiple physical meanings; its fixed dimensions facilitate processing by subsequent machine learning models.

[0046] For example, continuing from the previous example, the system processes the raw strain sequence signal from sensor S-01 at the predetermined sampling frequency. Assume a window length of 1024 sampling points, an overlap rate of 50%, and a sampling frequency of 500 Hz, so that each frame spans 2.048 seconds. The system extracts the sampling frame containing time T0, in which the data, centered at 210.5 με, exhibits sinusoidal oscillations superimposed on a slowly changing baseline. First, the arithmetic mean of these 1024 data points is calculated, giving a static strain component of 205.8 με. Next, a Fast Fourier Transform is performed on this data frame and a spike is found at 5.2 Hz in the spectrum, so the dominant frequency is 5.2 Hz. Then, the mean value of 205.8 με is subtracted from the original signal, and the resulting dynamic signal is subjected to 4-level db4 wavelet packet decomposition to calculate the energy of 16 frequency bands. All energies are accumulated and the root mean square value of the dynamic strain is obtained according to the formula. Finally, the dominant frequency calculated from the previous sampling frame is read from the buffer as 5.1 Hz. The time step between the two frames is 1.024 seconds, which is the time for 512 sampling points, so the frequency offset feature is (5.2 - 5.1) / 1.024 ≈ 0.098 Hz/s. A multi-dimensional dynamic strain feature vector is thus generated for sensor S-01 for this time period. The system performs the same processing on S-02 and all other sensors, generating a set of feature vectors carrying rich dynamic information for subsequent steps.

[0047] In one embodiment of the present invention, the physical feature field construction module is used to perform the following steps: The multi-dimensional dynamic strain feature vector is separated into independent feature component datasets. Rays pointing from pixels to physical space are constructed using the inverse transformation of the projective transformation matrix. The intersection points of the rays and the geometric surface of the building information model are solved by the ray tracing algorithm to determine the unique three-dimensional spatial coordinates of each pixel. The structural topology data in the building information model is retrieved to determine whether the pixel coordinates and the deployment points of each sensor belong to the same structural component. Under the structural topology constraints, the radial basis function interpolation algorithm is used to calculate the interpolated feature values at the unique three-dimensional spatial coordinates of the pixel, with the three-dimensional spatial coordinates of the sensors within the same component as control points. A static strain field map, a dominant frequency distribution field map, and an energy dissipation field map are generated respectively, and the three are integrated in the depth direction to generate a continuous physical feature field grid.

[0048] Specifically, the central data processing server extends the discrete multi-dimensional dynamic strain feature vectors generated in the previous stage into a continuous physical field covering the structure surface through spatial interpolation techniques. First, the central data processing server reads the multi-dimensional dynamic strain feature vectors generated by all sensors at the same timestamp and their corresponding three-dimensional spatial coordinates from the cache.

[0049] The server separates the feature components in each vector, forming multiple independent discrete datasets indexed by three-dimensional spatial coordinates, corresponding to static strain, dominant frequency, and root mean square value of dynamic strain, respectively. For each independent discrete dataset, the server employs a spatial nonlinear interpolation algorithm based on radial basis functions to construct a continuous feature field function. This algorithm uses the three-dimensional spatial coordinates of each sensor as control points, with the corresponding feature values as the values at the control points. By solving a system of linear equations, the weight coefficients of the radial basis functions are determined, thereby generating an interpolation model capable of predicting feature values at any point in space.

[0050] Next, the central data processing server creates a blank two-dimensional target grid based on the resolution of the real-time surface image. Then, the server iterates through each pixel of this target grid. For each pixel coordinate, the server uses the inverse transformation of the established projective transformation relationship to construct a three-dimensional ray originating from the camera's optical center and passing through the pixel plane, and calls the ray tracing engine to calculate the coordinates of the first intersection point between this ray and the structural surface defined by the Building Information Model. These intersection point coordinates are the unique three-dimensional spatial coordinates corresponding to that pixel in the physical world.
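The pixel-to-surface mapping can be sketched under simplifying assumptions: a pinhole camera with intrinsic matrix `K`, a world-to-camera pose `R`, `t`, and a single plane standing in for the Building Information Model surface. A real ray tracing engine would intersect the ray with the full model geometry. All names and numeric values here are hypothetical.

```python
import numpy as np

def pixel_ray(u, v, K, R, t):
    """Ray from the camera optical center through pixel (u, v), in world coordinates.

    Assumes the camera model x_cam = R @ x_world + t.
    """
    origin = -R.T @ t                                     # optical center in world frame
    direction = R.T @ np.linalg.solve(K, np.array([u, v, 1.0]))
    return origin, direction / np.linalg.norm(direction)

def intersect_plane(origin, direction, normal, offset):
    """First intersection of the ray with the plane normal . x = offset, or None."""
    denom = float(normal @ direction)
    if abs(denom) < 1e-12:
        return None                                       # ray parallel to the plane
    s = (offset - float(normal @ origin)) / denom
    if s <= 0.0:
        return None                                       # plane is behind the camera
    return origin + s * direction

K = np.array([[1000.0, 0.0, 960.0],
              [0.0, 1000.0, 540.0],
              [0.0, 0.0, 1.0]])                           # hypothetical intrinsics
R, t = np.eye(3), np.zeros(3)                             # camera at the world origin
origin, direction = pixel_ray(960, 540, K, R, t)
hit = intersect_plane(origin, direction, np.array([0.0, 0.0, 1.0]), 30.5)
```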

[0051] Subsequently, the server inputs the three-dimensional spatial coordinates into the three previously constructed radial basis function interpolation models. To prevent strain signals from being incorrectly interpolated across physical fracture surfaces, such as beam-column gaps or settlement joints, the system introduces structural topology constraints from the Building Information Model: when calculating the interpolation of a certain pixel, the component ID of that point in the BIM model is first retrieved, and only sensors with the same component ID or located on a preset mechanical transmission path are selected as valid control points.
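The structural topology constraint amounts to filtering the control points before interpolation. A minimal sketch follows, assuming each sensor record carries a `component_id` and that mechanical transmission paths are given as a mapping from a component ID to the IDs of linked components. The data layout and the IDs are hypothetical.

```python
def select_control_points(pixel_component_id, sensors, transmission_paths=None):
    """Keep only sensors on the pixel's component or on a linked transmission path."""
    linked = set((transmission_paths or {}).get(pixel_component_id, ()))
    allowed = {pixel_component_id} | linked
    return [s for s in sensors if s["component_id"] in allowed]

sensors = [
    {"id": "S-01", "component_id": "beam-B12"},
    {"id": "S-02", "component_id": "column-C03"},
    {"id": "S-03", "component_id": "beam-B12"},
]

same_component = select_control_points("beam-B12", sensors)
with_path = select_control_points("beam-B12", sensors, {"beam-B12": {"column-C03"}})
```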

[0052] Subsequently, the interpolated static strain value, interpolated dominant frequency value, and interpolated root mean square value of dynamic strain at that point are calculated. The calculated interpolated static strain value is assigned to the position of the corresponding pixel in a matrix of the same size as the target grid. After filling, this matrix constitutes the static strain field map. Similarly, the generated interpolated dominant frequency value and interpolated root mean square value of dynamic strain are used to generate the dominant frequency distribution field map and the energy dissipation field map, respectively. Finally, the central data processing server stacks the three two-dimensional matrices of the static strain field map, dominant frequency distribution field map, and energy dissipation field map in the depth dimension, integrating them into a three-channel numerical array. This array is the continuous physical feature field grid, whose spatial resolution is completely consistent with the real-time surface image, ensuring pixel-level alignment.

[0053] In this embodiment, radial basis function interpolation is taken as an example of the spatial nonlinear interpolation operation. For any point $\mathbf{x}$ in space, its interpolated value $s(\mathbf{x})$ is represented as a weighted sum of a set of radial basis functions $\varphi$:

$$s(\mathbf{x}) = \sum_{i=1}^{M} w_i\,\varphi\left(\lVert \mathbf{x} - \mathbf{x}_i \rVert\right)$$

[0054] where $\mathbf{x}$ is the three-dimensional spatial coordinates of the point to be interpolated, $\mathbf{x}_i$ is the three-dimensional spatial coordinates of the $i$-th strain sensor, $M$ is the total number of sensors, $w_i$ are the weighting coefficients to be solved, $\lVert\cdot\rVert$ is the Euclidean distance between two points, and $\varphi$ is the radial basis function, such as the Gaussian function $\varphi(r) = e^{-(r/c)^{2}}$, where $c$ is the shape parameter. The weighting coefficients $w_i$ are obtained by solving the system of linear equations $\boldsymbol{\Phi}\mathbf{w} = \mathbf{f}$, where the matrix elements are $\Phi_{ij} = \varphi(\lVert \mathbf{x}_i - \mathbf{x}_j \rVert)$ and $\mathbf{f}$ is the vector composed of the feature values measured by all sensors. Spatial nonlinear interpolation is a method for estimating the data distribution over an entire region from discrete sampled data. Compared to linear interpolation, it can better fit complex and changing physical fields. The static strain field map is a two-dimensional matrix with the same size as the pixels of the real-time surface image. The value of each element in the matrix represents the magnitude of the static strain on the physical structure surface at the corresponding pixel location.

[0055] The structures of the dominant frequency distribution field map and the energy dissipation field map are similar to those of the static strain field map, with their element values representing the dominant frequency and root mean square value of the dynamic strain at the corresponding locations, respectively. The energy dissipation field map intuitively reflects the distribution of energy consumption in different parts of the structure during vibration. The continuous physical feature field grid is a three-dimensional array, with the first two dimensions matching the image resolution and the third dimension having a size of 3, storing the data from the three physical feature field maps mentioned above, thus realizing the continuous and multi-channel representation of physical features in pixel space.

[0056] For example, following the previous step, the system obtained a static strain value of 205.8 με for S-01 at coordinates (50.2, 15.0, 30.5) and a static strain value of -88.2 με for S-02 at coordinates (60.8, 15.0, 30.5). To make the interpolation valid, assume there is a third sensor, S-03, located at (55.5, 20.0, 30.5), with a static strain value of 150.0 με. Interpolation is performed for the static strain field map. First, the system constructs an RBF model and solves for the weights $w_1$, $w_2$, $w_3$. Next, it calculates the physical field value corresponding to a certain pixel in the real-time surface image, such as the center pixel (960, 540). Using inverse projection and BIM model lookup, the physical coordinates of this pixel on the structural surface are determined to be (55.0, 16.0, 30.5) meters.

[0057] Then, the Euclidean distances between this point and the three sensors are calculated (approximately 4.90 m, 5.89 m, and 4.03 m for S-01, S-02, and S-03, respectively), and the interpolated strain value at that point is obtained from the RBF interpolation formula. With the weights $w_1$, $w_2$, $w_3$ obtained by solving the linear equations and a Gaussian radial basis function, the interpolation result is 185.7 με. The value 185.7 is then assigned to the position at coordinates (960, 540) in the static strain field map matrix. This process is repeated for each pixel in the image to generate a complete 1920×1080 static strain field map. The same process is applied to the dominant frequency and the root mean square value of the dynamic strain to generate the dominant frequency distribution field map and the energy dissipation field map. Finally, these three 1920×1080 matrices are stacked into a 1920×1080×3 continuous physical feature field grid, preparing the data for the next step of cross-modal fusion.
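The interpolation in this example can be sketched with NumPy as below. The Gaussian form `exp(-(r / shape)**2)` and the shape parameter value of 5.0 are assumptions, so the value returned for the query point is not expected to reproduce the 185.7 με stated above.

```python
import numpy as np

def gaussian_rbf(r, shape):
    """Gaussian radial basis function; the exact form used here is an assumption."""
    return np.exp(-(r / shape) ** 2)

def fit_rbf(points, values, shape):
    """Solve Phi @ w = f for the weights, with Phi_ij = phi(|x_i - x_j|)."""
    points = np.asarray(points, dtype=float)
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    return np.linalg.solve(gaussian_rbf(dists, shape), np.asarray(values, dtype=float))

def evaluate_rbf(query, points, weights, shape):
    """Interpolated value at one query point as a weighted sum of basis functions."""
    diffs = np.asarray(points, dtype=float) - np.asarray(query, dtype=float)
    return float(gaussian_rbf(np.linalg.norm(diffs, axis=-1), shape) @ weights)

# Sensor coordinates and static strain values (microstrain) from the example above.
points = np.array([[50.2, 15.0, 30.5], [60.8, 15.0, 30.5], [55.5, 20.0, 30.5]])
values = np.array([205.8, -88.2, 150.0])
shape = 5.0                                   # hypothetical shape parameter
weights = fit_rbf(points, values, shape)
center_value = evaluate_rbf([55.0, 16.0, 30.5], points, weights, shape)
```

A useful sanity check is that the model reproduces each sensor's measured value when evaluated at that sensor's own coordinates.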

[0058] In one embodiment of the present invention, the cross-modal data injection module is configured to perform the following steps: The real-time surface image is decomposed to obtain three color channels: red, green, and blue. The static strain field map, the dominant frequency distribution field map, and the energy dissipation field map are used as three physical feature channels. The red, green, and blue color channel data are stacked with the three physical feature channels to form six-channel data. The min-max normalization method is applied to each channel of the six-channel data at the pixel level to generate a composite physical visual feature tensor.

[0059] Specifically, the central data processing server performs deep fusion of spatially aligned visual information and physical feature information at the data structure layer. The central data processing server retrieves the digital matrix of the generated real-time surface image from local storage. Using an image processing library, such as OpenCV, the server decomposes the image matrix into independent red, green, and blue color channel data, each channel being a two-dimensional matrix with the same resolution as the original image.

[0060] Simultaneously, the server retrieves the generated continuous physical feature field raster, a three-dimensional array containing three physical feature channels. The server also decomposes this raster into three independent two-dimensional matrices: a static strain field map, a dominant frequency distribution field map, and an energy dissipation field map. Next, the server performs the core tensor stacking operation. Using numerical computation libraries such as NumPy or TensorFlow, the server concatenates the red, green, and blue color channel data matrices with the three physical feature field map matrices along the depth dimension. This operation merges the six two-dimensional matrices into a unified six-channel three-dimensional array, whose spatial dimensions are consistent with the real-time surface image. Finally, to eliminate differences in data units and numerical ranges between different channels, the server performs channel-by-channel normalization on this six-channel array.

[0061] For each channel, the server calculates the maximum and minimum values of all pixels within that channel, and then applies a linear mapping function to scale all values of that channel to a preset normalization range. After normalization of all six channels, the resulting data structure is the composite physical vision feature tensor, which is then transmitted to the deep learning inference module to prepare for subsequent feature mining.
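The stacking and channel-by-channel normalization can be sketched as follows, assuming the image and the physical feature grid are already aligned arrays of shape (height, width, 3). Mapping a constant channel to zero is a choice made for this sketch to avoid division by zero.

```python
import numpy as np

def build_composite_tensor(rgb_image, physical_grid, eps=1e-12):
    """Stack RGB and physical channels, then min-max normalize each channel to [0, 1]."""
    stacked = np.concatenate(
        [rgb_image.astype(float), physical_grid.astype(float)], axis=-1
    )                                                   # shape (H, W, 6)
    ch_min = stacked.min(axis=(0, 1), keepdims=True)    # per-channel minimum
    ch_max = stacked.max(axis=(0, 1), keepdims=True)    # per-channel maximum
    span = np.where(ch_max - ch_min > eps, ch_max - ch_min, 1.0)
    return (stacked - ch_min) / span

rgb = np.arange(60, dtype=np.uint8).reshape(4, 5, 3)            # toy 4x5 image
phys = np.linspace(-500.0, 500.0, 60).reshape(4, 5, 3)          # toy physical grid
tensor = build_composite_tensor(rgb, phys)
```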

[0062] In this embodiment, the normalization process employs the min-max normalization method, processing the data for each channel independently. For any pixel value $x$ in any channel, its normalized value $x'$ is calculated as:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

[0063] where $x$ is the original pixel value before normalization, $x_{\min}$ is the minimum value of that channel over the entire field map or image, and $x_{\max}$ is the maximum value of that channel over the entire field map or image. This formula linearly maps the raw data to the interval [0, 1].

[0064] It should be noted that the red, green, and blue color channels are components of a standard digital image, typically represented by 8-bit unsigned integer values ranging from 0 to 255. Physical feature channels refer to single physical quantity field maps decomposed from a continuous physical feature field raster; they are considered equivalent to color channels in terms of data structure. Depth dimension refers to the dimension perpendicular to the spatial plane in a multidimensional array representing an image or feature map; this dimension is used to index different feature channels. Pixel-level normalization is a crucial preprocessing step to ensure that the data input to the deep learning model has good distribution characteristics. By scaling all feature channel data to the same numerical range, such as [0, 1], it prevents channels with large values from dominating the gradient calculation. The composite physical visual feature tensor is a high-dimensional data structure where the value at each spatial location is no longer a simple color vector, but a six-dimensional feature vector containing color information and the physical state information of that point, constituting a comprehensive description of the structural state.

[0065] For example, continuing from the previous example, the system processes data located at pixel coordinates (960, 540). First, the RGB color value of this pixel is read from the real-time surface image as [128, 130, 135]. From the corresponding position in the continuous physical feature field grid, the physical feature vector is read as [185.7 με, 5.3 Hz, 16.1 με]. Next, the server stacks these two vectors along the depth dimension to form a six-dimensional unnormalized feature vector. Subsequently, normalization is performed. Based on statistics of the entire dataset or preset physical limits, the value ranges for each channel are determined: red channel [0, 255], green channel [0, 255], blue channel [0, 255], static strain channel [-500, 500] με, dominant frequency channel [1.0, 20.0] Hz, and energy dissipation channel [0, 50.0] με. The min-max normalization formula is then applied to each component as follows: the normalized value for the red channel is (128-0) / (255-0)≈0.502; for the green channel it is (130-0) / (255-0)≈0.510; for the blue channel it is (135-0) / (255-0)≈0.529; for the static strain channel it is (185.7-(-500)) / (500-(-500))=685.7 / 1000=0.6857; for the dominant frequency channel it is (5.3-1.0) / (20.0-1.0)=4.3 / 19.0≈0.226; and for the energy dissipation channel it is (16.1-0) / (50.0-0)=0.322. Finally, the normalized feature vector corresponding to this pixel is [0.502, 0.510, 0.529, 0.6857, 0.226, 0.322]. This process is performed in parallel on all pixels, ultimately generating a complete 1920×1080×6 composite physical vision feature tensor.
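The arithmetic of this example can be checked with a short sketch that normalizes against the preset channel limits listed above. Clipping out-of-range values to [0, 1] is an assumption of the sketch and is not stated in this embodiment.

```python
import numpy as np

# Preset (min, max) per channel: R, G, B, static strain, dominant frequency, energy.
CHANNEL_LIMITS = np.array([
    [0.0, 255.0], [0.0, 255.0], [0.0, 255.0],
    [-500.0, 500.0], [1.0, 20.0], [0.0, 50.0],
])

def normalize_with_limits(vector, limits=CHANNEL_LIMITS):
    """Min-max normalize a six-dimensional feature vector against preset limits."""
    vector = np.asarray(vector, dtype=float)
    lo, hi = limits[:, 0], limits[:, 1]
    return np.clip((vector - lo) / (hi - lo), 0.0, 1.0)  # clipping is an assumption

normalized = normalize_with_limits([128, 130, 135, 185.7, 5.3, 16.1])
```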

[0066] In one embodiment of the present invention, the coupled feature mining module is used to perform the following steps: Multi-channel convolutional kernels are used to encode composite physical visual feature tensors to extract intermediate feature maps containing texture and physical gradients. A cross-attention mechanism is introduced to map features from color channel data to a query matrix and features from physical feature channels to a key matrix and a value matrix. The dot product of the query matrix and the key matrix is calculated to obtain relevance weights, and the relevance weights are used to perform weighted summation on the value matrix to identify regions of interest with abnormal physical properties and sensitive visual features. The regions of interest are upsampled and projected to generate damage-sensitive feature maps.

[0067] Specifically, the deep learning inference module within the central data processing server receives the generated composite physical-visual feature tensor as input. This module is typically deployed on a graphics processing unit (GPU) to accelerate computation. The composite physical-visual feature tensor is then fed into the encoder section of a pre-defined convolutional neural architecture. This encoder consists of multiple stacked convolutional layers, activation function layers, and pooling layers. As the composite physical-visual feature tensor flows through the first convolutional layer, this layer utilizes a multi-channel convolutional kernel, such as a 3×3×6 filter, to simultaneously weight the data from all six channels. This convolutional kernel shares weights in the spatial dimension but has independent parameters in the channel dimension, enabling it to learn the coupling patterns between visual texture and physical fields within the same spatial region. For example, it can simultaneously capture the corresponding numerical jumps in the physical feature channels while perceiving linear discontinuities in pixel textures. This process couples visual and physical features at the lowest-level feature extraction stage.
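What a single 3×3×6 kernel computes can be shown with a plain NumPy sketch. It uses zero padding and the cross-correlation form common in deep learning frameworks. In practice a GPU framework would apply many such kernels at once, and the explicit loops here are for illustration only.

```python
import numpy as np

def conv2d_multichannel(tensor, kernel, bias=0.0):
    """Apply one (kh, kw, C) kernel to an (H, W, C) tensor with zero padding.

    The same kernel weights are reused at every spatial position, while each
    input channel has its own slice of weights. Assumes odd kernel sizes.
    """
    h, w, c = tensor.shape
    kh, kw, kc = kernel.shape
    if c != kc:
        raise ValueError("kernel depth must match the number of channels")
    padded = np.pad(tensor, ((kh // 2, kh // 2), (kw // 2, kw // 2), (0, 0)))
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            # Weighted sum over the 3x3 neighborhood and all six channels at once.
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw, :] * kernel) + bias
    return out

feature_map = conv2d_multichannel(np.ones((5, 5, 6)), np.ones((3, 3, 6)))
```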

[0068] Subsequently, after several layers of preliminary feature extraction, the generated intermediate feature maps are fed into the cross-attention mechanism module. In this module, the feature maps derived from the red, green, and blue color channels are linearly transformed into a query matrix $Q$. The feature maps originating from the three physical feature channels are linearly transformed into a key matrix $K$ and a value matrix $V$. The dot product of the query matrix $Q$ and the transpose of the key matrix $K$ is calculated, scaled, and passed through a softmax activation function to generate a set of correlation weights. These weights reflect the strength of the association between the visual and physical features at each pixel location. This set of correlation weights is then used to perform a weighted summation over the value matrix $V$, yielding an attention-enhanced feature map. This operation allows the network to focus on regions of interest that are physically anomalous and visually sensitive, while suppressing irrelevant or noise-induced artifact regions.

[0069] Finally, the attention-enhanced feature map continues to propagate deeper into the convolutional neural architecture. Through further convolution and pooling operations, the network extracts higher-level, high-dimensional abstract features. These features are ultimately upsampled and projected onto a single-channel output map, where each pixel value represents the probability of structural damage at that location. This output map is the damage-sensitive feature map.

[0070] In this embodiment, the core calculation of the cross-attention mechanism can be represented by the scaled dot product attention formula:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$

[0071] where $Q$ is the query matrix, obtained by multiplying the feature map $F_v$ from the visual feature channels by a learnable weight matrix $W_Q$; $K$ is the key matrix, obtained by multiplying the feature map $F_p$ of the physical feature channels by the weight matrix $W_K$; $V$ is the value matrix, obtained by multiplying $F_p$ by the weight matrix $W_V$; and $d_k$ is the dimension of the key vector, used to scale the dot product result to maintain gradient stability. The softmax function converts the computed relevance scores into a probability distribution ranging from 0 to 1, i.e., relevance weights. The output of this formula is a new feature map, in which features containing physical information are reweighted according to their relevance to visual features. In the attention formula, the input feature maps $F_v$ and $F_p$ both originate from the composite physical visual feature tensor.
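Scaled dot product cross-attention can be sketched in NumPy as below, with spatial positions flattened into rows. The feature dimensions and the random weight matrices are placeholders, since in the described system these matrices are learned during training.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(visual_feats, physical_feats, w_q, w_k, w_v):
    """Queries from visual features, keys and values from physical features.

    visual_feats, physical_feats: (n_positions, n_features) arrays.
    Returns (attention-enhanced features, relevance weights).
    """
    q = visual_feats @ w_q
    k = physical_feats @ w_k
    v = physical_feats @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])     # scale by sqrt(d_k)
    weights = softmax(scores, axis=-1)          # each row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
f_v = rng.normal(size=(5, 8))                   # 5 positions, 8 visual features
f_p = rng.normal(size=(5, 8))                   # 5 positions, 8 physical features
w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))
enhanced, relevance = cross_attention(f_v, f_p, w_q, w_k, w_v)
```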

[0072] It should be noted that the pre-defined convolutional neural architecture can be an encoder-decoder structure, such as U-Net or its variants, which excels at image segmentation and feature localization tasks. A shared convolutional kernel is a convolutional filter that operates on multi-channel input; its weights are shared across all spatial positions while each input channel has its own parameters, enabling the learning of cross-channel feature combinations. Cross-attention is a neural network module that allows the model to dynamically reference information from another modality while processing information from one modality to compute the importance of features.

[0073] Regions of interest (ROIs) refer to spatial regions with high relevance weights calculated by the model through an attention mechanism within the composite physical visual feature tensor; these regions are considered potential damage points. High-dimensional abstract features refer to deeper semantic features extracted by neural networks after multiple nonlinear transformations, which cannot be directly correlated with the original input. Damage-sensitive feature maps are single-channel two-dimensional matrices with spatial dimensions consistent with the input image. The value of each pixel is typically between 0 and 1, representing the confidence or probability that the point belongs to a damage region.

[0074] For example, the deep learning inference module receives the 1920×1080×6 composite physical visual feature tensor generated in the previous step. Assume that a tiny crack exists in a small region of the image, appearing as a dark line in the RGB channels, while the corresponding static strain field channel shows a sharp jump from 0.68 to 0.75. The trained 3×3×6 shared convolutional kernels are optimized to respond strongly to this combined "dark line + strain jump" pattern, so after the convolution operation the output feature map has a high activation value at the corresponding location. Subsequently, in the cross-attention module, the dot product is taken between the query vector $q$ representing the visual feature of the dark line in this region and the key vector $k$ representing the physical feature of the strain jump. Because the two are highly correlated, the dot product is large, and after the softmax function this position receives a correlation weight close to 1.0. This weight strengthens the corresponding value vector $v$, which significantly amplifies the features of the tiny crack region in the generated attention-enhanced feature map. Finally, after processing by all layers of the network, a 1920×1080×1 damage-sensitive feature map is output. In this map, most areas have pixel values close to 0.0, but at the location of the tiny crack, a clear and bright area is formed with pixel values above 0.9, accurately indicating the potential location and morphology of the damage.

[0075] In one embodiment of the present invention, the damage quantification identification module is used to perform the following steps: The damage-sensitive feature map is binarized and connected components are extracted. A skeletonization algorithm is used to extract the skeleton lines of the connected components, and the crack length is determined by accumulating the Euclidean physical distance between the center points of adjacent pixels on the skeleton lines. The total number of pixels in the connected components is counted and multiplied by the physical area represented by a single pixel to determine the damaged area. The average physical field value within the connected components is calculated by backtracking the continuous physical feature field grid to determine the stress response intensity. Based on the positions of crack length, damaged area, and stress response intensity in the structural health threshold matrix, the physical type and severity level of the damage are determined, and the structural damage identification result is generated by combining the three-dimensional spatial coordinates.

[0076] Specifically, the damage assessment module in the central data processing server quantifies and classifies the potential damage regions revealed by the damage-sensitive feature map. First, the damage assessment module receives the generated damage-sensitive feature map. This module performs semantic segmentation processing on the damage-sensitive feature map. Specifically, it applies global thresholding: pixels whose values are higher than a preset damage confidence threshold are marked as potential damage regions and set to 1, and the remaining pixels are set to 0, thereby converting the damage-sensitive feature map into a binarized image. Next, the module identifies and extracts all independent connected regions in the binarized image using a connected component analysis algorithm. Each connected region represents a potential damaged object. For each identified damaged object, the module further extracts its geometric parameters. To obtain the crack length, if the geometry of the connected region is elongated, the module simplifies it to a skeleton line with a single pixel width using a skeletonization algorithm. The module traverses the pixel sequence on the skeleton line, calculates the Euclidean distance in physical space for each pair of adjacent pixels, and accumulates the results to obtain the accurate crack length. For all types of damaged objects, the damaged area is calculated by counting the number of pixels within the connected region and multiplying it by the physical area represented by a single pixel.
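The thresholding, connected component extraction, and geometric measurements can be sketched as follows. The sketch assumes square pixels of uniform physical size and takes the ordered skeleton pixels as given. The skeletonization algorithm itself is not implemented here, and the threshold and pixel size in the usage example are hypothetical.

```python
from collections import deque
import numpy as np

def label_components(binary):
    """Label 8-connected regions of a boolean image by breadth-first search."""
    h, w = binary.shape
    labels = np.zeros((h, w), dtype=int)
    count = 0
    for r in range(h):
        for c in range(w):
            if binary[r, c] and labels[r, c] == 0:
                count += 1
                labels[r, c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and binary[ny, nx] and labels[ny, nx] == 0):
                                labels[ny, nx] = count
                                queue.append((ny, nx))
    return labels, count

def damaged_area(labels, label_id, pixel_size_m):
    """Pixel count of one component times the physical area of a single pixel."""
    return int(np.sum(labels == label_id)) * pixel_size_m ** 2

def skeleton_length(ordered_pixels, pixel_size_m):
    """Sum of Euclidean distances between adjacent skeleton pixel centers."""
    pts = np.asarray(ordered_pixels, dtype=float)
    if len(pts) < 2:
        return 0.0
    return float(np.sum(np.linalg.norm(np.diff(pts, axis=0), axis=1)) * pixel_size_m)

prob = np.zeros((5, 5))
prob[0, 0] = prob[1, 1] = prob[2, 2] = 0.95     # a short diagonal crack
prob[4, 0] = 0.92                               # an isolated damaged pixel
labels, count = label_components(prob > 0.5)    # 0.5 is a hypothetical threshold
area_m2 = damaged_area(labels, 1, pixel_size_m=0.002)
length_m = skeleton_length([(0, 0), (1, 1), (2, 2)], pixel_size_m=0.002)
```

Diagonal neighbors contribute a distance of √2 pixels rather than 1, which is why the length is accumulated from Euclidean distances instead of a simple pixel count.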

[0077] Simultaneously, to extract the stress response intensity, the module backtracks to the continuous physical feature field grid. Based on the pixel coordinate range of the current damaged object in the image, it calculates the average value of the static strain field map and energy dissipation field map within that range, and uses this average value as the stress response intensity of the damaged object. Subsequently, the damage assessment module compares the extracted geometric parameters—crack length, damaged area, and stress response intensity—with a preset structural health threshold matrix. This matrix stores parameter ranges or discrete thresholds for different damage types and severity levels. The module first determines the physical type of damage through a series of conditional statements. For example, if the crack length is much larger than the damaged area, it is determined to be a crack; if the damaged area is large and irregularly shaped, combined with a high average value of the energy dissipation field map, it is determined to be material spalling; if a high gradient change appears locally in the static strain field map, it is determined to be local deformation.
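The backtracking step that yields the stress response intensity can be illustrated with a short sketch. The field maps are represented here as plain nested lists, the averaging is taken over the pixels of the damaged object, and the names `mean_over_pixels` and `stress_response_intensity` as well as the sample values are assumptions made for the example.

```python
def mean_over_pixels(field, pixels):
    """Average a 2D field map over the given (row, col) pixels."""
    return sum(field[r][c] for r, c in pixels) / len(pixels)

def stress_response_intensity(static_strain_map, energy_map, pixels):
    """Mean static strain and mean energy dissipation inside one damaged object."""
    return {
        "static_strain": mean_over_pixels(static_strain_map, pixels),
        "energy_dissipation": mean_over_pixels(energy_map, pixels),
    }

# Hypothetical 2x2 field maps; the damaged object covers the top row only.
static_strain_map = [[250.0, 270.0], [100.0, 100.0]]
energy_map = [[17.0, 19.0], [5.0, 5.0]]
intensity = stress_response_intensity(static_strain_map, energy_map, [(0, 0), (0, 1)])
```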

[0078] Once the physical type of damage is determined, the module further classifies the severity of the damage, such as minor, moderate, or severe, based on whether the values of crack length, damaged area, and stress response intensity fall within the ranges set in the structural health threshold matrix. Finally, the damage assessment module combines the established three-dimensional spatial coordinates to mark the physical location of the damage. For each identified damaged object, the module first determines its pixel centroid or the center point coordinates of its circumscribed rectangle in the real-time surface image. Then, using the established projective transformation relationship between image pixel coordinates and three-dimensional spatial coordinates, the two-dimensional pixel coordinates are inversely mapped to the actual three-dimensional physical coordinates of the building structure. All this information, including the three-dimensional spatial coordinates of the damage on the building structure, the quantified crack length and damaged area, the determined physical type of damage, and the classified severity level, is integrated into a structured data record, generating a structural damage identification result, which is then written into the system database.
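As a simplified illustration of the inverse mapping from pixel coordinates to physical coordinates, the sketch below computes a pixel centroid and back-projects it along its viewing ray onto a plane. It assumes an ideal pinhole camera with no distortion, works in camera coordinates only, and treats the monitored surface as a plane at a known depth; the intrinsic values and function names are hypothetical. Mapping onto the building information model geometry would additionally require the camera extrinsic parameters and a ray-surface intersection.

```python
def pixel_centroid(pixels):
    """Centroid of a set of (u, v) pixel coordinates."""
    n = len(pixels)
    return (sum(u for u, _ in pixels) / n, sum(v for _, v in pixels) / n)

def back_project_to_plane(u, v, fx, fy, cx, cy, plane_depth_m):
    """Intersect the viewing ray of pixel (u, v) with the plane z = plane_depth_m.
    The result is expressed in camera coordinates (pinhole model assumed)."""
    x = (u - cx) / fx * plane_depth_m
    y = (v - cy) / fy * plane_depth_m
    return (x, y, plane_depth_m)

# Hypothetical intrinsics for a 1920x1080 image and a surface 10 m from the camera.
centroid = pixel_centroid([(950, 530), (970, 550)])
point_cam = back_project_to_plane(*centroid, fx=1000.0, fy=1000.0,
                                  cx=960.0, cy=540.0, plane_depth_m=10.0)
```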

[0079] The damage-sensitive feature map is a single-channel floating-point image with pixel values ranging from 0 to 1, representing the probability or confidence of damage occurrence. Semantic segmentation is an image processing technique that assigns each pixel in an image to a predefined category; it is used here to distinguish damaged from non-damaged regions.

[0080] The damage confidence threshold is a floating-point number between 0.5 and 0.9. Pixels with a confidence level higher than this value are considered damaged. The specific value is set based on historical verification data and expert experience, for example, 0.8. Crack length is in millimeters. This parameter may not be applicable to non-linear damage.

[0081] The damaged area is measured in square millimeters. Stress response intensity is characterized by the average or maximum value extracted from the original physical field map; its unit depends on the chosen physical quantity, such as microstrain (με) or hertz (Hz). The structural health threshold matrix is a multidimensional lookup table or rule set that maps damage types (such as cracks, voids, spalling, and deformation) and severity levels (such as minor, moderate, severe, and critical) to the corresponding numerical ranges or discrete thresholds for crack length, damaged area, and stress response intensity. This matrix is constructed based on industry standards, domain expert experience, and statistical analysis of historical monitoring data for specific building structures.

[0082] The physical type of damage refers to the category of the identified damage in terms of its physical morphology or mechanical performance. The severity level is a graded assessment of the damage's risk. The structural damage identification results are structured data packages containing a comprehensive description of each individual damage event, facilitating system recording, visualization, and further analysis.

[0083] For example, continuing from the previous step's output, the damage assessment module receives a 1920×1080 damage-sensitive feature map in which all pixel values within the pixel coordinate range [(950, 530), (970, 550)] are higher than 0.9. The server sets the damage confidence threshold to 0.85 and binarizes the damage-sensitive feature map, and the pixels above the threshold form a connected component. Next, the module determines that this connected component is elongated and applies a skeletonization algorithm to obtain a skeleton line composed of 20 pixels. The module extracts the three-dimensional coordinates of adjacent pixels on this skeleton line in physical space, accumulates the Euclidean distance between adjacent points, and calculates the crack length as 10.4 mm; this accounts for effects such as the longer spacing between diagonally adjacent pixels, rather than simply counting pixels. The total number of pixels in this connected component is 40, and each pixel represents 0.5 mm × 0.5 mm, so the damaged area is 40 × (0.5 mm × 0.5 mm) = 10.0 mm². The module then queries the continuous physical feature field grid. Within this pixel region, the average value of the static strain field map is 260.0 με, and the average value of the energy dissipation field map is 18.0 με. Therefore, the stress response intensity of this damage can be expressed as {static strain: 260.0 με, energy dissipation: 18.0 με}.

[0084] The module then compares these parameters with the structural health threshold matrix. The matrix defines the crack type as follows: crack length greater than 5 mm and damaged area less than 20 mm². Because the measured length of 10.4 mm and area of 10.0 mm² satisfy this condition, the physical type of the damage is classified as a crack. Further, the threshold matrix defines the minor crack level as: crack length between 5 mm and 15 mm and static strain response less than 300 με. The length of 10.4 mm and the static strain response of 260.0 με both fall within this range; therefore, the crack is classified as minor in severity. The module then inversely maps the pixel centroid (960, 540) to the three-dimensional spatial coordinates (55.0, 16.0, 30.5) meters on the building structure using the projective transformation. Finally, the structural damage identification result is generated as follows: {Damage location: (55.0, 16.0, 30.5) m, Crack length: 10.4 mm, Damaged area: 10.0 mm², Stress response intensity: {Static strain: 260.0 με, Energy dissipation: 18.0 με}, Damage type: Crack, Severity level: Minor}. This result is recorded and stored.
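The threshold-matrix lookup in this example can be sketched as a small rule table. Only the crack type rule and the minor level quoted above are encoded; the dictionary layout, the `classify` function, and the fallback labels "unknown" and "unclassified" are assumptions made for illustration and are not part of the described matrix.

```python
# Hypothetical encoding of the two rules quoted in the example above.
THRESHOLD_MATRIX = {
    "crack": {
        "type_rule": {"min_length_mm": 5.0, "max_area_mm2": 20.0},
        "levels": [
            {"name": "minor", "length_mm": (5.0, 15.0), "max_static_strain": 300.0},
        ],
    },
}

def classify(length_mm, area_mm2, static_strain):
    """Return (damage type, severity level) from the rule table."""
    for damage_type, entry in THRESHOLD_MATRIX.items():
        rule = entry["type_rule"]
        if length_mm > rule["min_length_mm"] and area_mm2 < rule["max_area_mm2"]:
            for level in entry["levels"]:
                low, high = level["length_mm"]
                if low <= length_mm <= high and static_strain < level["max_static_strain"]:
                    return damage_type, level["name"]
            return damage_type, "unclassified"  # type matched, no level rule matched
    return "unknown", "unclassified"            # no type rule matched

damage_type, severity = classify(length_mm=10.4, area_mm2=10.0, static_strain=260.0)

# Structured record mirroring the fields listed in the example.
record = {
    "damage_location_m": (55.0, 16.0, 30.5),
    "crack_length_mm": 10.4,
    "damaged_area_mm2": 10.0,
    "stress_response_intensity": {"static_strain": 260.0, "energy_dissipation": 18.0},
    "damage_type": damage_type,
    "severity_level": severity,
}
```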

[0085] In one embodiment of the present invention, a time-series evolution prediction module is further included, which is used to perform the following steps: Obtain structural damage identification results from multiple consecutive sampling periods and construct a time-series dataset of damage geometric parameters; input the time-series dataset into a long short-term memory network to predict damage geometric parameters at future time steps; and generate a predictive assessment report containing damage evolution trends based on the prediction results.

[0086] Generating predictive assessment reports also includes: Calculate the slope of the damage geometry parameters over time; query the structural limit state database to obtain the critical threshold based on the current damage physical type; use the slope and the critical threshold to calculate the critical time point when the damage expands to the critical state through linear extrapolation, and write it into the predictive assessment report. Alternatively, based on the three-dimensional spatial coordinates and damage geometric parameters in the structural damage identification results, the building information model can be locally meshed to generate a finite element calculation model; a preset specific load condition can be applied to solve the finite element calculation model; the stress redistribution state of the damaged area and its surroundings can be simulated, and the simulation result map can be written into the predictive assessment report as the basis for risk assessment.

[0087] Specifically, the predictive analytics module in the central data processing server predicts future damage trends based on historical monitoring data. First, the predictive analytics module periodically queries the system database, extracting a series of structural damage identification results generated over multiple consecutive sampling periods for the same identified damaged object. From these historical results, the module extracts key geometric parameters, particularly crack length and damaged area, forming one or more time-series datasets. Next, this time-series dataset is input into a pre-trained Long Short-Term Memory (LSTM) network model. This network, through its internal memory units and gating mechanisms, learns and captures the nonlinear dynamic patterns of geometric parameters changing over time.
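To make the "memory units and gating mechanisms" concrete, the following sketch implements one step of a standard scalar LSTM cell and runs it over a sequence. It is illustrative only: the parameter names and values are untrained placeholders, and a real model would use learned vector-valued weights, normalized inputs, and an output layer that maps the hidden state to a predicted crack length.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_cell_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell: input, forget, and output gates plus a candidate value."""
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate cell value
    c = f * c_prev + i * g        # memory cell: keep part of the old state, add new information
    h = o * math.tanh(c)          # hidden state exposed to the next step
    return h, c

def run_sequence(series, p):
    """Feed a time series through the cell, starting from a zero state."""
    h, c = 0.0, 0.0
    for x in series:
        h, c = lstm_cell_step(x, h, c, p)
    return h, c

# Placeholder parameters (all zero) purely to exercise the gating arithmetic.
zero_params = {k: 0.0 for k in
               ("wi", "ui", "bi", "wf", "uf", "bf", "wo", "uo", "bo", "wg", "ug", "bg")}
h1, c1 = lstm_cell_step(1.0, 0.0, 2.0, zero_params)
```

With all parameters at zero every gate evaluates to 0.5 and the candidate to 0, so the cell state is simply halved at each step; trained parameters would instead let the gates decide how much history to retain.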

[0088] After processing the input historical sequence, the network predicts geometric parameter values for one or more future time steps. Based on the network's prediction output, the module calculates the ratio of the difference in geometric parameters between the current and future times to the time interval, obtaining the slope of the geometric parameters over time. This slope quantifies the rate of damage propagation. Subsequently, the module accesses a preset structural limit state database and, based on the physical type of the current damage and the material properties of the component, queries and obtains the corresponding critical damage threshold, such as the maximum allowable crack length. Using the currently measured geometric parameter values, the previously calculated slope, and the critical threshold, the module calculates the time required for damage to propagate to the critical state using linear extrapolation; this time is the critical time point for damage propagation. Simultaneously, to assess the impact of damage development on the local load-bearing capacity of the structure, the module initiates a structural analysis subroutine. This subroutine, based on the precise three-dimensional spatial coordinates of the current damage and the quantified geometric parameters, performs local meshing and correction on the building information model, generating a finite element model reflecting the current damage state.

[0089] Then, the module retrieves a pre-defined specific load case from the load case library, applies it to the calculation model, and performs a fast finite element method (FEM) solution to simulate the stress redistribution state of the damaged area and its surroundings under that load case. Finally, the predictive analysis module summarizes all analysis results, including the latest recorded structural damage identification results, the calculated slope of change, the predicted critical time point, and the analysis diagram of the simulated stress redistribution state. The module integrates and formats this information to generate a structured predictive assessment report. After the report is generated, the pre-defined rule engine evaluates key indicators in the report, such as whether the critical time point is less than the warning time window. If the warning conditions are met, a warning command will be automatically generated and issued to the designated maintenance management platform.

[0090] In this embodiment, the critical time point for damage propagation $T_c$ is calculated by predictive linear extrapolation. Taking crack length as an example, the calculation formula is as follows:

$$T_c = \frac{L_c - L_t}{k}$$

[0091] where $L_c$ is the critical crack length obtained from the structural limit state database, $L_t$ is the latest measured or predicted crack length, and $k$ is the slope of change calculated from the prediction results of the long short-term memory network, that is, the average rate of expansion of the crack length.

[0092] It should be noted that Long Short-Term Memory (LSTM) networks are a special type of recurrent neural network suitable for processing and predicting long-term dependencies in time-series data. The slope of change is a quantitative indicator characterizing the rate of damage development, measured in mm / day or mm² / day. The critical time point is a predicted value, indicating the remaining safe service time of the structure at the current rate of damage development. Specific load cases are typical or extreme load conditions defined according to structural design specifications and actual operating environments, such as a once-in-a-century wind load or a full-load traffic load. The stress redistribution state is a numerical simulation result, showing, in the form of a contour map or data matrix, how the internal stress of the structure deviates from the normal distribution due to damage, specifically indicating potential new stress concentration points at and around the damage tip. The predictive assessment report is a comprehensive data document, typically generated in XML or JSON format, containing a comprehensive assessment of the current damage status, development trends, and potential risks. An early warning instruction is a message sent to maintenance personnel or automated management systems, including detailed damage information and recommended actions, such as suggesting a detailed inspection within XX days.

[0093] For example, following the previous step, the system has continuously monitored the crack at (55.0, 16.0, 30.5) meters for one week. The database records the daily measurements of this crack length, forming a time series: [10.0, 10.1, 10.3, 10.4, 10.5, 10.6, 10.7] mm. The predictive analysis module inputs this series into a long short-term memory network, which predicts that the crack length will reach 10.8 mm the following day. Based on this prediction, the module calculates the slope of change as (10.8 mm − 10.7 mm) / 1 day = 0.1 mm / day. The module then queries the structural limit state database and obtains a critical crack length of 30.0 mm for the component; the current crack length is 10.7 mm. Substituting these values into the formula gives the critical time point: (30.0 mm − 10.7 mm) / (0.1 mm / day) = 193 days. Simultaneously, the stress simulation is initiated. Under a specific load condition applying the maximum design wind load, the simulation results show that the stress concentration factor at the crack tip reaches 2.5, causing the local stress to approach the material's fatigue limit. Finally, a predictive assessment report is generated, with the summary: "The crack at location (55.0, 16.0, 30.5) is currently 10.7 mm long, classified as 'minor,' with a predicted propagation rate of 0.1 mm / day, and is expected to reach critical length in 193 days. Under extreme wind loads, there is a high risk of stress concentration at this location." Because the critical time point of 193 days exceeds the preset 90-day warning window, the damage is determined to be a long-term concern item. Instead of triggering an emergency warning command, a routine maintenance suggestion record is generated.
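The arithmetic of this example, together with the warning-window rule, can be reproduced with a short sketch. The function names and the returned action labels are hypothetical, and the handling of a non-positive slope (returning infinity) is an assumption added so that the division is always defined.

```python
def slope_per_day(current_mm, predicted_mm, interval_days):
    """Rate of change between the current value and the predicted future value."""
    return (predicted_mm - current_mm) / interval_days

def critical_time_days(critical_mm, current_mm, slope):
    """Linear extrapolation of the time needed to reach the critical value."""
    if slope <= 0:
        return float("inf")  # assumption: no predicted growth means the limit is never reached
    return (critical_mm - current_mm) / slope

def warning_action(time_days, window_days):
    """Issue an emergency warning only if the critical time falls inside the warning window."""
    if time_days < window_days:
        return "emergency_warning"
    return "routine_maintenance_suggestion"

k = slope_per_day(current_mm=10.7, predicted_mm=10.8, interval_days=1.0)  # about 0.1 mm/day
t_c = critical_time_days(critical_mm=30.0, current_mm=10.7, slope=k)      # about 193 days
action = warning_action(t_c, window_days=90.0)
```

Because the slope is derived from a single predicted step, small changes in the prediction shift the extrapolated time considerably, which is one reason to treat the result as an indicative estimate.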

[0094] Each of the modules can be implemented in whole or in part through software, hardware, or a combination thereof. The modules may be embedded in hardware form in, or be independent of, the processor of the computer device, or may be stored in software form in the memory of the computer device, so that the processor can call and execute the operations corresponding to each of the above modules.

[0095] The above embodiments are only used to illustrate the technical solutions of the present invention, and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention, and should all be included within the protection scope of the present invention.

Claims

1. A building structure damage identification and assessment system integrating multi-point strain sensing and AI, characterized in that it comprises: a data synchronization and alignment module, which acquires the original strain sequence signals with a predetermined sampling frequency collected by strain sensors distributed at the preset sensor deployment nodes of the building structure, the real-time temperature data collected by temperature sensors, and the real-time surface images acquired by the vision acquisition terminal, establishes a mapping relationship between the image pixel coordinates and the three-dimensional spatial coordinates of the strain sensors, and performs data alignment based on the timestamp to generate an initial spatiotemporal dataset; a dynamic strain feature extraction module, which performs sliding window framing and time-frequency domain analysis on the original strain sequence signal with a predetermined sampling frequency in the initial spatiotemporal dataset to extract a multi-dimensional dynamic strain feature vector that reflects the local stiffness and energy characteristics of the structure; a physical feature field construction module, which uses three-dimensional spatial coordinates as constraints to perform spatial interpolation on the multi-dimensional dynamic strain feature vectors and construct a continuous physical feature field grid with the same resolution as the real-time surface image; a cross-modal data injection module, which extracts the color channel data of the real-time surface image, stacks it in the depth dimension with the physical feature channels contained in the continuous physical feature field grid, and normalizes the result to generate a composite physical visual feature tensor; a coupled feature mining module, which inputs the composite physical visual feature tensor into a deep feature extraction network to mine the correlation features between physical properties and visual texture and generate a damage-sensitive feature map that reflects the location and shape of the damage; and a damage quantitative identification module, which performs semantic segmentation on the damage-sensitive feature map to extract damage geometric parameters, compares the damage geometric parameters with a preset structural health threshold matrix, and outputs the structural damage identification results.

2. The building structure damage identification and assessment system integrating multi-point strain sensing and AI according to claim 1, characterized in that the data synchronization and alignment module is used to perform the following steps: performing intrinsic parameter calibration and distortion correction on the visual acquisition terminal; using the scale-invariant feature transform algorithm to extract feature matching point pairs between the corrected real-time surface image and the rendered view of the building information model; based on the feature matching point pairs, using the PnP algorithm to solve the camera extrinsic parameters and construct the projective transformation matrix; and using the projective transformation matrix to project the three-dimensional spatial coordinates in the building information model onto the pixel plane of the real-time surface image, thus achieving registration between physical space and pixel space.

3. The building structure damage identification and assessment system integrating multi-point strain sensing and AI according to claim 1, characterized in that the dynamic strain feature extraction module is used to perform the following operations: acquiring real-time temperature data from the sensor deployment points and performing temperature compensation correction on the signals within each sampling frame; calculating the arithmetic mean of the corrected signal within each sampling frame to obtain the static strain component; extracting the dominant frequency of each sampling frame using the Fast Fourier Transform; calculating the energy components of each frequency band by wavelet packet decomposition and summing them to obtain the root mean square value of dynamic strain; calculating the rate of change of the dominant frequency with time to obtain the frequency offset characteristics; and vectorizing and concatenating the static strain component, the dominant frequency, the root mean square value of dynamic strain, and the frequency offset characteristics to generate a multi-dimensional dynamic strain feature vector.

4. The building structure damage identification and assessment system integrating multi-point strain sensing and AI according to claim 1, characterized in that the physical feature field construction module is used to perform the following operations: separating the multi-dimensional dynamic strain feature vector into independent feature component datasets; using the inverse of the projective transformation matrix to construct a ray pointing from each pixel into physical space, and solving the intersection of the ray with the geometric surface of the building information model by a ray tracing algorithm to determine the unique three-dimensional spatial coordinates of each pixel; retrieving structural topology data from the building information model to determine whether the pixel coordinates and the sensor deployment points belong to the same structural component; under the structural topology constraints, using a radial basis function interpolation algorithm to calculate the interpolated feature value at the unique three-dimensional spatial coordinates of each pixel, with the three-dimensional spatial coordinates of the sensors in the same component as the control points; and generating the static strain field map, the dominant frequency distribution field map, and the energy dissipation field map separately, and stacking these three in the depth direction to generate a continuous physical feature field grid.

5. The building structure damage identification and assessment system integrating multi-point strain sensing and AI according to claim 1, characterized in that the cross-modal data injection module is used to perform the following steps: decomposing the real-time surface image to obtain data for the three color channels, namely red, green, and blue; using the static strain field map, the dominant frequency distribution field map, and the energy dissipation field map as three physical feature channels; stacking the red, green, and blue color channel data with the three physical feature channels to form a six-channel dataset; and applying min-max normalization to the six-channel data at the pixel level for each channel to generate a composite physical visual feature tensor.

6. The building structure damage identification and assessment system integrating multi-point strain sensing and AI according to claim 1, characterized in that the coupled feature mining module is used to perform the following steps: encoding the composite physical visual feature tensor with multi-channel convolutional kernels and extracting intermediate feature maps containing texture and physical gradients; introducing a cross-attention mechanism that maps features from the color channel data to a query matrix and features from the physical feature channels to a key matrix and a value matrix; calculating the dot product of the query matrix and the key matrix to obtain relevance weights; using the relevance weights to perform a weighted summation of the value matrix to identify regions of interest that are physically abnormal and visually sensitive; and upsampling and projecting the regions of interest to generate a damage-sensitive feature map.

7. The building structure damage identification and assessment system integrating multi-point strain sensing and AI according to claim 1, characterized in that the damage quantitative identification module is used to perform the following operations: binarizing the damage-sensitive feature map and extracting connected components; using a skeletonization algorithm to extract the skeleton lines of the connected components, and accumulating the Euclidean physical distance between the center points of adjacent pixels on the skeleton lines to determine the crack length; counting the total number of pixels in the connected components and multiplying it by the physical area represented by a single pixel to determine the damaged area; backtracking to the continuous physical feature field grid and calculating the mean value of the physical field within the connected components to determine the stress response intensity; and determining the physical type and severity level of the damage based on the positions of the crack length, damaged area, and stress response intensity in the structural health threshold matrix, and generating the structural damage identification results by combining the three-dimensional spatial coordinates.

8. The building structure damage identification and assessment system integrating multi-point strain sensing and AI according to claim 1, characterized in that it further comprises a time-series evolution prediction module, which performs the following steps: obtaining structural damage identification results from multiple consecutive sampling periods and constructing a time-series dataset of damage geometric parameters; inputting the time-series dataset into a long short-term memory network to predict the damage geometric parameters at future time steps; and generating a predictive assessment report containing damage evolution trends based on the prediction results.

9. The building structure damage identification and assessment system integrating multi-point strain sensing and AI according to claim 8, characterized in that generating the predictive assessment report further includes: calculating the slope of the change in damage geometric parameters over time; querying the structural limit state database based on the current damage physical type to obtain the critical threshold; and using the slope of change and the critical threshold to calculate, by linear extrapolation, the critical time point at which the damage expands to the critical state, and writing it into the predictive assessment report.

10. The building structure damage identification and assessment system integrating multi-point strain sensing and AI according to claim 8, characterized in that generating the predictive assessment report further includes: locally meshing the building information model, based on the three-dimensional spatial coordinates and damage geometric parameters in the structural damage identification results, to generate a finite element calculation model; applying a preset specific load condition and solving the finite element calculation model; and simulating the stress redistribution state of the damaged area and its surroundings, and writing the simulation results into the predictive assessment report as a basis for risk assessment.