Highway bridge monitoring and early warning intelligent agent system based on multi-modal large model

By fusing multimodal large-scale models with multi-source data for bridge condition monitoring and early warning, the problems of insufficient data utilization and rigid prediction models in existing technologies have been solved. This has enabled intelligent monitoring and early warning of bridge operation status, improving prediction accuracy and risk assessment accuracy.

CN122243404A · Pending · Publication Date: 2026-06-19 · ANHUI TRANSPORT CONSULTING & DESIGN INST

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
ANHUI TRANSPORT CONSULTING & DESIGN INST
Filing Date
2026-03-20
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing bridge monitoring and early warning technologies struggle to fully utilize multi-source information, especially unstructured text data, resulting in limited understanding of bridge operational status. Furthermore, the selection of prediction models is rigid, leading to delayed risk warnings.

Method used

A highway bridge monitoring and early warning intelligent agent system based on a multimodal large model is adopted, which integrates real-time monitoring time series data, bridge structural design documents and inspection reports, and performs joint inference through an improved PaLI architecture large model to achieve adaptive scheduling of prediction tasks and multi-level early warning.

Benefits of technology

It has improved the completeness and consistency of bridge operation information, enhanced prediction accuracy and risk assessment accuracy, realized the transformation from passive alarm to proactive early warning, and enhanced the scientific nature and timeliness of bridge operation and maintenance management.

✦ Generated by Eureka AI based on patent content.

Abstract

This invention discloses a highway bridge monitoring and early warning intelligent agent system based on a multimodal large model, comprising: a data acquisition and preprocessing module for acquiring and preprocessing multi-source heterogeneous data of the target highway bridge; a context construction and prompt word generation module for constructing contextual information of the target highway bridge's operating status and generating task prompt words; a multimodal reasoning module for using an improved PaLI architecture large model for reasoning and generating prediction task requirement information; an algorithm scheduling module for scheduling the target time series prediction algorithm; a time series prediction module for generating prediction results of the bridge structural status; a risk assessment module for generating bridge operation risk assessment results; and an early warning output module for outputting corresponding multi-level early warning information, early warning situation reports, and response plans, thereby realizing intelligent perception, predictive analysis, and proactive early warning of the highway bridge's operating status.

Description

Technical Field

[0001] This invention relates to the field of highway bridge operation monitoring and safety early warning, and in particular to a highway bridge monitoring and early warning intelligent system based on a multimodal large model.

Background Technology

[0002] As a crucial component of transportation infrastructure, the operational safety of highway bridges directly impacts the stability of the transportation system and public safety. With the increasing service life of bridges and the long-term effects of traffic loads and environmental factors, bridge structures inevitably experience varying degrees of performance degradation and safety hazards during operation. Therefore, how to continuously and accurately monitor the operational status of bridges and issue timely warnings before risks occur has become a critical technical issue that urgently needs to be addressed in the field of highway bridge operation and maintenance management.

[0003] Existing bridge monitoring and early warning technologies primarily rely on bridge structural health monitoring systems. These systems deploy sensors on key bridge components to collect data such as strain, displacement, acceleration, and temperature, and then use threshold judgment or statistical analysis methods to assess the bridge's structural condition. However, these technologies typically analyze data from only a single or limited number of sensors, making it difficult to fully utilize the multi-source information generated during bridge operation. In particular, they struggle to effectively integrate unstructured text data such as structural design documents, historical maintenance records, and inspection reports, resulting in significant limitations in understanding the bridge's operational status.

Summary of the Invention

[0004] One objective of this invention is to propose an intelligent agent system for highway bridge monitoring and early warning based on a multimodal large model. This invention fully integrates real-time monitoring time-series data, bridge structural design documents, historical maintenance records, and inspection reports. It introduces an improved PaLI architecture large model to perform joint reasoning on bridge operation scenarios, structural component characteristics, and degradation stages. Based on this, it achieves intelligent generation of prediction task requirements and adaptive scheduling of time-series prediction algorithms, constructing an intelligent agent system covering bridge operation monitoring, state prediction, risk assessment, and early warning issuance. It enables on-demand selection and dynamic application of prediction models, possessing advantages such as strong monitoring and perception capabilities, high prediction accuracy, high risk assessment accuracy, and timely early warning response.

[0005] The highway bridge monitoring and early warning intelligent agent system based on a multimodal large model according to an embodiment of the present invention includes:
a data acquisition and preprocessing module, used to acquire and preprocess multi-source heterogeneous data of the target highway bridge;
a context construction and prompt word generation module, used to construct context information of the target highway bridge's operating status based on the preprocessed multi-source heterogeneous data, and to generate task prompt words based on the context information;
a multimodal reasoning module, used to input the context information and task prompt words into the improved PaLI architecture large model to reason about the bridge operation scenario, structural component characteristics and degradation stage, and to generate prediction task requirement information;
an algorithm scheduling module, used to automatically schedule, from the time series prediction algorithm library, the target time series prediction algorithm that matches the current prediction task based on the prediction task requirement information;
a time series prediction module, used to predict on the real-time monitoring time series data with the target time series prediction algorithm and generate prediction results of the bridge structural state;
a risk assessment module, used to generate bridge operation risk assessment results based on the prediction results and the inference output of the improved PaLI architecture large model;
an early warning output module, used to output corresponding multi-level early warning information, early warning situation reports and response plans based on the bridge operation risk assessment results.

[0006] Optionally, the modules cooperate through the following steps:
Multi-source heterogeneous data of the target highway bridge is collected and preprocessed;
Based on the preprocessed multi-source heterogeneous data, context information of the target highway bridge's operating status is constructed, and corresponding task prompt words are generated according to the context information;
The context information and task prompt words are input into the improved PaLI architecture large model, which reasons about the bridge operation scenario, structural component characteristics and degradation stage to obtain the prediction task requirement information;
Based on the prediction task requirement information, the decision-making center of the monitoring and early warning intelligent agent automatically schedules, from the time series prediction algorithm library, the target time series prediction algorithm that matches the current prediction task;
The target time series prediction algorithm is applied to the real-time monitoring time series data to generate the prediction results of the bridge structural state;
Based on the prediction results, combined with the inference output of the improved PaLI architecture large model, the bridge operation risk assessment results are generated;
Based on the bridge operation risk assessment results, corresponding multi-level early warning information, early warning situation reports and response plans are output.

[0007] Optionally, the multi-source heterogeneous data includes real-time monitoring time-series data, bridge structure design documents, historical maintenance records, and daily inspection report texts. The preprocessing includes time alignment, missing data handling, outlier handling, filtering, text cleaning, OCR text recognition, and standardization processing.

[0008] Optionally, the construction of the context information and the generation of task prompts specifically include: Based on the preprocessed multi-source heterogeneous data, the data is divided into sensor monitoring time series data set, text data set and structural description data set according to the data source. The sensor monitoring time series data set includes time series data formed by sensor channels. The text data set includes inspection reports, historical maintenance records and structural design documents. The structural description data set includes bridge component list and component hierarchy relationship data. The sensor monitoring time series data set, text data set and structural description data set are uniformly encapsulated to construct a data entry set, and each data entry in the data entry set is bound with a bridge identifier, component identifier, acquisition time identifier and data source type identifier; The data entries corresponding to the sensor monitoring time series data set are divided into time windows to generate a time series segment set; The data entries corresponding to the text data set are structured to generate a text structured field set, and the data entries corresponding to the structure description data set are standardized to generate a structure semantic field set. Both the text structured field set and the structure semantic field set are associated with the corresponding bridge identifier and component identifier. Using bridge identifiers and component identifiers as association keys, the time sequence fragment set, text structured field set, and structural semantic field set are matched and associated. For each component, a component context entry is formed. The component context entry includes the time sequence fragment corresponding to the component, the inspection and maintenance event field associated with the component, and the structural semantic field of the component. 
The context entries are organized in chronological order to form context information. Based on the context information, task prompt words are generated. The task prompt words include the prediction object, prediction index, prediction time span, prediction time resolution, and prediction output format.
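
As an illustration of the association step described in [0008], the following minimal Python sketch joins time series segments, structured text fields and structural semantic fields on the (bridge identifier, component identifier) key. The record layouts, field names, sample values and the `build_component_context` helper are hypothetical; the specification does not define a data schema.

```python
from collections import defaultdict

def build_component_context(segments, text_fields, structure_fields):
    """Group the three field sets by (bridge_id, component_id).

    All record layouts here are illustrative assumptions.
    """
    context = defaultdict(lambda: {"segments": [], "events": [], "structure": None})
    for seg in segments:
        context[(seg["bridge_id"], seg["component_id"])]["segments"].append(seg)
    for ev in text_fields:
        context[(ev["bridge_id"], ev["component_id"])]["events"].append(ev)
    for st in structure_fields:
        context[(st["bridge_id"], st["component_id"])]["structure"] = st
    # Organize each component's entries in chronological order.
    for entry in context.values():
        entry["segments"].sort(key=lambda s: s["start"])
        entry["events"].sort(key=lambda e: e["event_time"])
    return dict(context)

# Made-up sample records for one component of one bridge.
segments = [
    {"bridge_id": "B1", "component_id": "girder-2", "start": 20, "values": [1.2, 1.3]},
    {"bridge_id": "B1", "component_id": "girder-2", "start": 10, "values": [1.0, 1.1]},
]
text_fields = [
    {"bridge_id": "B1", "component_id": "girder-2",
     "event_type": "crack_inspection", "event_time": 15},
]
structure_fields = [
    {"bridge_id": "B1", "component_id": "girder-2",
     "component_type": "main_girder", "parent": "span-1"},
]
context = build_component_context(segments, text_fields, structure_fields)
```

Keying every record on the same identifier pair is what lets sensor data and text records about the same component end up in one component context entry.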

[0009] Optionally, obtaining the prediction task requirement information specifically includes: The context information and task prompts are input into the improved PaLI architecture model, which includes a structural semantic injection module, a temporal-text joint encoding module, and a prediction task reasoning and generation module. The structural semantic injection module maps the set of structural semantic fields in the context information to generate a structural semantic representation, thus obtaining structurally enhanced context information. The temporal-text joint encoding module jointly encodes the structurally enhanced context information and task prompts, aligns the set of temporal segments with the set of text structured fields across modalities, and generates a unified multimodal representation. The prediction task reasoning and generation module introduces a task reasoning mechanism with degradation stage constraints to perform task reasoning on the unified multimodal representation and obtain prediction task requirement information. In the structural semantic injection module, a structural semantic representation is generated based on the set of structural semantic fields in the context information to obtain structural enhanced context information. The generation process involves mapping the bridge component type information, component hierarchical relationship information, and component association relationship information in the set of structural semantic fields to a structural semantic representation, and then fusing the structural semantic representation with the context information to generate structural enhanced context information. 
In the temporal-text joint encoding module, structurally enhanced contextual information and task prompt words are jointly encoded, and the temporal segment set and the text structured field set are cross-modal aligned to generate a unified multimodal representation; In the prediction task reasoning and generation module, key information related to bridge components and monitoring indicators is extracted based on the unified multimodal representation to obtain candidate prediction objects and candidate prediction indicators. A task reasoning mechanism with degradation stage constraints is introduced to perform task reasoning on the unified multimodal representation to obtain degradation stage identifiers corresponding to bridge components. The candidate prediction objects and candidate prediction indicators are screened, and prediction objects and prediction indicators consistent with the degradation stage identifiers are selected. Based on the selected prediction objects and indicators, the corresponding prediction time span and prediction time resolution are obtained. The prediction objects, prediction indicators, prediction time span, prediction time resolution, and degradation stage identifiers are organized into prediction task requirement information.
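
The screening step at the end of [0009] can be sketched as follows. In the described system the degradation stage identifier comes from the improved PaLI architecture large model; here it is replaced by a plain lookup so the example stays self-contained. The stage names, the rule table and the idea of reading the time span and resolution from that table are assumptions made only for illustration.

```python
def build_task_requirements(candidates, stage_by_component, stage_rules):
    """Keep candidate (object, indicator) pairs consistent with the degradation stage."""
    requirements = []
    for component, indicator in candidates:
        stage = stage_by_component[component]   # stands in for the model's inference
        rule = stage_rules[stage]
        if indicator in rule["indicators"]:
            requirements.append({
                "prediction_object": component,
                "prediction_indicator": indicator,
                "time_span_hours": rule["time_span_hours"],
                "time_resolution_hours": rule["time_resolution_hours"],
                "degradation_stage": stage,
            })
    return requirements

# Hypothetical rule table: which indicators matter at which stage.
stage_rules = {
    "early": {"indicators": {"strain"},
              "time_span_hours": 168, "time_resolution_hours": 24},
    "accelerated": {"indicators": {"strain", "displacement"},
                    "time_span_hours": 24, "time_resolution_hours": 1},
}
candidates = [("girder-2", "strain"), ("girder-2", "displacement"),
              ("pier-1", "displacement")]
stage_by_component = {"girder-2": "accelerated", "pier-1": "early"}
requirements = build_task_requirements(candidates, stage_by_component, stage_rules)
```

Each resulting dictionary carries the five items that the paragraph lists as prediction task requirement information.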

[0010] Optionally, obtaining the target time series prediction algorithm specifically includes: The prediction task requirement information is input into the decision center of the monitoring and early warning intelligent agent, and the constraints for algorithm scheduling are obtained from the time series prediction algorithm library. The time series prediction algorithm library includes statistical models, machine learning models and deep learning models. The constraints include model category, input data type, prediction time span, prediction time resolution and degradation stage identifier. Candidate time series prediction algorithms are obtained from the time series prediction algorithm library. Based on the constraints for algorithm scheduling, the candidate time series prediction algorithms are matched and screened to obtain the target time series prediction algorithm that satisfies the prediction time span, prediction time resolution and degradation stage identifier.
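
A minimal sketch of the constraint matching in [0010] is shown below, assuming each library entry declares the constraints it can satisfy. The `AlgorithmSpec` fields, the algorithm names used as labels and the numeric limits are hypothetical; which model suits which degradation stage is not stated in the specification.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AlgorithmSpec:
    name: str
    category: str              # "statistical", "machine_learning" or "deep_learning"
    input_type: str
    max_span_hours: int        # longest prediction time span supported
    min_resolution_hours: int  # finest prediction time resolution supported
    stages: frozenset          # degradation stage identifiers supported

def schedule(library, task):
    """Return the first library entry satisfying every constraint, or None."""
    for spec in library:
        if (spec.input_type == task["input_type"]
                and task["time_span_hours"] <= spec.max_span_hours
                and task["time_resolution_hours"] >= spec.min_resolution_hours
                and task["degradation_stage"] in spec.stages):
            return spec
    return None

# Hypothetical two-entry algorithm library.
library = [
    AlgorithmSpec("arima", "statistical", "univariate", 168, 24, frozenset({"early"})),
    AlgorithmSpec("lstm", "deep_learning", "univariate", 48, 1,
                  frozenset({"accelerated", "late"})),
]
task = {"input_type": "univariate", "time_span_hours": 24,
        "time_resolution_hours": 1, "degradation_stage": "accelerated"}
selected = schedule(library, task)
```

Returning `None` when nothing matches leaves the fallback policy to the caller, a case the specification does not address.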

[0011] Optionally, obtaining the predicted results of the bridge structural state specifically includes: Based on the prediction target and prediction indicators, corresponding monitoring data are collected from real-time monitoring time series data and formed into a target monitoring sequence in chronological order; Based on the predicted temporal resolution, the target monitoring sequence is time-aligned to form an aligned monitoring sequence; Based on the input window configuration corresponding to the target time series prediction algorithm, the algorithm input sequence for prediction calculation is extracted from the aligned monitoring sequence; The prediction step number is obtained based on the prediction time span and prediction time resolution. The algorithm input sequence is then input into the target time series prediction algorithm to generate a prediction sequence according to the prediction step number and prediction time resolution. The predicted sequence is associated with the predicted object to form the predicted result of the bridge structure state. The predicted result of the bridge structure state includes the predicted object, the predicted index, the predicted time sequence and the corresponding predicted value sequence.
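
The prediction flow in [0011] can be illustrated with the sketch below. It assumes the monitoring sequence is already aligned to the prediction time resolution, and it uses a simple drift extrapolation as a placeholder for whatever target algorithm the scheduler selected. The function names and sample values are hypothetical.

```python
def drift_forecast(window, steps):
    """Placeholder algorithm: extend the average slope of the input window."""
    slope = (window[-1] - window[0]) / (len(window) - 1)
    return [window[-1] + slope * (k + 1) for k in range(steps)]

def predict_state(obj, indicator, times, values, input_window,
                  span_hours, resolution_hours, algorithm):
    # Number of prediction steps = prediction time span / prediction time resolution.
    steps = span_hours // resolution_hours
    # Algorithm input sequence: the most recent `input_window` aligned samples.
    window = values[-input_window:]
    predicted_values = algorithm(window, steps)
    predicted_times = [times[-1] + resolution_hours * (k + 1) for k in range(steps)]
    return {
        "prediction_object": obj,
        "prediction_indicator": indicator,
        "predicted_times": predicted_times,
        "predicted_values": predicted_values,
    }

# Aligned monitoring sequence sampled every 2 hours (made-up strain values).
times = [0, 2, 4, 6, 8]
values = [10.0, 10.5, 11.0, 11.5, 12.0]
result = predict_state("girder-2", "strain", times, values,
                       input_window=3, span_hours=6, resolution_hours=2,
                       algorithm=drift_forecast)
```

Passing the algorithm as a parameter mirrors the idea that the prediction module runs whichever algorithm was scheduled for the task.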

[0012] Optionally, the generation of the bridge operation risk assessment results specifically includes: The inference input is constructed by the predicted object, predicted index, predicted time sequence and corresponding predicted value sequence in the prediction results of the bridge structural state. The improved PaLI architecture large model performs joint inference on the bridge operation state and outputs the inference results of the bridge operation scenario, the inference results of the structural component features and the inference results of the degradation stage corresponding to the predicted object. Based on the inference results of structural component characteristics and degradation stage, the predicted value sequence is subjected to risk weighting processing to generate a risk score sequence that corresponds one-to-one with the predicted time sequence. The risk score sequence refers to the operational risk level of the predicted object at the corresponding predicted time. Based on the inference results of the bridge operation scenario, the risk score sequence is modified according to the scenario to generate the modified risk score sequence. The revised risk score sequence is correlated and summarized with the prediction object, prediction index, prediction time sequence and corresponding prediction value sequence to generate the bridge operation risk assessment result.
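
The specification does not give formulas for the risk weighting or the scenario modification in [0012]. The sketch below therefore assumes one simple possibility: the score is the predicted value relative to an alarm threshold, multiplied by a component weight and a degradation stage weight, then scaled by a scenario factor and clipped to 1. The weights, the threshold and both function names are hypothetical.

```python
def risk_scores(predicted_values, threshold, component_weight, stage_weight):
    """Assumed weighting rule: weighted ratio to a threshold, clipped to [0, 1]."""
    return [min(1.0, component_weight * stage_weight * abs(v) / threshold)
            for v in predicted_values]

def apply_scenario(scores, scenario_factor):
    """Assumed scenario modification: multiplicative factor, clipped to 1."""
    return [min(1.0, s * scenario_factor) for s in scores]

predicted_times = [10, 12, 14]
predicted_values = [50.0, 80.0, 120.0]       # made-up values
base = risk_scores(predicted_values, threshold=100.0,
                   component_weight=1.0, stage_weight=1.0)
modified = apply_scenario(base, scenario_factor=1.5)   # e.g. a heavy-traffic scenario

# Assessment result: modified scores kept one-to-one with the predicted times.
assessment = {
    "prediction_object": "girder-2",
    "prediction_indicator": "strain",
    "predicted_times": predicted_times,
    "predicted_values": predicted_values,
    "risk_scores": modified,
}
```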

[0013] Optionally, the output of the corresponding multi-level early warning information, early warning situation report, and response plan specifically includes: Based on the preset risk level classification rules, the risk score sequence in the risk assessment results is mapped to a level to generate a risk level sequence that corresponds one-to-one with the predicted time sequence. Based on the risk level sequence and the prediction time sequence, multi-level early warning information is generated. The multi-level early warning information includes the early warning level and time distribution of the predicted object at each prediction time in the prediction time sequence. The warning levels and risk scores corresponding to each prediction time are summarized in chronological order according to the prediction time sequence to form the situation data of risk changing over time, and a warning situation report is generated based on the situation data. Based on the warning level corresponding to each predicted time in the multi-level warning information, and according to the warning level, the corresponding response measures are obtained to form a response plan that corresponds one-to-one with the warning level.
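
To illustrate the level mapping in [0013], the sketch below maps a risk score sequence to warning levels with a preset boundary table, then derives the situation data and a response plan keyed by level. The boundaries, level names and response measures are hypothetical placeholders, not values from the specification.

```python
from bisect import bisect_right

# Hypothetical risk level classification rule and response measures.
LEVEL_BOUNDS = [0.3, 0.6, 0.8]
LEVELS = ["normal", "level_3", "level_2", "level_1"]
RESPONSES = {
    "normal": "routine monitoring",
    "level_3": "increase inspection frequency",
    "level_2": "restrict heavy vehicles and schedule a special inspection",
    "level_1": "close the bridge and start emergency assessment",
}

def to_level(score):
    """Map a score to the level whose interval contains it."""
    return LEVELS[bisect_right(LEVEL_BOUNDS, score)]

def build_warning(predicted_times, scores):
    levels = [to_level(s) for s in scores]
    # Multi-level early warning information: one level per prediction time.
    info = [{"time": t, "score": s, "level": lv}
            for t, s, lv in zip(predicted_times, scores, levels)]
    # Situation data: risk over time plus its peak.
    peak = max(scores)
    report = {"situation": info, "peak_score": peak,
              "peak_time": predicted_times[scores.index(peak)]}
    # Response plan: one measure per warning level that occurs.
    plan = {lv: RESPONSES[lv] for lv in dict.fromkeys(levels)}
    return info, report, plan

info, report, plan = build_warning([10, 12, 14], [0.25, 0.65, 0.9])
```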

[0014] The beneficial effects of this invention are: The intelligent agent system for highway bridge monitoring and early warning based on a multimodal large model proposed in this invention can effectively solve the problem of the difficulty in unified organization and comprehensive utilization of multi-source heterogeneous data in existing bridge monitoring and early warning technologies. This invention performs unified preprocessing and context construction on multi-source heterogeneous data such as real-time monitoring time-series data, bridge structural design documents, historical maintenance records, and inspection report texts. Data from different sources and in different forms are fused using bridge identifiers and component identifiers as association keys to form structured, time-ordered bridge operation status context information with clear semantic relationships. This significantly improves the completeness and consistency of bridge operation information expression, providing a reliable data foundation for subsequent reasoning and prediction.

[0015] This invention introduces an improved PaLI architecture large model, which jointly models bridge operation status context information with task prompts to achieve multimodal joint reasoning on bridge operation scenarios, structural component characteristics, and degradation stages. Compared to traditional methods that rely on a single data source or rule-driven approaches, this invention can comprehensively utilize time-series monitoring data and textual semantic information in a unified multimodal representation space to automatically identify prediction objects, prediction indicators, and prediction needs that match the current bridge operation status. This effectively reduces reliance on human experience and improves the intelligence and adaptability of prediction task generation.

[0016] In the predictive analysis phase, the decision-making center of the monitoring and early warning intelligent agent automatically schedules, based on the prediction task requirement information, the target time-series prediction algorithm that matches the current task from the time-series prediction algorithm library. This avoids the limitations of fixed prediction models or manual selection in existing technologies. This approach can flexibly select appropriate prediction models for prediction calculations based on different bridge components, different monitoring indicators, and different degradation stages, thereby improving the accuracy and stability of bridge structural condition prediction results and providing a more reliable predictive basis for risk assessment.

[0017] Furthermore, this invention integrates time-series prediction results with the inference output of a multimodal large model to construct a risk weighting and scenario correction mechanism. This comprehensive assessment of bridge operational risks more accurately reflects the actual risk level of bridges under different operational scenarios and degradation states. Based on the risk assessment results, this invention enables the generation of multi-level early warning information, the construction of early warning situation reports, and the output of contingency plans. This transforms bridge monitoring and early warning from a traditional passive alarm mode to a proactive early warning mode with predictability and foresight, contributing to improved scientific rigor and timeliness in bridge operation and maintenance management.

[0018] In summary, this invention achieves intelligent monitoring, prediction, and early warning of bridge operation status through the synergistic application of multi-source heterogeneous data fusion, multi-modal large model joint inference, and intelligent algorithm scheduling mechanism. It effectively overcomes the shortcomings of existing technologies in terms of insufficient data fusion capabilities, inflexible prediction task generation, and disconnect between risk assessment and early warning, and has high engineering application value and promotion significance.

Attached Figure Description

[0019] The accompanying drawings are provided to further illustrate the invention and form part of the specification. They are used in conjunction with embodiments of the invention to explain the invention and do not constitute a limitation thereof. In the drawings:
Figure 1 is an overall flowchart of the highway bridge monitoring and early warning intelligent agent system based on a multimodal large model proposed in this invention;
Figure 2 is a schematic diagram of the context information and task prompt words of the highway bridge monitoring and early warning intelligent agent system based on a multimodal large model proposed in this invention;
Figure 3 is a schematic diagram illustrating the construction of the improved PaLI architecture large model of the highway bridge monitoring and early warning intelligent agent system based on a multimodal large model proposed in this invention.

Detailed Implementation

[0020] The present invention will now be described in further detail with reference to the accompanying drawings. These drawings are simplified schematic diagrams, illustrating only the basic structure of the invention, and therefore only show the components relevant to the invention.

[0021] Referring to Figures 1-3, a highway bridge monitoring and early warning intelligent agent system based on a multimodal large model includes:
a data acquisition and preprocessing module, used to acquire and preprocess multi-source heterogeneous data of the target highway bridge;
a context construction and prompt word generation module, used to construct context information of the target highway bridge's operating status based on the preprocessed multi-source heterogeneous data, and to generate task prompt words based on the context information;
a multimodal reasoning module, used to input the context information and task prompt words into the improved PaLI architecture large model to reason about the bridge operation scenario, structural component characteristics and degradation stage, and to generate prediction task requirement information;
an algorithm scheduling module, used to automatically schedule, from the time series prediction algorithm library, the target time series prediction algorithm that matches the current prediction task based on the prediction task requirement information;
a time series prediction module, used to predict on the real-time monitoring time series data with the target time series prediction algorithm and generate prediction results of the bridge structural state;
a risk assessment module, used to generate bridge operation risk assessment results based on the prediction results and the inference output of the improved PaLI architecture large model;
an early warning output module, used to output corresponding multi-level early warning information, early warning situation reports and response plans based on the bridge operation risk assessment results.

[0022] In this embodiment, the modules are connected through the following steps:
Multi-source heterogeneous data of the target highway bridge is collected and preprocessed;
Based on the preprocessed multi-source heterogeneous data, context information of the target highway bridge's operating status is constructed, and corresponding task prompt words are generated according to the context information;
The context information and task prompt words are input into the improved PaLI architecture large model, which reasons about the bridge operation scenario, structural component characteristics and degradation stage to obtain the prediction task requirement information;
Based on the prediction task requirement information, the decision-making center of the monitoring and early warning intelligent agent automatically schedules, from the time series prediction algorithm library, the target time series prediction algorithm that matches the current prediction task;
The target time series prediction algorithm is applied to the real-time monitoring time series data to generate the prediction results of the bridge structural state;
Based on the prediction results, combined with the inference output of the improved PaLI architecture large model, the bridge operation risk assessment results are generated;
Based on the bridge operation risk assessment results, corresponding multi-level early warning information, early warning situation reports and response plans are output.

[0023] In this embodiment, the multi-source heterogeneous data includes real-time monitoring time-series data, bridge structure design documents, historical maintenance records, and daily inspection report text. The preprocessing includes time alignment, missing data processing, outlier processing, filtering, text cleaning, OCR text recognition, and standardization processing.

[0024] In this embodiment, the construction of the context information and the generation of task prompts specifically include: Based on the preprocessed multi-source heterogeneous data, the data is divided into sensor monitoring time series data set, text data set and structural description data set according to the data source. The sensor monitoring time series data set includes time series data formed by sensor channels. The text data set includes inspection reports, historical maintenance records and structural design documents. The structural description data set includes bridge component list and component hierarchy relationship data. The sensor monitoring time series data set, text data set and structural description data set are uniformly encapsulated to construct a data entry set, and each data entry in the data entry set is bound with a bridge identifier, component identifier, acquisition time identifier and data source type identifier; The data entries corresponding to the sensor monitoring time series data set are divided into time windows to generate a time series segment set; The time window segmentation is specifically as follows: taking the time series of each sensor channel in the sensor monitoring time series data set as input, aligning it according to a unified sampling time axis, pre-setting the window length and window step size, and sliding to extract continuous sampling points according to the window step size from the start time to form a time series segment, generating non-overlapping time series segments in sequence; when there are missing sampling points, the missing values ​​are first interpolated or removed before segmentation, and each time series segment is bound to the corresponding bridge identifier, component identifier, and start and end time identifier of the time series segment; The data entries corresponding to the text data set are structured to generate a text structured field set, and the data entries corresponding to the 
structure description data set are standardized to generate a structure semantic field set. Both the text structured field set and the structure semantic field set are associated with the corresponding bridge identifier and component identifier. The structured processing specifically involves parsing the data entries corresponding to the text data set, splitting the text content according to the preset field template, extracting and generating the component identifier field, event type field, event time field, and event location field, and merging or retaining duplicate or conflicting fields under the same bridge identifier and component identifier according to the time order. The standardization process specifically involves: uniformly standardizing the data entries corresponding to the structural description dataset, the bridge component list and component hierarchy data, mapping component names, types and hierarchy relationships from different sources and in different forms of expression to predefined standard component identifiers and types, and organizing the hierarchical relationships between components according to unified hierarchy rules; merging or correcting duplicate or inconsistent descriptions of the same component to form a consistent structural description result, and generating a set of structural semantic fields representing the bridge component types and component hierarchy relationships; Using bridge identifiers and component identifiers as association keys, the time sequence fragment set, text structured field set, and structural semantic field set are matched and associated. For each component, a component context entry is formed. The component context entry includes the time sequence fragment corresponding to the component, the inspection and maintenance event field associated with the component, and the structural semantic field of the component. The context entries are organized in chronological order to form context information. 
Based on the context information, task prompt words are generated. The task prompt words include the prediction object, prediction index, prediction time span, prediction time resolution, and prediction output format.
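The time window segmentation described in this paragraph can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Segment` class, the function names, the sample values, and the identifiers `bridge-01` and `girder-03` are hypothetical, and linear interpolation is only one of the two gap-handling options the text allows (interpolate or remove). Segments do not overlap when the window step is at least the window length.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Segment:
    bridge_id: str
    component_id: str
    start_index: int      # first sample position on the unified time axis
    end_index: int        # last sample position (inclusive)
    values: List[float]


def fill_missing(values: Sequence[Optional[float]]) -> List[float]:
    """Linearly interpolate interior gaps; copy the nearest sample at the edges."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("channel contains no valid samples")
    filled: List[float] = []
    for i, v in enumerate(values):
        if v is not None:
            filled.append(float(v))
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled.append(float(values[right]))
        elif right is None:
            filled.append(float(values[left]))
        else:
            weight = (i - left) / (right - left)
            filled.append(values[left] + weight * (values[right] - values[left]))
    return filled


def segment_channel(values: Sequence[Optional[float]], bridge_id: str,
                    component_id: str, window_length: int,
                    window_step: int) -> List[Segment]:
    """Slide a fixed-length window over one channel and bind identifiers to each segment."""
    if window_length <= 0 or window_step <= 0:
        raise ValueError("window_length and window_step must be positive")
    filled = fill_missing(values)
    segments: List[Segment] = []
    for start in range(0, len(filled) - window_length + 1, window_step):
        segments.append(Segment(bridge_id, component_id, start,
                                start + window_length - 1,
                                filled[start:start + window_length]))
    return segments


# Hypothetical channel with one missing sample; step == length gives non-overlapping windows.
channel = [1.0, None, 3.0, 4.0, 5.0, 6.0, 7.0]
segments = segment_channel(channel, "bridge-01", "girder-03",
                           window_length=3, window_step=3)
```

In this sketch a trailing remainder shorter than the window length is dropped; the source does not say how such a remainder should be handled.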

[0025] In this embodiment, obtaining the predicted task requirement information specifically includes: The context information and task prompts are input into the improved PaLI architecture model, which includes a structural semantic injection module, a temporal-text joint encoding module, and a prediction task reasoning and generation module. The structural semantic injection module maps the set of structural semantic fields in the context information to generate a structural semantic representation, thus obtaining structurally enhanced context information. The temporal-text joint encoding module jointly encodes the structurally enhanced context information and task prompts, aligns the set of temporal segments with the set of text structured fields across modalities, and generates a unified multimodal representation. The prediction task reasoning and generation module introduces a task reasoning mechanism with degradation stage constraints to perform task reasoning on the unified multimodal representation and obtain prediction task requirement information. In the structural semantic injection module, a structural semantic representation is generated based on the set of structural semantic fields in the context information to obtain structural enhanced context information. The generation process involves mapping the bridge component type information, component hierarchical relationship information, and component association relationship information in the set of structural semantic fields to a structural semantic representation, and then fusing the structural semantic representation with the context information to generate structural enhanced context information. 
In the temporal-text joint encoding module, structurally enhanced contextual information and task prompt words are jointly encoded, and the temporal segment set and the text structured field set are cross-modal aligned to generate a unified multimodal representation; The joint feature encoding specifically involves: under the association relationship between bridge identifiers and component identifiers defined by the structural enhancement context information, matching the temporal representations in the temporal segment set with the text representations in the text structured field set according to the bridge identifiers and component identifiers to form matching pairs, and selecting the corresponding matching pairs according to the objects and indicators defined by the task prompt words, and fusing the selected matching pairs according to the preset fusion rules; In the prediction task reasoning and generation module, key information related to bridge components and monitoring indicators is extracted based on the unified multimodal representation to obtain candidate prediction objects and candidate prediction indicators. A task reasoning mechanism with degradation stage constraints is introduced to perform task reasoning on the unified multimodal representation to obtain degradation stage identifiers corresponding to bridge components. The candidate prediction objects and candidate prediction indicators are filtered to select prediction objects and prediction indicators that are consistent with the degradation stage identifiers. Based on the selected prediction objects and indicators, the corresponding prediction time span and prediction time resolution are obtained. The prediction objects, prediction indicators, prediction time span, prediction time resolution, and degradation stage identifiers are organized into prediction task requirement information. 
The extraction process involves locating the unified multimodal representation by bridge identifier and component identifier to obtain the multimodal information fragments corresponding to each component. Within these fragments, the indicator values and time ranges are obtained according to a preset list of monitoring indicators and field rules to form monitoring indicator entries, and the component name, defect description, and event time are obtained from the structured text features to form component entries. The component entries are used as candidate prediction objects, and the monitoring indicator entries are used as candidate prediction indicators.
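The identifier-keyed matching step of the joint encoding can be sketched as a simple join. This is a hedged illustration only: it pairs plain dictionaries rather than learned representations, and the field names (`bridge_id`, `component_id`, `event_type`), the sample records, and the `target_component` parameter standing in for the object named by the task prompt are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Record = Dict[str, object]


def match_pairs(segments: List[Record], text_fields: List[Record],
                target_component: Optional[str] = None) -> List[Tuple[Record, Record]]:
    """Pair time series segments with text fields that share bridge and component identifiers."""
    fields_by_key: Dict[Tuple[object, object], List[Record]] = defaultdict(list)
    for field in text_fields:
        fields_by_key[(field["bridge_id"], field["component_id"])].append(field)

    pairs: List[Tuple[Record, Record]] = []
    for segment in segments:
        # Keep only the component selected by the task prompt, if one is given.
        if target_component is not None and segment["component_id"] != target_component:
            continue
        key = (segment["bridge_id"], segment["component_id"])
        for field in fields_by_key.get(key, []):
            pairs.append((segment, field))
    return pairs


# Hypothetical inputs for two components of one bridge.
segments = [
    {"bridge_id": "bridge-01", "component_id": "girder-03", "values": [1.0, 2.0, 3.0]},
    {"bridge_id": "bridge-01", "component_id": "pier-02", "values": [0.1, 0.1, 0.2]},
]
text_fields = [
    {"bridge_id": "bridge-01", "component_id": "girder-03", "event_type": "crack"},
    {"bridge_id": "bridge-01", "component_id": "pier-02", "event_type": "bearing wear"},
]
pairs = match_pairs(segments, text_fields, target_component="girder-03")
```

The preset fusion rules applied to the selected pairs are not specified in the text, so the sketch stops at pair selection.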

[0026] In this embodiment, the target time series prediction algorithm is obtained specifically as follows. The prediction task requirement information is input into the decision center of the monitoring and early warning intelligent agent, and the constraints for algorithm scheduling are obtained from the time series prediction algorithm library. The time series prediction algorithm library includes statistical models, machine learning models, and deep learning models. The constraints include model category, input data type, prediction time span, prediction time resolution, and degradation stage identifier. Candidate time series prediction algorithms are obtained from the time series prediction algorithm library. Based on the constraints for algorithm scheduling, the candidate time series prediction algorithms are matched and screened to obtain the target time series prediction algorithm that satisfies the prediction time span, prediction time resolution, and degradation stage identifier. The screening process is as follows: for each candidate time series prediction algorithm in the time series prediction algorithm library, the algorithm metadata is read. The algorithm metadata includes the prediction time span range, the prediction time resolution set, and the degradation stage identifier set of the candidate time series prediction algorithm. Using the algorithm scheduling constraints as screening rules, time resolution matching, time span matching, and degradation stage matching are performed in sequence on the candidate time series prediction algorithms. Time resolution matching requires the prediction time resolution to be a value in the prediction time resolution set of the candidate algorithm. Time span matching requires the prediction time span to fall within the prediction time span range of the candidate algorithm. Degradation stage matching requires the degradation stage identifier to belong to the degradation stage identifier set of the candidate algorithm. The candidates that pass all three checks form the target time series prediction algorithm set.
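The three-stage screening above amounts to successive filters over algorithm metadata. The sketch below is illustrative only; the library entries, the units (hours and minutes), and the stage labels `early`, `mid`, and `late` are hypothetical and do not come from the source.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class AlgorithmMeta:
    name: str
    span_range_hours: Tuple[float, float]   # supported prediction time span range
    resolutions_minutes: FrozenSet[int]      # supported prediction time resolutions
    degradation_stages: FrozenSet[str]       # supported degradation stage identifiers


@dataclass(frozen=True)
class TaskRequirement:
    span_hours: float
    resolution_minutes: int
    degradation_stage: str


def screen(library: List[AlgorithmMeta], task: TaskRequirement) -> List[AlgorithmMeta]:
    """Apply resolution, span, and degradation stage matching in that order."""
    by_resolution = [a for a in library
                     if task.resolution_minutes in a.resolutions_minutes]
    by_span = [a for a in by_resolution
               if a.span_range_hours[0] <= task.span_hours <= a.span_range_hours[1]]
    return [a for a in by_span if task.degradation_stage in a.degradation_stages]


# Hypothetical library with one statistical, one machine learning, one deep learning entry.
LIBRARY = [
    AlgorithmMeta("arima", (1, 72), frozenset({10, 60}), frozenset({"early", "mid"})),
    AlgorithmMeta("gbdt", (24, 240), frozenset({60}), frozenset({"mid", "late"})),
    AlgorithmMeta("lstm", (24, 720), frozenset({60, 1440}),
                  frozenset({"early", "mid", "late"})),
]

# Seven-day forecast at hourly resolution for a component in a mid degradation stage.
task = TaskRequirement(span_hours=168, resolution_minutes=60, degradation_stage="mid")
targets = [a.name for a in screen(LIBRARY, task)]
```

Here `arima` passes resolution matching but fails span matching, so only the other two entries remain. How a single algorithm is chosen when several survive is not stated in the text.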

[0027] In this embodiment, obtaining the prediction results of the bridge structural state specifically includes the following. Based on the prediction object and prediction indicators, the corresponding monitoring data are collected from the real-time monitoring time series data and arranged in chronological order to form a target monitoring sequence. The acquisition process is as follows: based on the sensor identifier corresponding to the prediction object and the data field corresponding to the prediction indicator in the real-time monitoring time series data, the target data channel is obtained and the acquisition time range is set, with the current time as the end time and the start time determined by the historical duration configured in the input window; the monitoring values at each monitoring time are read continuously from the target data channel according to the prediction time resolution; missing monitoring points are filled by interpolation between adjacent times; abnormal monitoring values are corrected by threshold-based clipping; and each monitoring time and its monitoring value are arranged in chronological order to form the target monitoring sequence. Based on the prediction time resolution, the target monitoring sequence is time-aligned to form an aligned monitoring sequence. The time alignment process specifically involves: establishing a unified time axis that uses the prediction time resolution as a fixed interval and covers the time range of the target monitoring sequence; mapping each monitoring value in the target monitoring sequence to the corresponding time point on the unified time axis; when multiple monitoring values fall on the same time point, retaining the one nearest in time; filling time points on the unified time axis that have no monitoring value by linear interpolation between adjacent time points; and applying boundary constraints to the interpolated monitoring values to obtain the aligned monitoring sequence arranged along the unified time axis. Based on the input window configuration corresponding to the target time series prediction algorithm, the algorithm input sequence for the prediction calculation is extracted from the aligned monitoring sequence. The extraction process is as follows: the input window configuration corresponding to the target time series prediction algorithm is read from the decision-making center of the monitoring and early warning intelligent agent. The input window configuration includes the input window length and the input window step size. Using the current time as the end time, the consecutive monitoring points covering the input window length are traced back in reverse chronological order in the aligned monitoring sequence and then rearranged in chronological order. When the input window step size is greater than the prediction time resolution, the aligned monitoring sequence is downsampled at equal intervals according to the input window step size and then truncated to form the algorithm input sequence. The number of prediction steps is obtained from the prediction time span and the prediction time resolution. The algorithm input sequence is then input into the target time series prediction algorithm to generate a prediction sequence according to the number of prediction steps and the prediction time resolution. The prediction sequence is associated with the prediction object to form the prediction result of the bridge structural state, which includes the prediction object, the prediction indicator, the prediction time sequence, and the corresponding predicted value sequence.
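A minimal sketch of the alignment, input window extraction, and step count computation follows. Several details are assumptions rather than statements from the source: timestamps are integer seconds, the input window length is counted in points, boundary constraints are modeled as clamping to hypothetical `lower` and `upper` limits, and the number of prediction steps is taken as the span divided by the resolution, rounded up.

```python
import math
from typing import List, Optional, Sequence, Tuple


def align(samples: Sequence[Tuple[int, float]], start: int, end: int,
          resolution: int, lower: float, upper: float) -> List[float]:
    """Map samples onto a fixed-interval axis, keep the nearest one per point, fill gaps."""
    grid = list(range(start, end + 1, resolution))
    nearest: List[Optional[Tuple[int, float]]] = [None] * len(grid)  # (distance, value)
    for t, value in samples:
        idx = round((t - start) / resolution)
        if 0 <= idx < len(grid):
            distance = abs(t - grid[idx])
            if nearest[idx] is None or distance < nearest[idx][0]:
                nearest[idx] = (distance, value)

    known = [i for i, slot in enumerate(nearest) if slot is not None]
    if not known:
        raise ValueError("no samples fall on the unified time axis")
    aligned: List[float] = []
    for i, slot in enumerate(nearest):
        if slot is not None:
            aligned.append(slot[1])
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            estimate = nearest[right][1]
        elif right is None:
            estimate = nearest[left][1]
        else:
            weight = (i - left) / (right - left)
            estimate = nearest[left][1] + weight * (nearest[right][1] - nearest[left][1])
        # Boundary constraint applied to interpolated values only.
        aligned.append(min(max(estimate, lower), upper))
    return aligned


def extract_input(aligned: Sequence[float], window_length: int,
                  window_step: int, resolution: int) -> List[float]:
    """Trace back from the newest point, downsample if needed, return in time order."""
    stride = window_step // resolution if window_step > resolution else 1
    newest_first = list(aligned)[::-1][::stride][:window_length]
    return newest_first[::-1]


def prediction_steps(span: int, resolution: int) -> int:
    """Number of forecast points needed to cover the prediction time span."""
    return math.ceil(span / resolution)


samples = [(0, 1.0), (10, 2.0), (12, 2.5), (30, 4.0)]
aligned = align(samples, start=0, end=30, resolution=10, lower=0.0, upper=2.8)
model_input = extract_input([1, 2, 3, 4, 5, 6, 7, 8], window_length=3,
                            window_step=20, resolution=10)
steps = prediction_steps(span=7 * 24 * 3600, resolution=3600)
```

In the example, the sample at time 12 loses to the sample at time 10 for the same grid point, and the empty point at time 20 is interpolated and then clamped to the upper limit.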

[0028] In this embodiment, the generation of the bridge operation risk assessment results specifically includes: The inference input is constructed by the predicted object, predicted index, predicted time sequence and corresponding predicted value sequence in the prediction results of the bridge structural state. The improved PaLI architecture large model performs joint inference on the bridge operation state and outputs the inference results of the bridge operation scenario, the inference results of the structural component features and the inference results of the degradation stage corresponding to the predicted object. Based on the inference results of structural component characteristics and degradation stage, the predicted value sequence is subjected to risk weighting processing to generate a risk score sequence that corresponds one-to-one with the predicted time sequence. The risk score sequence refers to the operational risk level of the predicted object at the corresponding predicted time. Based on the inference results of the bridge operation scenario, the risk score sequence is modified according to the scenario to generate the modified risk score sequence. The correction process specifically involves: reading the bridge operation scenario reasoning results in the decision-making center of the monitoring and early warning intelligent agent, mapping the bridge operation scenario reasoning results to the corresponding scenario correction parameters; and, based on the scenario correction parameters, correcting the risk scores corresponding to each prediction time in the risk score sequence one by one, adjusting the risk scores according to the bridge operation scenario based on the original values, and forming a corrected risk score sequence that corresponds one-to-one with the prediction time sequence. 
The corrected risk score sequence is associated and aggregated with the prediction object, prediction index, prediction time sequence, and corresponding predicted value sequence to generate the bridge operation risk assessment result.
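The source does not give formulas for risk weighting or scenario correction, so the following sketch uses purely illustrative ones. It assumes a score equal to the predicted value normalized by a hypothetical allowable limit and scaled by component and degradation stage weights, and a multiplicative scenario factor looked up from a hypothetical table, with scores clipped to the range 0 to 1.

```python
from typing import Dict, List, Sequence

# Hypothetical mapping from inferred operation scenario to a correction parameter.
SCENARIO_FACTORS: Dict[str, float] = {
    "normal": 1.0,
    "heavy_traffic": 1.5,
    "extreme_temperature": 1.2,
}


def risk_scores(predicted: Sequence[float], limit: float,
                component_weight: float, stage_weight: float) -> List[float]:
    """One risk score per prediction time (illustrative formula, not from the source)."""
    return [min(1.0, component_weight * stage_weight * abs(v) / limit)
            for v in predicted]


def apply_scenario(scores: Sequence[float], scenario: str) -> List[float]:
    """Correct each score by the scenario factor, keeping it within [0, 1]."""
    factor = SCENARIO_FACTORS[scenario]
    return [min(1.0, max(0.0, s * factor)) for s in scores]


# Hypothetical predicted deflection values (mm) against a 40 mm allowable limit.
predicted = [10.0, 20.0, 40.0]
scores = risk_scores(predicted, limit=40.0, component_weight=1.0, stage_weight=0.5)
corrected = apply_scenario(scores, "heavy_traffic")
```

Both sequences have the same length as the predicted value sequence, which preserves the one-to-one correspondence with the prediction time sequence that the text requires.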

[0029] In this embodiment, outputting the corresponding multi-level early warning information, early warning situation report, and response plan specifically includes the following. Based on preset risk level classification rules, each score in the risk score sequence of the risk assessment results is mapped to a level, generating a risk level sequence that corresponds one-to-one with the prediction time sequence. Based on the risk level sequence and the prediction time sequence, multi-level early warning information is generated, which includes the early warning level of the prediction object at each prediction time and the distribution of those levels over time. The warning levels and risk scores at each prediction time are summarized in chronological order to form situation data describing how risk changes over time, and a warning situation report is generated from the situation data. For the warning level at each prediction time in the multi-level early warning information, the corresponding response measures are obtained, forming a response plan that corresponds one-to-one with the warning levels.
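The level mapping and response plan lookup can be sketched with threshold bisection. The thresholds, the four color-coded level names, and the response measures below are hypothetical placeholders for the preset classification rules, which the source does not enumerate.

```python
from bisect import bisect_right
from typing import Dict, List, Sequence

# Hypothetical classification rules: three thresholds split scores into four levels.
THRESHOLDS = [0.25, 0.5, 0.75]
LEVELS = ["blue", "yellow", "orange", "red"]
RESPONSES: Dict[str, str] = {
    "blue": "continue routine monitoring",
    "yellow": "schedule a targeted inspection",
    "orange": "restrict heavy vehicles and inspect promptly",
    "red": "close the affected lanes and start emergency assessment",
}


def to_levels(scores: Sequence[float]) -> List[str]:
    """Map each risk score to a warning level, one per prediction time."""
    return [LEVELS[bisect_right(THRESHOLDS, s)] for s in scores]


def situation_data(times: Sequence[str], scores: Sequence[float]) -> List[Dict[str, object]]:
    """Chronological rows of time, score, and level for the situation report."""
    levels = to_levels(scores)
    return [{"time": t, "score": s, "level": lv}
            for t, s, lv in zip(times, scores, levels)]


def response_plan(levels: Sequence[str]) -> Dict[str, str]:
    """One response measure per warning level that actually occurs."""
    return {lv: RESPONSES[lv] for lv in dict.fromkeys(levels)}


times = ["day 1", "day 2", "day 3", "day 4"]
scores = [0.1, 0.3, 0.6, 0.9]
report_rows = situation_data(times, scores)
plan = response_plan([row["level"] for row in report_rows])
```

With `bisect_right`, a score exactly equal to a threshold falls into the higher level, which is a conservative choice made for this sketch.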

[0030] Example 1: This embodiment uses a continuous beam highway bridge as an application scenario. Located on a heavy-duty traffic artery, the bridge endures long-term heavy vehicle loads and complex environmental influences, resulting in varying degrees of performance degradation in its substructure and main beam components. The bridge management unit has deployed sensors for strain, acceleration, deflection, and temperature at key bridge components and has accumulated bridge structural design documents, maintenance records, and periodic manual inspection reports generated over many years of operation. However, in actual operation and maintenance, the existing monitoring system primarily relies on threshold alarms and single-index analysis, making it difficult to comprehensively utilize multi-source data to make an overall judgment on the bridge's operational status. Furthermore, it lacks effective predictive capabilities for future structural changes, resulting in delayed warnings and a high false alarm rate.

[0031] In this application scenario, the data acquisition and preprocessing module of this invention first integrates real-time monitoring data of the bridge, structural design documents, historical maintenance records, and inspection report texts. The real-time monitoring data undergoes time alignment, missing value processing, and outlier correction according to a unified timeline. Textual data is processed through structured processing to extract fields such as component identifiers, event types, event times, and defect descriptions. Structural description data is standardized and mapped to unified component identifiers and component hierarchical relationships. Through these processes, a multi-source heterogeneous data set centered on bridge identifiers and component identifiers is constructed, providing a foundation for subsequent context construction.

[0032] Building upon this foundation, the system uses components as units, matching and associating time-series segments generated by sensor monitoring, inspection and maintenance event fields associated with the components, and structural semantic fields of the components to form component context entries. These are then organized chronologically to create contextual information about the bridge's operational status. Simultaneously, based on the operational and maintenance needs of bridge management personnel, the system automatically generates task prompts containing prediction objects, prediction indicators, prediction time spans, and prediction time resolutions.

[0033] Subsequently, the aforementioned contextual information and task prompts are input into the improved PaLI architecture multimodal large model. This model incorporates bridge component type and hierarchical relationship information in the structural semantic injection module, and performs cross-modal alignment between monitoring time-series segments and inspection texts in the time-series-text joint encoding module to generate a unified multimodal representation. In the prediction task reasoning and generation module, the model performs joint reasoning on the bridge operation scenario, structural component characteristics, and degradation stage, automatically identifying the current degradation stage of the component and generating matching prediction task requirement information, such as predicting the deflection change trend of key sections of the main beam over the next seven days.

[0034] Based on the prediction task requirements, the decision-making center of the monitoring and early warning intelligent agent automatically schedules target time-series prediction algorithms that match the current prediction task from the time-series prediction algorithm library. The algorithm library includes statistical models, machine learning models, and deep learning models, with different models suitable for different time scales, different degradation stages, and different monitoring indicators. The system matches and filters candidate algorithms based on the prediction time span, prediction time resolution, and degradation stage identifier, avoiding the mismatch problems caused by manually fixing the model selection.

[0035] During the prediction execution phase, the system collects the corresponding monitoring data from the real-time monitoring time-series data based on the prediction object and indicators, constructs the target monitoring sequence, and aligns it according to the prediction time resolution. It then extracts the algorithm input sequence according to the input window configuration of the target time-series prediction algorithm. The prediction algorithm outputs a sequence of predicted values at multiple future prediction times, forming the prediction result of the bridge structural state.

[0036] During the risk assessment phase, the system inputs the prediction results into the improved PaLI architecture large model and, using the inference results for the bridge operation scenario, structural component characteristics, and degradation stage, applies risk weighting and scenario correction to the predicted value sequence. This generates a corrected risk score sequence, which is summarized together with the prediction object, prediction indicators, and prediction time sequence to form the bridge operation risk assessment result.

[0037] Finally, based on the bridge operation risk assessment results, the system maps the risk score sequence to a risk level sequence, generates multi-level early warning information, and on this basis, forms an early warning situation report and response plan, which are uniformly displayed on the application terminal, providing bridge managers with intuitive and continuous risk evolution information and decision support.

[0038] Through the above applications, this invention effectively solves the problems of insufficient utilization of multi-source data, rigid selection of prediction models, and delayed risk warning in traditional bridge monitoring systems, and realizes predictive monitoring and intelligent early warning of bridge operation status.

[0039] Table 1. Comparison of the effectiveness of the multimodal large model-based monitoring and early warning system with traditional methods.

[0040] As shown in Table 1, traditional bridge monitoring and early warning methods rely mainly on sensor time-series data and fail to make full use of inspection text and structural semantic information, resulting in a one-sided understanding of the bridge's operational status. In contrast, the system of this invention significantly improves the completeness of monitoring information through multi-source heterogeneous data fusion. Regarding prediction accuracy, the intelligently scheduled and adapted time-series prediction algorithm reduces the average prediction error of key monitoring indicators from 12.8% with traditional methods to 5.3%, yielding more stable and reliable prediction results.

[0041] Regarding the timeliness of early warnings, traditional methods typically trigger alarms only after indicators approach or exceed thresholds, resulting in limited lead time. In contrast, this invention, through prediction of future structural conditions and risk assessment, can identify potential risks 3 to 5 days in advance, providing bridge management units with ample intervention time. Furthermore, by introducing multimodal reasoning and scenario correction mechanisms, this invention effectively reduces the false alarm rate, avoiding frequent invalid alarms that interfere with maintenance work.

[0042] As can be seen from the above embodiments, the present invention has good applicability and significant technical effects in real bridge operation scenarios, and can provide reliable support for intelligent monitoring and safety early warning of highway bridges.

[0043] The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitutions or modifications made by those skilled in the art within the scope of the technology disclosed in the present invention, based on the technical solution and inventive concept of the present invention, should be covered within the scope of protection of the present invention.

Claims

1. A multi-modal large model-based highway bridge monitoring and early warning intelligent agent system, characterized in that it includes: a data acquisition and preprocessing module, used to acquire and preprocess multi-source heterogeneous data of the target highway bridge; a context construction and prompt word generation module, used to construct context information of the target highway bridge's operating status based on the preprocessed multi-source heterogeneous data, and to generate task prompt words based on the context information; a multimodal reasoning module, used to input the context information and task prompt words into the improved PaLI architecture large model to reason about the bridge operation scenario, structural component characteristics, and degradation stage, and to generate prediction task requirement information; an algorithm scheduling module, used to automatically schedule, from the time series prediction algorithm library, the target time series prediction algorithm that matches the current prediction task based on the prediction task requirement information; a time series prediction module, used to predict from the real-time monitoring time series data based on the target time series prediction algorithm and to generate prediction results of the bridge structural state; a risk assessment module, used to generate bridge operation risk assessment results based on the prediction results and the inference output of the improved PaLI architecture large model; and an early warning output module, used to output the corresponding multi-level early warning information, early warning situation reports, and response plans based on the bridge operation risk assessment results.

2. The multi-modal large model-based highway bridge monitoring and early warning intelligent agent system according to claim 1, characterized in that, The modules are connected in the following way: Collect and preprocess multi-source heterogeneous data of the target highway bridge; Based on the preprocessed multi-source heterogeneous data, contextual information of the target highway bridge's operating status is constructed, and corresponding task prompt words are generated according to the contextual information. By inputting contextual information and task prompts into the improved PaLI architecture model, reasoning is performed on the bridge operation scenario, structural component characteristics and degradation stage to obtain the predicted task requirement information. Based on the prediction task requirements, the decision-making center of the monitoring and early warning intelligent agent automatically schedules the target time series prediction algorithm that matches the current prediction task from the time series prediction algorithm library; The bridge structural state is predicted by using the target time series prediction algorithm to predict the real-time monitoring time series data. Based on the prediction results, combined with the inference output of the improved PaLI architecture large model, the bridge operation risk assessment results are generated. Based on the bridge operation risk assessment results, corresponding multi-level early warning information, early warning status reports and response plans are output.

3. The multi-modal large model-based highway bridge monitoring and early warning intelligent agent system according to claim 2, characterized in that, The multi-source heterogeneous data includes real-time monitoring time-series data, bridge structure design documents, historical maintenance records, and daily inspection report texts. The preprocessing includes time alignment, missing data handling, outlier handling, filtering, text cleaning, OCR text recognition, and standardization processing.

4. The multi-modal large model-based highway bridge monitoring and early warning intelligent agent system according to claim 2, characterized in that, The construction of the context information and the generation of task prompts specifically include: Based on the preprocessed multi-source heterogeneous data, the data is divided into sensor monitoring time series data set, text data set and structural description data set according to the data source. The sensor monitoring time series data set includes time series data formed by sensor channels. The text data set includes inspection reports, historical maintenance records and structural design documents. The structural description data set includes bridge component list and component hierarchy relationship data. The sensor monitoring time series data set, text data set and structural description data set are uniformly encapsulated to construct a data entry set, and each data entry in the data entry set is bound with a bridge identifier, component identifier, acquisition time identifier and data source type identifier; The data entries corresponding to the sensor monitoring time series data set are divided into time windows to generate a time series segment set; The data entries corresponding to the text data set are structured to generate a text structured field set, and the data entries corresponding to the structure description data set are standardized to generate a structure semantic field set. Both the text structured field set and the structure semantic field set are associated with the corresponding bridge identifier and component identifier. Using bridge identifiers and component identifiers as association keys, the time sequence fragment set, text structured field set, and structural semantic field set are matched and associated. For each component, a component context entry is formed. 
The component context entry includes the time sequence fragment corresponding to the component, the inspection and maintenance event field associated with the component, and the structural semantic field of the component. The context entries are organized in chronological order to form context information. Based on the context information, task prompt words are generated. The task prompt words include the prediction object, prediction index, prediction time span, prediction time resolution, and prediction output format.

5. The intelligent agent system for monitoring and early warning of highway bridges based on a multimodal large model according to claim 2, characterized in that, The acquisition of the prediction task requirement information specifically includes: The context information and task prompts are input into the improved PaLI architecture model, which includes a structural semantic injection module, a temporal-text joint encoding module, and a prediction task reasoning and generation module. The structural semantic injection module maps the set of structural semantic fields in the context information to generate a structural semantic representation, thus obtaining structurally enhanced context information. The temporal-text joint encoding module jointly encodes the structurally enhanced context information and task prompts, aligns the set of temporal segments with the set of text structured fields across modalities, and generates a unified multimodal representation. The prediction task reasoning and generation module introduces a task reasoning mechanism with degradation stage constraints to perform task reasoning on the unified multimodal representation and obtain prediction task requirement information. In the structural semantic injection module, a structural semantic representation is generated based on the set of structural semantic fields in the context information to obtain structural enhanced context information. The generation process involves mapping the bridge component type information, component hierarchical relationship information, and component association relationship information in the set of structural semantic fields to a structural semantic representation, and then fusing the structural semantic representation with the context information to generate structural enhanced context information. 
In the temporal-text joint encoding module, structurally enhanced contextual information and task prompt words are jointly encoded, and the temporal segment set and the text structured field set are cross-modal aligned to generate a unified multimodal representation; In the prediction task reasoning and generation module, key information related to bridge components and monitoring indicators is extracted based on the unified multimodal representation to obtain candidate prediction objects and candidate prediction indicators. A task reasoning mechanism with degradation stage constraints is introduced to perform task reasoning on the unified multimodal representation to obtain degradation stage identifiers corresponding to bridge components. The candidate prediction objects and candidate prediction indicators are screened, and prediction objects and prediction indicators consistent with the degradation stage identifiers are selected. Based on the selected prediction objects and indicators, the corresponding prediction time span and prediction time resolution are obtained. The prediction objects, prediction indicators, prediction time span, prediction time resolution, and degradation stage identifiers are organized into prediction task requirement information.

6. The intelligent agent system for monitoring and early warning of highway bridges based on a multimodal large model according to claim 2, characterized in that, The specific steps involved in obtaining the target time series prediction algorithm are as follows: The prediction task requirement information is input into the decision center of the monitoring and early warning intelligent agent, and the constraints for algorithm scheduling are obtained from the time series prediction algorithm library. The time series prediction algorithm library includes statistical models, machine learning models and deep learning models. The constraints include model category, input data type, prediction time span, prediction time resolution and degradation stage identifier. Candidate time series prediction algorithms are obtained from the time series prediction algorithm library. Based on the constraints of algorithm scheduling, the candidate time series prediction algorithms are matched and selected to obtain the target time series prediction algorithm that meets the prediction time span, prediction time resolution and degradation stage identifier.

7. The intelligent agent system for monitoring and early warning of highway bridges based on a multimodal large model according to claim 2, characterized in that the prediction result of the bridge structural state is obtained as follows: Based on the prediction object and prediction indicators, the corresponding monitoring data are collected from the real-time monitoring time series data and arranged in chronological order to form a target monitoring sequence; based on the prediction time resolution, the target monitoring sequence is time-aligned to form an aligned monitoring sequence; based on the input window configuration of the target time series prediction algorithm, the algorithm input sequence for the prediction calculation is extracted from the aligned monitoring sequence; the number of prediction steps is obtained from the prediction time span and the prediction time resolution. The algorithm input sequence is then input into the target time series prediction algorithm, which generates a prediction sequence according to the number of prediction steps and the prediction time resolution. The prediction sequence is associated with the prediction object to form the prediction result of the bridge structural state. The prediction result of the bridge structural state includes the prediction object, the prediction indicator, the prediction time sequence, and the corresponding predicted value sequence.
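The pipeline in this claim (collect, align, window, compute the step count, predict, package) can be sketched as follows. This is an illustration under stated assumptions: alignment is done by averaging samples in fixed-width bins and forward-filling empty bins, the step count rounds up when the span is not a multiple of the resolution, and `naive_drift` is a stand-in forecaster, not the target time series prediction algorithm of the patent. All function and key names are hypothetical.

```python
import math
from datetime import datetime, timedelta


def align(samples, resolution):
    """Average (timestamp, value) samples into fixed bins; forward-fill empty bins."""
    samples = sorted(samples)
    start = samples[0][0]
    bins = {}
    for ts, value in samples:
        bins.setdefault(int((ts - start) / resolution), []).append(value)
    aligned, last = [], None
    for idx in range(max(bins) + 1):
        if idx in bins:
            last = sum(bins[idx]) / len(bins[idx])
        aligned.append((start + idx * resolution, last))
    return aligned


def naive_drift(window, steps):
    """Stand-in model: extend the average step-to-step change of the window."""
    slope = (window[-1] - window[0]) / (len(window) - 1)
    return [window[-1] + slope * (k + 1) for k in range(steps)]


def predict_state(obj, indicator, samples, span, resolution, window_size,
                  model=naive_drift):
    """Build the bridge structural state prediction result for one object."""
    aligned = align(samples, resolution)
    window = [v for _, v in aligned[-window_size:]]   # algorithm input sequence
    steps = math.ceil(span / resolution)              # number of prediction steps
    values = model(window, steps)
    last_ts = aligned[-1][0]
    times = [last_ts + resolution * (k + 1) for k in range(steps)]
    return {"object": obj, "indicator": indicator, "times": times, "values": values}
```

A real deployment would pass the selected algorithm as `model` and take `window_size` from that algorithm's input window configuration.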

8. The intelligent agent system for monitoring and early warning of highway bridges based on a multimodal large model according to claim 2, characterized in that the bridge operation risk assessment result is generated as follows: The inference input is constructed from the prediction object, prediction indicator, prediction time sequence, and corresponding predicted value sequence in the prediction result of the bridge structural state. The improved PaLI architecture large model performs joint inference on the bridge operation state and outputs the inference result for the bridge operation scenario, the inference result for the structural component features, and the inference result for the degradation stage corresponding to the prediction object. Based on the inference results for the structural component features and the degradation stage, the predicted value sequence is risk-weighted to generate a risk score sequence that corresponds one-to-one with the prediction time sequence. Each entry of the risk score sequence indicates the operational risk level of the prediction object at the corresponding prediction time. Based on the inference result for the bridge operation scenario, the risk score sequence is corrected for the scenario to generate a revised risk score sequence. The revised risk score sequence is associated with the prediction object, prediction indicator, prediction time sequence, and corresponding predicted value sequence and summarized to generate the bridge operation risk assessment result.
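The claim does not give formulas for risk weighting or scenario correction, so the sketch below uses one simple possibility purely for illustration: a score equal to the predicted value normalized by an assumed limit threshold, multiplied by component and stage weights, then scaled by a scenario factor and capped at 1. The weight tables, the threshold, and the function names are hypothetical, and the component, stage, and scenario labels are assumed to come from the large model's inference results.

```python
# Hypothetical weights; the patent does not disclose concrete values.
COMPONENT_WEIGHT = {"main_girder": 1.0, "bearing": 0.8}
STAGE_WEIGHT = {"early": 0.5, "accelerated": 1.0}
SCENARIO_FACTOR = {"normal": 1.0, "heavy_traffic": 1.2, "typhoon": 1.5}


def risk_scores(values, threshold, component, stage):
    """Risk-weight each predicted value; one score per prediction time."""
    weight = COMPONENT_WEIGHT[component] * STAGE_WEIGHT[stage]
    return [min(1.0, weight * abs(v) / threshold) for v in values]


def apply_scenario(scores, scenario):
    """Scenario correction: scale scores by the scenario factor, capped at 1."""
    factor = SCENARIO_FACTOR[scenario]
    return [min(1.0, s * factor) for s in scores]


def assess(prediction, threshold, component, stage, scenario):
    """Attach the revised risk score sequence to the prediction result."""
    base = risk_scores(prediction["values"], threshold, component, stage)
    revised = apply_scenario(base, scenario)
    return {**prediction, "risk_scores": revised}
```

Because both steps are element-wise, the revised risk score sequence keeps the one-to-one correspondence with the prediction time sequence that the claim requires.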

9. The intelligent agent system for monitoring and early warning of highway bridges based on a multimodal large model according to claim 2, characterized in that the corresponding multi-level early warning information, early warning situation report, and response plan are output as follows: Based on preset risk level classification rules, each score in the risk score sequence of the risk assessment result is mapped to a risk level, generating a risk level sequence that corresponds one-to-one with the prediction time sequence. Based on the risk level sequence and the prediction time sequence, multi-level early warning information is generated. The multi-level early warning information includes the early warning level of the prediction object at each prediction time in the prediction time sequence and the distribution of those levels over time. The warning levels and risk scores corresponding to each prediction time are summarized in chronological order to form situation data describing how the risk changes over time, and an early warning situation report is generated from the situation data. For the warning level at each prediction time in the multi-level early warning information, the corresponding response measures are obtained, forming a response plan that corresponds one-to-one with the warning levels.
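The level mapping, situation report, and response plan can be sketched as below. The four colour levels, their score boundaries, and the response measures are placeholder values chosen for the example; the claim only states that preset classification rules and per-level response measures exist.

```python
# Hypothetical classification rules: (lower bound of score, level), highest first.
LEVEL_BOUNDS = [(0.75, "red"), (0.5, "orange"), (0.25, "yellow"), (0.0, "blue")]
# Hypothetical response measures per warning level.
RESPONSE = {
    "blue": "continue routine monitoring",
    "yellow": "schedule a targeted inspection",
    "orange": "restrict heavy vehicles and inspect on site",
    "red": "close the bridge and start emergency assessment",
}


def to_level(score):
    """Map one risk score to a warning level using the preset bounds."""
    for lower, level in LEVEL_BOUNDS:
        if score >= lower:
            return level
    return LEVEL_BOUNDS[-1][1]


def build_warning(obj, times, scores):
    """Return (multi-level warning info, situation report text, response plan)."""
    levels = [to_level(s) for s in scores]
    warnings = [{"object": obj, "time": t, "level": lv}
                for t, lv in zip(times, levels)]
    # Situation data: level and score per prediction time, in chronological order.
    lines = [f"{t}: level={lv}, score={s:.2f}"
             for t, lv, s in zip(times, levels, scores)]
    report = "\n".join(lines)
    # One response entry per distinct level that actually occurs.
    plan = {lv: RESPONSE[lv] for lv in dict.fromkeys(levels)}
    return warnings, report, plan
```

Keeping the plan keyed by warning level, rather than by time, mirrors the claim's requirement that the response plan corresponds one-to-one with the warning levels.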