Automatic control system for the preparation of environmentally friendly lightweight concrete for coastal wind power

By acquiring multi-source heterogeneous data and evaluating neural network models, and dynamically adjusting control strategies, the stability problem of the control system during the casting of coastal wind power foundations was solved, and robust operation was achieved in high salt spray and high humidity environments.

CN122018333B · Active · Publication Date: 2026-06-30 · SHANDONG CENTURY XINYUAN CONSTR TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SHANDONG CENTURY XINYUAN CONSTR TECH CO LTD
Filing Date
2026-04-10
Publication Date
2026-06-30

Smart Images

  • Figure CN122018333B_ABST

Abstract

This invention relates to the fields of industrial automation control and building material preparation, specifically an automatic control system for the preparation of environmentally friendly lightweight concrete for coastal wind power. It includes units for multi-source heterogeneous data acquisition, state assessment and risk quantification, dynamic risk hedging decision-making, and system confidence repair and closed-loop control. The system combines sensor data streams with coastal environmental disturbance characteristics, utilizes neural networks to calculate state assessment entropy, and predicts the safe time window before the failure boundary of the cold joint during pouring. Its core is based on comparing this window with a dangerous threshold, dynamically switching between nonlinear optimal or suboptimal robust control laws to adjust the water addition and stirring rate, and achieving closed-loop repair of the system by recalculating the assessment entropy. This invention overcomes the shortcomings of traditional systems that lack multi-source heterogeneous data fusion assessment under microclimate changes, accurately predicts the safe time window, and effectively avoids control instability and cold joint formation caused by environmental changes and data conflicts.

Description

Technical Field

[0001] This invention relates to the fields of industrial automation control and building material preparation, specifically to an automatic control system for the preparation of environmentally friendly lightweight concrete for coastal wind power.

Background Technology

[0002] In the continuous casting of coastal offshore wind power foundations, the preparation of environmentally friendly lightweight concrete is a key link. Because the site is in a high salt spray and high humidity environment and the casting process cannot be interrupted, traditional control systems, which mostly rely on static optimization of a single parameter and lack fused evaluation of multi-source heterogeneous sensor data under microclimate changes, are prone to control instability that causes cold joints in the casting.

[0003] In existing technologies, although some systems attempt conventional closed-loop regulation, they generally suffer from inaccurate assessment of control state confidence and insufficient intelligence in dealing with data conflicts. At the same time, existing logic ignores the spatiotemporal misalignment of sensors caused by sudden environmental changes, making the control prone to failure under complex disturbances. In addition, when facing failure thresholds or manual intervention, there is a lack of dynamic degradation and closed-loop repair mechanisms, making it difficult to adapt to the robust operation requirements of continuous pouring under high disturbance conditions.

[0004] Therefore, how to provide an automatic control system for the preparation of environmentally friendly lightweight concrete for coastal wind power is a problem that urgently needs to be solved by those skilled in the art.

Summary of the Invention

[0005] To address the aforementioned technical problems, this invention provides an automatic control system for the preparation of environmentally friendly lightweight concrete for coastal wind power, applied to a concrete preparation platform including a sensor array and on-site actuators. Specifically, the technical solution of this invention includes:

[0006] Multi-source heterogeneous data acquisition unit: Acquires multi-source sensor data streams containing moisture content of light solid waste raw materials, mixer operating status, and coastal environmental temperature and humidity through the sensor group; extracts coastal environmental disturbance characteristics from the multi-source sensor data streams.

[0007] State assessment and risk quantification unit: Based on the multi-source sensor data stream and the coastal environmental disturbance characteristics, construct a control state assessment model based on a neural network; calculate the state assessment entropy through the control state assessment model according to the time series; and predict the safe time window of the current control system from the preset failure boundary of the cold joint in the casting process based on the state assessment entropy.

[0008] Dynamic risk hedging decision unit: compares the safe time window with a preset danger threshold; if the safe time window is greater than or equal to the danger threshold, activates the nonlinear optimal control law to generate a first control command for adjusting the water addition or stirring rate; if the safe time window is less than the danger threshold, triggers the control degradation mode and activates the preset suboptimal robust control law to generate a second control command for adjusting the water addition or stirring rate.

[0009] System confidence repair and closed-loop unit: sends the first control command or the second control command to the field actuator; collects the feedback status of the field actuator and recalculates the state evaluation entropy; if the recalculated state evaluation entropy converges to the preset safe interval, the nonlinear optimal control law is restored; if the recalculated state evaluation entropy does not converge to the safe interval, the suboptimal robust control law is maintained and the control state evaluation model is updated.

[0010] Preferably, the method for extracting coastal environmental disturbance features from the multi-source sensor data stream includes:

[0011] Identify temperature and humidity abrupt change signals in the multi-source sensor data stream;

[0012] Acquire spatiotemporal misalignment data of heterogeneous sensors from the multi-source sensor data stream;

[0013] The temperature and humidity abrupt change signal is fused with the spatiotemporal misalignment data from the heterogeneous sensor to generate the coastal environmental disturbance characteristics.

[0014] Preferably, the method for calculating the state evaluation entropy through the control state evaluation model includes:

[0015] The multi-source sensor data stream is mapped to a unified dimensionless feature space to generate feature vectors, a preset reference consistency vector is obtained, and the feature vector numerical deviation rate in the dimensionless feature space is calculated based on the difference between the feature vector and the reference consistency vector.

[0016] When the deviation rate of the feature vector is less than the preset deviation threshold, the data is determined to be consistent. After converting each feature value in the feature vector into a weighted distribution, the information entropy of the feature vector in the dimensionless feature space is calculated as the basic entropy value, and the data consistency anomaly factor is set to zero.

[0017] When the deviation rate of the feature vector is greater than or equal to the deviation threshold, the data is determined to be contradictory. The information entropy of the feature vector in the dimensionless feature space is calculated as the basic entropy value, and the ratio of the deviation rate of the feature vector to the deviation threshold is extracted as the data consistency anomaly factor.

[0018] The state assessment entropy is obtained by linearly weighting and summing the basic entropy value, the data consistency anomaly factor, and the quantized value obtained by quantifying the coastal environmental disturbance characteristics; wherein, the state assessment entropy is used to characterize the degree of anomaly in the confidence of the control system commands.
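The calculation in paragraphs [0015] to [0018] can be sketched as follows. This is a minimal illustration only: the source computes the entropy through a neural-network-based model, whereas this sketch applies the stated formula directly. The mean relative deviation used for the deviation rate, the normalization of the information entropy, the deviation threshold of 0.15, and the weighting coefficients are all assumptions introduced here, not values from the source.

```python
import math

def deviation_rate(features, reference):
    # Assumed definition: mean relative deviation from the reference consistency vector.
    return sum(abs(f - r) / abs(r) for f, r in zip(features, reference)) / len(features)

def information_entropy(features):
    # Convert the feature values into a weight distribution, then take the
    # Shannon entropy, normalized to [0, 1] (normalization is an assumption).
    total = sum(features)
    weights = [f / total for f in features]
    h = -sum(w * math.log(w) for w in weights if w > 0)
    return h / math.log(len(features))

def state_assessment_entropy(features, reference, disturbance,
                             deviation_threshold=0.15, coeffs=(0.5, 0.3, 0.2)):
    rate = deviation_rate(features, reference)
    base = information_entropy(features)
    # Consistent data: anomaly factor is zero. Contradictory data: ratio of
    # the deviation rate to the deviation threshold.
    anomaly = 0.0 if rate < deviation_threshold else rate / deviation_threshold
    a, b, c = coeffs  # hypothetical linear weights
    return a * base + b * anomaly + c * disturbance
```

With these assumed coefficients, a feature vector that matches its reference contributes only the basic entropy term, while a contradictory vector adds an anomaly factor greater than or equal to one.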

[0019] Preferably, the method for predicting the safe time window of the current control system from the preset failure boundary of the cold joint based on the state evaluation entropy includes:

[0020] Obtain system operation history logs;

[0021] Based on the system operation history log, extract the time series of critical entropy values corresponding to the failure boundary of the cold joint in the casting process.

[0022] The difference between the current state assessment entropy and the previous state assessment entropy recorded in the system operation history log is calculated and divided by the sampling time interval corresponding to the multi-source sensor data stream to obtain the entropy deterioration rate.

[0023] When the entropy deterioration rate is greater than a preset small fluctuation threshold, the difference between the corresponding critical entropy value in the critical entropy value time series and the state evaluation entropy at the current moment is divided by the entropy deterioration rate to calculate the remaining time before the failure boundary of the cold joint is reached, which is used as the safe time window.
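The two divisions described in paragraphs [0022] and [0023] can be sketched as below. The function names and the small fluctuation threshold value are hypothetical. The source does not state what the window is when the deterioration rate is at or below that threshold, so returning an unbounded window in that case is an assumption of this sketch.

```python
import math

def entropy_deterioration_rate(current, previous, sampling_interval):
    # Difference between consecutive state assessment entropies per unit time.
    return (current - previous) / sampling_interval

def safe_time_window(current, critical, rate, small_fluctuation_threshold=1e-4):
    # Assumption: with no meaningful deterioration, treat the window as unbounded.
    if rate <= small_fluctuation_threshold:
        return math.inf
    # Remaining time before the critical entropy value is reached.
    return (critical - current) / rate
```

For instance, a critical entropy of 0.92, a current entropy of 0.68, and a deterioration rate of 0.008 per minute give a window of about 30 minutes.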

[0024] Preferably, the method of activating the nonlinear optimal control law to generate the first control command for adjusting the water addition or stirring rate includes:

[0025] Load a pre-defined multi-agent deep reinforcement learning model;

[0026] The multi-source sensor data stream is input into the multi-agent deep reinforcement learning model;

[0027] Based on the multi-agent deep reinforcement learning model, a nonlinear optimal solution is generated with the joint optimization objectives of minimizing the moisture content fluctuation compensation delay time and minimizing the energy consumption of the field actuator. The moisture content fluctuation compensation delay time refers to the period from when the multi-source sensor data stream detects a sudden change in the moisture content of the light solid waste raw material until the field actuator adjusts its output to bring the data items reflecting the moisture content of the light solid waste raw material in the multi-source sensor data stream back to the steady-state threshold band pre-calibrated based on historical stable operation data.

[0028] The nonlinear optimal solution is converted into the first control command.

[0029] Preferably, the method of triggering the control degradation mode and activating the preset suboptimal robust control law to generate the second control command for adjusting the water addition or stirring rate includes:

[0030] Freeze the adaptive weights of the multi-agent deep reinforcement learning model;

[0031] Extract the conservative dead zone control parameters corresponding to the control degradation mode from the preset system configuration library;

[0032] Based on the conservative dead zone control parameters, the joint optimization objective is abandoned, and a suboptimal robust control law is constructed with minimizing the state evaluation entropy as the single objective.

[0033] The suboptimal robust control law is executed to generate the second control command; wherein the second control command reduces the response frequency to fluctuations in the multi-source sensor data stream.
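One way to picture the reduced response frequency of the second control command is a dead-zone (dead-band) adjustment with a clamped step size. The sketch below is an illustration under stated assumptions: it acts on a hypothetical control error, and it does not reproduce the entropy-minimizing objective of the source. The parameter names `dead_zone`, `gain`, and `max_step` are invented for this example.

```python
def dead_zone_adjustment(error, dead_zone, gain, max_step):
    # Inside the dead zone, small fluctuations produce no actuator movement.
    if abs(error) <= dead_zone:
        return 0.0
    # Outside it, respond only to the part of the error beyond the dead zone.
    excess = error - dead_zone if error > 0 else error + dead_zone
    step = gain * excess
    # Clamp the step so the adjustment trajectory stays gentle.
    return max(-max_step, min(max_step, step))
```

Widening `dead_zone` makes the actuator ignore more of the fluctuation in the data stream, which is the conservative behavior that the degradation mode aims for.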

[0034] Preferably, the method of collecting the feedback status of the field actuator and recalculating the status assessment entropy includes:

[0035] Monitor whether a manual overrun control signal is generated in the field actuator;

[0036] When no manual overrun control signal is detected, the frequency of manual intervention is recorded as zero, and the mechanical execution feedback data generated by the field actuator is directly collected as the feedback status.

[0037] When the manual overrun control signal is detected, the frequency of manual intervention is recorded, the control quantity contained in the manual overrun control signal is extracted, and the control quantity is weighted and spliced with the mechanical execution feedback data generated by the field actuator to generate the feedback state.

[0038] The feedback state is input into the control state evaluation model, and the updated state evaluation entropy is recalculated.

[0039] Preferably, if the recalculated state evaluation entropy converges to a preset safety interval, the method for restoring the nonlinear optimal control law includes:

[0040] Compare the updated state evaluation entropy with the upper and lower limits of the preset safety interval;

[0041] If the updated state evaluation entropy is greater than the lower limit of the preset safety interval and less than the upper limit of the preset safety interval, then the system confidence is determined to have recovered.

[0042] Unfreeze the adaptive weights of the multi-agent deep reinforcement learning model;

[0043] Reactivate the joint optimization objective and resume outputting the first control command.

[0044] Preferably, if the recalculated state evaluation entropy does not converge to the safe interval, the method of maintaining the suboptimal robust control law and updating the control state evaluation model includes:

[0045] If the updated state evaluation entropy is greater than or equal to the upper limit of the preset safety interval, or less than or equal to the lower limit of the preset safety interval, then the abnormal state of the system confidence is determined to continue.

[0046] Maintain the output of the second control command;

[0047] The product of the frequency of manual intervention and the data consistency anomaly factor is used as a penalty term to construct a penalty function;

[0048] The prediction error between the state evaluation entropy output by the control state evaluation model and the actual risk label dynamically calculated by the posterior calibration rule, combined with the penalty function, constitutes a loss function. The loss function is then used for backpropagation to correct the evaluation weights of the control state evaluation model, thereby completing the update of the control state evaluation model.
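The loss described in paragraphs [0047] and [0048] can be sketched numerically as follows. The squared-error form of the prediction error, the additive combination, and the `penalty_weight` coefficient are assumptions; the source only states that the prediction error and the penalty function together constitute the loss.

```python
def penalty_term(intervention_frequency, anomaly_factor):
    # Product of manual intervention frequency and data consistency anomaly factor.
    return intervention_frequency * anomaly_factor

def evaluation_loss(predicted_entropy, risk_label, intervention_frequency,
                    anomaly_factor, penalty_weight=0.1):
    # Assumed squared error between the model output and the posterior risk label.
    prediction_error = (predicted_entropy - risk_label) ** 2
    return prediction_error + penalty_weight * penalty_term(
        intervention_frequency, anomaly_factor)
```

In this additive form the penalty raises the loss value whenever operators intervene on contradictory data. How the source couples the penalty to the model weights during backpropagation is not specified, so this sketch stops at the scalar loss.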

[0049] Preferably, the coastal environmental disturbance characteristics are physical property jump characteristics caused by the reversal of coastal sea fog microclimate.

[0050] Compared with the prior art, the present invention has the following beneficial effects:

[0051] 1. This invention generates coastal environmental disturbance characteristics by extracting temperature and humidity abrupt change signals from multi-source sensor data streams and fusing them with spatiotemporally misaligned data from heterogeneous sensors, and then uses a neural network to calculate the state assessment entropy. This overcomes the shortcomings of traditional systems that lack multi-source heterogeneous data fusion assessment under microclimate abrupt changes, and can accurately predict the safe time window from the failure boundary of cold joints in casting, effectively avoiding control instability and cold joint formation caused by environmental abrupt changes and data conflicts.

[0052] 2. This invention uses a dynamic decision-making process that compares a safe time window with a dangerous threshold. When safe, a multi-agent deep reinforcement learning model is activated to generate a nonlinear optimal control law; when approaching danger, a control degradation mode is triggered, activating a suboptimal robust control law to reduce the response frequency. This solves the problem of existing systems lacking a dynamic degradation mechanism at critical failure points, avoids frequent oscillations of the actuator under high-risk conditions, and achieves stable and robust operation under uninterrupted constraints.

[0053] 3. This invention constructs a system confidence repair mechanism. By monitoring manual overrun control signals, the manual control quantity and the mechanical feedback are weighted and spliced into a feedback state, from which the entropy value is recalculated; when convergence fails, the evaluation model is updated with a penalty term consisting of the product of the frequency of manual intervention and the data consistency anomaly factor. This approach compensates for the deficiency of existing logic in ignoring manual intervention, enabling the model to self-correct its evaluation weights during a crisis, thus improving its dynamic adaptability to complex on-site disturbances.

Attached Figure Description

[0054] The present invention will be further explained below with reference to the accompanying drawings and embodiments:

[0055] Figure 1 is a schematic diagram of the automatic control system for the preparation of environmentally friendly lightweight concrete for coastal wind power according to the present invention.

Detailed Implementation

[0056] To make the objectives, technical solutions, and advantages of this invention clearer, the invention will be further described in detail below with reference to specific embodiments.

[0057] As shown in Figure 1, the automatic control system for the preparation of environmentally friendly lightweight concrete for coastal wind power includes a multi-source heterogeneous data acquisition unit, which acquires, through a sensor group, multi-source sensor data streams containing the moisture content of lightweight solid waste raw materials, the operating status of the mixer, and the temperature and humidity of the coastal environment, and extracts the coastal environmental disturbance characteristics from the multi-source sensor data stream. The heterogeneous characteristics of the multi-source sensor data stream refer to the inconsistencies in the physical sampling dimensions, time sampling frequency, and spatial deployment location of the various sensors.

[0058] State assessment and risk quantification unit: Based on multi-source sensor data streams and coastal environmental disturbance characteristics, a control state assessment model based on neural networks is constructed; the state assessment entropy is calculated by the control state assessment model according to the time series.

[0059] Based on state assessment entropy, the safe time window between the current control system and the preset failure boundary of the cold joint in the pouring process is predicted. The failure boundary of the cold joint in the pouring process refers to the critical physical state in which the interface between the old and new concrete cannot be effectively fused due to control interruption or parameter deviation exceeding the preset physical tolerance limit, and a structural fault is about to occur. This state is mapped to the highest risk quantification benchmark set by the system in the automatic control system. The dynamic risk hedging decision unit compares the safe time window with the preset danger threshold.

[0060] If the safe time window is greater than or equal to the danger threshold, the nonlinear optimal control law is activated to generate the first control command for adjusting the water addition or stirring rate; if the safe time window is less than the danger threshold, the control degradation mode is triggered, and the preset suboptimal robust control law is activated to generate the second control command for adjusting the water addition or stirring rate. System confidence repair and closed-loop unit: sends the first or second control command to the field actuator, collects the feedback status of the field actuator, and recalculates the state evaluation entropy.

[0061] If the recalculated state evaluation entropy converges to the preset safe interval, the nonlinear optimal control law is restored; if the recalculated state evaluation entropy does not converge to the safe interval, the suboptimal robust control law is maintained and the control state evaluation model is updated.

[0062] This embodiment provides an automatic control mechanism for the preparation of environmentally friendly lightweight concrete for coastal wind power. Specifically, the mechanism is deployed on the preparation platform of the continuous casting project of offshore wind power foundation. The platform is located in a coastal environment with high salt spray, high humidity and strong electromagnetic interference. The system control objective is not the static optimization of a single mixing parameter, but based on the process constraint that continuous casting cannot be interrupted, to ensure the global robust operation of the system under high-risk conditions approaching the failure boundary, and to avoid cold joints in the casting due to control instability.

[0063] The specific processing procedure is as follows: In the hardware architecture and logical function mapping relationship of the system, the system includes a sensor group, an edge controller, a field actuator and a host monitoring terminal, and the multi-source heterogeneous data acquisition unit, the status assessment and risk quantification unit, the dynamic risk hedging decision unit and the system confidence repair and closed-loop unit are all integrated and deployed in the edge controller as core control logic.

[0064] The sensor array includes at least a raw material moisture content sensor, a silo weighing sensor, a mixer speed sensor, a motor current sensor, an ambient temperature sensor, an ambient humidity sensor, and a manual operation panel for collecting manual control commands. Simultaneously, when the system detects a manual overrun control signal, it records the frequency of manual intervention, extracts the control quantity contained in the signal, and weights and concatenates the control quantity with the mechanical execution feedback data generated by the field actuator to generate a comprehensive feedback state. The edge controller periodically receives data streams from multiple sensors and aligns data from different sampling periods to a unified time axis.

[0065] For example, with a control cycle of 10 seconds, at a certain moment the data collected shows a moisture content of 7.2%, a mixer speed of 38 rpm, a motor current of 21 amps, an ambient temperature of 29°C, and an ambient humidity of 94%. If the humidity sensor has a sampling cycle of 5 seconds and the motor current sensor has a sampling cycle of 1 second, then the 1-second data is first averaged using a window, and then the 5-second data is updated to maintain the current state vector.
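The alignment step in this example can be sketched as below. It assumes that the 1-second motor current readings are averaged over the 10-second control cycle and that the most recent 5-second humidity reading is carried into the state vector; the latter is one reading of "updated" and is an assumption. The function and key names are hypothetical.

```python
from statistics import mean

def align_to_control_cycle(motor_current_1s, humidity_5s):
    # Window-average the fast channel over the control cycle.
    current = mean(motor_current_1s)
    # Assumption: keep the latest reading of the slow channel.
    humidity = humidity_5s[-1]
    return {"motor_current": current, "humidity": humidity}

# Ten 1-second current samples and two 5-second humidity samples in one cycle.
state = align_to_control_cycle([20, 22, 21, 21, 21, 21, 21, 21, 21, 21], [93, 94])
```

Here `state` holds a motor current of 21 amps and a humidity of 94%, matching the values quoted in the example.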

[0066] Based on this, the system extracts coastal environmental disturbance features from the unified state vector; these disturbance features are not isolated environmental parameters, but a comprehensive measure used to characterize the degree of impact of abrupt changes in the coastal environment on control reliability.

[0067] Subsequently, the edge controller inputs the multi-source sensor data streams along with the disturbance feature into the control state assessment model. This model can be implemented using a multi-layer feedforward neural network or a neural network with a time memory structure. Its input is a time-series segment formed by splicing together several recent cycles, and its output is the state assessment entropy at the current moment. Before the system is deployed online, the control state assessment model is pre-trained in a supervised manner using an operational dataset containing historical fluctuation data from multi-source sensors, corresponding manual intervention records, and equipment instability shutdown logs to establish an initial mapping relationship between the input features and the output state assessment entropy. Here, the state assessment entropy is used to characterize the degree of confidence abnormality of the current automatic control command when there is inconsistency in multi-source inputs, increased external disturbances, and deviations in execution results. The larger this value, the closer the system is to the control instability zone.

[0068] The system predicts a safe time window based on the temporal trend of state assessment entropy changes. The safe time window, calculated from the deterioration rate of the current state assessment entropy, represents the remaining time before the preset failure boundary of the cold joint is reached. For example, if the critical entropy corresponding to the preset cold joint failure boundary is 0.92, the current state assessment entropy is 0.68, and the entropy growth rate over the last two cycles is 0.008 per minute, then the remaining time is approximately (0.92 - 0.68) / 0.008 = 30 minutes; the system uses this 30-minute period as the current safe time window.

[0069] The dynamic risk hedging decision unit further compares the safe time window with the danger threshold; the danger threshold can be preset to 20 minutes or 30 minutes based on the experience of continuous pouring process on site; if the safe time window is greater than or equal to the danger threshold, it means that the system still has room to continue to pursue multi-objective joint optimal control. At this time, the nonlinear optimal control law is activated and the first control command is output; the first control command is used to adjust the water addition or stirring rate, and two execution quantities can also be output at the same time.

[0070] If the safe time window is less than the danger threshold, it means that the system is rapidly approaching the risk of a pouring fault exceeding the allowable threshold. At this time, the search for the joint optimization objective within a time window shorter than the preset limit is abandoned. Instead, the control degradation mode is immediately triggered, the suboptimal robust control law is activated, and the second control command is output. The second control command is usually manifested as an adjustment trajectory with a lower change slope and action frequency than the first control command, in order to suppress command jitter and frequent oscillation of the actuator.

[0071] The system confidence repair and closed-loop unit receives the first control command or the second control command and sends it to the field actuators, such as the water valve driver and the stirring motor speed controller. After execution, the system collects the feedback status of the actuators, such as the actual valve opening, actual water volume, speed readback, current change, etc., and re-inputs them into the control status evaluation model to recalculate the status evaluation entropy.

[0072] If the entropy of the new round of state assessment enters the preset safe range, the nonlinear optimal control law is restored; if it still does not enter the safe range, the suboptimal robust control law is maintained, and the control state assessment model is updated according to the new feedback to adapt it to the current noise, time delay and human intervention conditions.

[0073] As an anomaly tolerance mechanism, when there is a missing sensor data packet, communication interruption, or obvious out-of-bounds value within a certain period, where obvious out-of-bounds value refers to the current value collected by the sensor exceeding the limit of the physical range of this type of sensor, or the rate of change of the value in adjacent periods exceeding the preset maximum slope of reasonable physical change, the edge controller first triggers the data quality flag bit; if the duration of the missing data packet does not exceed the preset duration, such as not exceeding two control periods, the most recent valid value is used for compensation, and the weight of this channel in the input of this period is reduced.

[0074] If the duration of the missing data exceeds the preset duration, the safety time window will be compressed directly, for example, by multiplying the original calculation result by a preset compression coefficient of 0.7 to make the decision logic more conservative. If the feedback status of the actuator is unavailable, the system will keep the current output unchanged for one cycle, and at the same time add a preset penalty to the state evaluation entropy to avoid continuing to output adjustment instructions with the change gradient exceeding the preset limit under the state of missing observation.
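The anomaly tolerance rules in paragraphs [0073] and [0074] can be sketched as follows. The reduced channel weight of 0.5 and the function names are hypothetical; the two-cycle hold limit and the 0.7 compression coefficient are the example values given in the text. Holding the last valid value even after the limit is exceeded is an assumption, since the source only specifies the window compression for that case.

```python
def is_out_of_bounds(value, previous, range_low, range_high, max_slope, dt):
    # Out of bounds if outside the sensor's physical range, or if the change
    # between adjacent periods is faster than the allowed maximum slope.
    outside_range = not (range_low <= value <= range_high)
    too_steep = abs(value - previous) / dt > max_slope
    return outside_range or too_steep

def handle_missing_channel(last_valid, missing_cycles, window_minutes,
                           max_hold_cycles=2, reduced_weight=0.5, compression=0.7):
    # Returns (value to use, channel weight, safe time window).
    if missing_cycles == 0:
        return last_valid, 1.0, window_minutes
    if missing_cycles <= max_hold_cycles:
        # Short gap: compensate with the last valid value at reduced weight.
        return last_valid, reduced_weight, window_minutes
    # Long gap: compress the safe time window to make decisions more conservative.
    return last_valid, reduced_weight, window_minutes * compression
```

A 30-minute window with a channel missing for three cycles is compressed to about 21 minutes under these values.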

[0075] For example, after the offshore wind turbine foundation was continuously poured for 6 hours at night, the sea fog intensified rapidly, and the ambient humidity jumped from 82% to 96% within 2 minutes, while the flow sensor still showed that the water supply was normal. After the system received this set of inconsistent data, the state assessment entropy increased from 0.51 to 0.73, and the predicted safe time window shortened from 58 minutes to 24 minutes.

[0076] Since the current value is still above the 20-minute threshold, the system temporarily maintains the nonlinear optimal control law, making only minor adjustments to the water supply. Five minutes later, the operator noticed frequent valve movements and initiated local manual pulse control, causing the state assessment entropy to rise further to 0.81, reducing the safe time window to 14 minutes. The system immediately entered a control degradation mode, reducing the frequency of valve opening and closing and adopting a conservative control method with a wider dead zone. This caused the state assessment entropy to drop back to 0.62 within the next 15 minutes, re-entering the safe zone before resuming normal optimized control.
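The mode switching in this scenario can be traced with a small sketch. The mode labels and function names are hypothetical, and the safe interval bounds of 0.0 and 0.65 are assumed purely for illustration, since the source does not give them. The 20-minute danger threshold and the entropy and window values are taken from the example.

```python
OPTIMAL = "nonlinear_optimal"
ROBUST = "suboptimal_robust"

def select_mode(safe_window_minutes, danger_threshold_minutes):
    # Window at or above the threshold: keep the optimal law; otherwise degrade.
    if safe_window_minutes >= danger_threshold_minutes:
        return OPTIMAL
    return ROBUST

def mode_after_feedback(recalculated_entropy, safe_lower, safe_upper):
    # Recovery requires the entropy to lie strictly inside the safe interval.
    if safe_lower < recalculated_entropy < safe_upper:
        return OPTIMAL
    return ROBUST

# Sea fog intensifies: window 24 min, still at or above the 20 min threshold.
first = select_mode(24, 20)
# Manual pulse control follows: window 14 min, so the system degrades.
second = select_mode(14, 20)
# Entropy falls back to 0.62, inside the assumed interval (0.0, 0.65).
third = mode_after_feedback(0.62, 0.0, 0.65)
```

The three results are the optimal law, the robust law, and the optimal law again, in that order.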

[0077] The purpose of this step is to establish a closed-loop control framework with the continuous controllability of the system as its core, so that the control logic can be transformed from simply pursuing the minimum error of water addition or rotation speed to a dynamic control system that simultaneously manages risk boundaries, command credibility and on-site executability, thereby achieving global robust operation in long-term continuous pouring scenarios.

[0078] Methods for extracting coastal environmental disturbance features from multi-source sensor data streams include: identifying temperature and humidity abrupt change signals in the multi-source sensor data stream; acquiring spatiotemporal misalignment data from heterogeneous sensors in the multi-source sensor data stream; and fusing the temperature and humidity abrupt change signals with the spatiotemporal misalignment data from heterogeneous sensors to generate coastal environmental disturbance features.

[0079] This embodiment provides a step for extracting coastal environmental disturbance features. Specifically, in the aforementioned overall control process, if environmental quantities such as temperature and humidity are used directly as ordinary inputs, it is difficult to highlight the significant interference features and negative impacts of sudden changes in coastal microclimate on the reliability of the control state. Especially under the sea fog reversal condition, environmental changes are not slow drifts, but cause asynchronous responses of multiple types of sensors in a short period of time. Therefore, it is necessary to extract and fuse the sudden changes and misalignment features synchronously to form specialized disturbance features.

[0080] The specific processing procedure is as follows: The system first identifies temperature and humidity change signals within a sliding time window; the most recent 6 control cycles can be used as a window to calculate the change amplitude and rate of change of the temperature sequence and humidity sequence respectively; with a specific quantitative deduction example, if the humidity sequence within the window is 80%, 81%, 82%, 90%, 94%, 95%, then the adjacent differences are 1, 1, 8, 4, 1;

[0081] If the preset humidity abrupt change threshold is 5, then the step from 82% to 90% is identified as an abrupt change point; similarly, if the temperature sequence is 27.5℃, 27.3℃, 27.2℃, 26.4℃, 26.1℃, and 26.0℃, then the step from 27.2℃ to 26.4℃ can be identified as a temperature abrupt change point. The system can further encode the amplitude, duration, and direction of each abrupt change into a set of disturbance components.
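The sliding-window check described above can be sketched as follows. This is only an illustration: the function name `find_abrupt_changes` is hypothetical, and the temperature threshold of 0.5 is an assumed value, since the text states a threshold only for humidity.

```python
from typing import List, Tuple


def find_abrupt_changes(window: List[float], threshold: float) -> List[Tuple[int, float]]:
    """Return (index, signed difference) for each adjacent step larger than threshold.

    Index i refers to the step from window[i] to window[i + 1].
    """
    changes = []
    for i in range(len(window) - 1):
        diff = window[i + 1] - window[i]
        if abs(diff) > threshold:
            # Rounding only removes floating-point noise from the reported value.
            changes.append((i, round(diff, 6)))
    return changes


humidity = [80, 81, 82, 90, 94, 95]                   # percent, from the example
temperature = [27.5, 27.3, 27.2, 26.4, 26.1, 26.0]    # degrees Celsius, from the example

humidity_changes = find_abrupt_changes(humidity, threshold=5)
temperature_changes = find_abrupt_changes(temperature, threshold=0.5)  # assumed threshold
```

With these inputs, both sequences yield a single abrupt change at the third step, matching the 82% to 90% and 27.2℃ to 26.4℃ transitions in the example.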

[0082] Next, the system extracts spatiotemporal misalignment data from heterogeneous sensors. Spatiotemporal misalignment here describes the phenomenon that, after the same environmental event occurs, different types of sensors respond at inconsistent times and report inconsistent readings at different locations. Time misalignment can be represented by the deviation between the peak occurrence times of the multi-channel data.

[0083] For example, the humidity change occurs in the 4th cycle, but the moisture content of the raw material only increases significantly in the 6th cycle, resulting in a time misalignment of 2 cycles. Spatial misalignment can be represented by the difference between sensors installed at different locations. For example, if the humidity sensor above the silo shows 95%, while the humidity sensor near the control cabinet shows only 86%, the location difference ratio can be calculated.

[0084] In an exemplary simulation, suppose the normalized values of three sensor nodes A, B, and C in the same period are 0.92, 0.61, and 0.89, respectively. If both the difference between A and B and the difference between B and C are greater than the preset difference threshold, the system can determine that the local environmental disturbance near B is stronger, thereby forming a spatial misalignment description quantity.
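A minimal sketch of the two misalignment measures, under stated assumptions: the helper names are hypothetical, the difference threshold of 0.2 is an assumed value, and a node is flagged only when it differs from every other node by more than the threshold.

```python
from typing import Dict, List


def time_misalignment(event_cycle_a: int, event_cycle_b: int) -> int:
    """Lag, in control cycles, between the responses of two channels to one event."""
    return abs(event_cycle_b - event_cycle_a)


def spatial_outliers(readings: Dict[str, float], diff_threshold: float) -> List[str]:
    """Nodes whose reading differs from every other node by more than diff_threshold."""
    outliers = []
    for name, value in readings.items():
        others = [v for n, v in readings.items() if n != name]
        if others and all(abs(value - v) > diff_threshold for v in others):
            outliers.append(name)
    return outliers


# Humidity changes in cycle 4, raw material moisture content responds in cycle 6.
lag = time_misalignment(4, 6)

nodes = {"A": 0.92, "B": 0.61, "C": 0.89}
flagged = spatial_outliers(nodes, diff_threshold=0.2)  # assumed threshold
```

Here `lag` is 2 cycles and only node B is flagged, consistent with the example in which the disturbance near B is judged stronger.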

[0085] Subsequently, the system fuses the temperature and humidity abrupt change signals with the spatiotemporal misalignment data from heterogeneous sensors. This fusion can be achieved using weighted concatenation, gated fusion, or feature mapping. In a simplified example, the temperature and humidity abrupt change vector can be denoted as [0.8, 0.6], and the spatiotemporal misalignment vector as [0.5, 0.7, 0.4]. After unified normalization, these vectors are concatenated into [0.8, 0.6, 0.5, 0.7, 0.4] and then compressed by a mapping network into a single disturbance feature value, such as 0.76. The higher this value, the more likely the current coastal environmental event is to cause inconsistencies between sensors and shifts in state judgment.

[0086] As an anomaly tolerance mechanism, if only slow changes in temperature and humidity are detected without obvious abrupt changes, the abrupt change component is output according to a preset low confidence interval, such as being limited to between 0 and 0.2.

[0087] If a single type of sensor hardware failure causes misalignment calculation distortion, the system will remove the faulty channel from the spatiotemporal misalignment calculation process through the device health identifier and recalculate it using the remaining normal channels. If the temperature and humidity sensor itself fails, the disturbance feature will not be directly set to zero, but will be estimated using the statistical value of the most recent stable window and the indirect offset of other channels to prevent the disturbance risk from being missed due to physical failure of the main sensor under extreme environmental conditions.

[0088] For example, in the aforementioned scenario of continuous nighttime pouring of offshore wind power foundations, after sea fog entered the material silo area, the reading of the humidity sensor above the silo jumped from 83% to 93% in the 120th cycle, while the reading of the humidity sensor near the control cabinet only increased from 79% to 84%; at the same time, the reading of the raw material moisture content sensor increased from 6.9% to 8.1% only after a further two cycles. The system therefore extracted the composite characteristics of a large humidity fluctuation amplitude, significant humidity differences between locations, and a lagging moisture content response, and mapped them to a high disturbance feature value. This makes the subsequent state assessment model more sensitive to the risk of inconsistency in the current observations.

[0089] The purpose of this step is to separate the environmental impact caused by coastal microclimate from the original environmental quantities and express it as a disturbance characterization that is closer to the control risk, so as to achieve early warning for subsequent state assessment and control downgrade decision-making.

[0090] The method of calculating the state assessment entropy through the control state assessment model includes: mapping the multi-source sensor data stream of this embodiment to a unified dimensionless feature space, generating feature vectors, obtaining a preset reference consistency vector, and calculating the feature vector numerical deviation rate in the dimensionless feature space based on the difference between the feature vector and the reference consistency vector.

[0091] When the feature vector numerical deviation rate is less than the preset deviation threshold, the data is determined to be consistent. After each feature value in the feature vector is converted into a weighted distribution, the information entropy of the feature vector in the dimensionless feature space is calculated as the basic entropy value, and the data consistency anomaly factor is set to zero. When the feature vector numerical deviation rate is greater than or equal to the deviation threshold, the information entropy of the feature vector in the dimensionless feature space is likewise calculated as the basic entropy value, and the ratio of the deviation rate to the deviation threshold is taken as the data consistency anomaly factor. The basic entropy value, the data consistency anomaly factor, and the quantized value obtained by quantifying the coastal environmental disturbance features are linearly weighted and summed to obtain the state assessment entropy. The state assessment entropy is used to characterize the degree of anomaly in the confidence of the control system commands.

[0092] This embodiment provides a calculation step for state assessment entropy. Specifically, although the coastal environmental disturbance characteristics have been extracted in the previous scheme, it is still impossible to distinguish between two situations: environmental deterioration with consistent sensor data and environmental deterioration with conflicting multi-source data. The former is usually still controllable, while the latter is more likely to induce abnormal command jumps. Therefore, this embodiment further introduces a unified dimensionless feature space, numerical deviation rate, and data consistency anomaly factor to characterize the impact of conflicting multi-source observations on control reliability.

[0093] The specific processing steps are as follows: The system first maps data from different sources to a unified dimensionless feature space. Before mapping, data such as raw material moisture content, stirring speed, motor current, valve opening, and ambient humidity have different dimensions and cannot be directly compared. During mapping, preset upper and lower limits can be used for linear normalization. For example, a moisture content of 0% to 12% is mapped to 0 to 1, a stirring speed of 0 to 60 rpm is mapped to 0 to 1, and an ambient humidity of 40% to 100% is mapped to 0 to 1. If, in a certain period, the collected moisture content is 8.4%, the stirring speed is 36 rpm, and the humidity is 94%, then their dimensionless values can be recorded as 0.70, 0.60, and 0.90, respectively. Multiple sensor channels together form a unified feature vector.
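The linear normalization above can be illustrated with a short sketch. The function name `normalize` is hypothetical. Clamping to the interval from 0 to 1 is an added assumption, while the fallback of 0.5 for identical limits follows the anomaly tolerance mechanism this embodiment describes for the normalization step.

```python
def normalize(value: float, lower: float, upper: float, fallback: float = 0.5) -> float:
    """Map value linearly from [lower, upper] to [0, 1]."""
    if upper == lower:
        # Degenerate limits: normalization is undefined, so use a fixed middle value.
        return fallback
    scaled = (value - lower) / (upper - lower)
    # Clamp out-of-range readings (an assumption, not stated in the text).
    return min(1.0, max(0.0, scaled))


feature_vector = [
    normalize(8.4, 0.0, 12.0),     # raw material moisture content, percent
    normalize(36.0, 0.0, 60.0),    # stirring speed, rpm
    normalize(94.0, 40.0, 100.0),  # ambient humidity, percent
]
```

The resulting vector is approximately [0.70, 0.60, 0.90], matching the worked values.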

[0094] Then, the feature vector numerical deviation rate is calculated; the deviation rate is used to characterize whether there is abnormal inconsistency between different channels; an exemplary calculation method is to determine the expected correlation of each channel under the current operating conditions based on the historical model. The system pre-trains a clustering model using the historical dataset of multi-source sensors under no abnormal conditions, extracts the cluster center feature vectors under each typical environmental condition, and stores them in the system configuration library as the reference consistency vectors under the corresponding operating conditions; during online evaluation, the corresponding reference consistency vectors are directly matched and retrieved based on the currently collected coastal environmental temperature and humidity.

[0095] For example, under high humidity conditions, there should be a relatively gradual change between the moisture content of raw materials and the water demand. If the dimensionless value of the ambient humidity is 0.90 in the same period, while the flow feedback remains at 0.35 and the moisture content suddenly rises to 0.83, it indicates that the observed combination deviates significantly from the historical correlation. For ease of explanation, assume that the reference consistency vector output by the model is [0.72, 0.58, 0.86] and the actual vector is [0.83, 0.35, 0.90]. Then the absolute values of the deviations in each dimension are 0.11, 0.23, and 0.04, respectively, and their average is 0.126. Dividing this by the reference mean of 0.72 gives a deviation rate of approximately 0.175. If the preset deviation threshold is 0.15, then the current data is judged to have a significant conflict.
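The deviation rate calculation can be sketched as below. The function name `deviation_rate` is hypothetical. Computed without intermediate truncation, the rate comes out at roughly 0.176, marginally above the 0.175 quoted in the text, which truncates the mean deviation to 0.126; the conclusion that the threshold of 0.15 is exceeded is unchanged.

```python
from typing import Sequence


def deviation_rate(actual: Sequence[float], reference: Sequence[float]) -> float:
    """Mean absolute per-channel deviation divided by the mean of the reference vector."""
    if len(actual) != len(reference) or not reference:
        raise ValueError("vectors must be non-empty and of equal length")
    mean_dev = sum(abs(a - r) for a, r in zip(actual, reference)) / len(reference)
    ref_mean = sum(reference) / len(reference)
    return mean_dev / ref_mean


reference = [0.72, 0.58, 0.86]   # reference consistency vector from the example
actual = [0.83, 0.35, 0.90]      # observed vector from the example

rate = deviation_rate(actual, reference)
in_conflict = rate >= 0.15       # preset deviation threshold from the example
```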

[0096] When the deviation rate is less than the preset threshold, the system determines that the data is consistent, calculates only the basic entropy value, and sets the data consistency anomaly factor to zero; the basic entropy value can be obtained based on the degree of dispersion of the dimensionless feature vector in the current window;

[0097] Specifically, to meet the requirement that the distribution used in the information entropy calculation be non-negative, the system calculates proportions directly from the aforementioned dimensionless feature values mapped to the interval between 0 and 1. Let $p_i$ be the proportion of the $i$-th feature value in the total sum of the dimensionless feature vector extracted in the current window. The basic entropy $H_b$ is calculated from the information entropy formula as follows:

[0098] $$H_b = -\sum_{i=1}^{n} p_i \log p_i$$

[0099] where $n$ is the number of feature dimensions. For example, if the variance of the mapping vector fluctuation in the last 4 periods is less than the preset threshold, and the feature weight distribution of each channel tends to be consistent, then the information entropy calculation result is low, and the basic entropy value is 0.32; since the deviation rate does not exceed the threshold, the data consistency anomaly factor is directly 0, indicating that the system has not encountered serious observation conflicts.
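A minimal sketch of the basic entropy calculation, assuming the standard Shannon form with the natural logarithm. The logarithm base and any further scaling are not stated in the text, so this sketch is not expected to reproduce example values such as 0.32; the function name `base_entropy` is hypothetical.

```python
import math
from typing import Sequence


def base_entropy(features: Sequence[float]) -> float:
    """Shannon entropy (natural log) of the proportions of non-negative feature values."""
    total = sum(features)
    if total <= 0:
        return 0.0  # no usable mass; treat as zero entropy (an assumption)
    proportions = [x / total for x in features]
    # Terms with zero proportion contribute nothing to the sum.
    return -sum(p * math.log(p) for p in proportions if p > 0)


uniform = base_entropy([0.5, 0.5, 0.5])   # equal proportions give the maximum, ln(3)
peaked = base_entropy([1.0, 0.0, 0.0])    # all mass in one channel gives zero
```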

[0100] When the deviation rate is greater than or equal to the preset threshold, the system determines that the data is contradictory. It first calculates the basic entropy value, then takes the ratio of the deviation rate to the threshold as the data consistency anomaly factor. Taking a deviation rate of 0.175 and a threshold of 0.15 as an example, this factor is $0.175 / 0.15 \approx 1.17$. The data consistency anomaly factor is used to amplify the risk of command failure caused by conflicts between observations from multiple sensors, thereby effectively distinguishing between data source conflicts and ordinary signal fluctuations in a single dimension.

[0101] The system inputs multi-source sensor data streams into the control state assessment model. The model's single-layer linear network computes a weighted linear sum of the basic entropy value, the data consistency anomaly factor, and the quantized value of the coastal environmental disturbance features to obtain the state assessment entropy. For example, if the basic entropy value is 0.41, the data consistency anomaly factor is 1.17, and the disturbance quantization value is 0.76, with weights of 0.4, 0.35, and 0.25 respectively, then the state assessment entropy $S$ can be denoted as:

[0102] $$S = 0.4 \times 0.41 + 0.35 \times 1.17 + 0.25 \times 0.76 \approx 0.76$$

[0103] This value is significantly higher than the 0.3 to 0.5 range under stable operating conditions, indicating that the reliability of the control system commands has entered a dangerous deterioration stage.
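The weighted sum can be checked with a short sketch. The function names are hypothetical; the weights, threshold, and inputs are taken from the worked example.

```python
from typing import Tuple


def consistency_anomaly_factor(rate: float, threshold: float) -> float:
    """Ratio of deviation rate to threshold when in conflict, otherwise zero."""
    return rate / threshold if rate >= threshold else 0.0


def state_assessment_entropy(
    base_entropy: float,
    anomaly_factor: float,
    disturbance: float,
    weights: Tuple[float, float, float] = (0.4, 0.35, 0.25),
) -> float:
    """Linear weighted sum of the three risk components."""
    w_base, w_factor, w_dist = weights
    return w_base * base_entropy + w_factor * anomaly_factor + w_dist * disturbance


factor = consistency_anomaly_factor(0.175, 0.15)     # about 1.17
entropy = state_assessment_entropy(0.41, 1.17, 0.76)  # about 0.76
```

The result of roughly 0.76 lies above the 0.3 to 0.5 range given for stable operating conditions.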

[0104] As an anomaly tolerance mechanism, if the upper and lower limits of a certain channel are the same during the normalization process, resulting in the inability to perform calculations, the system directly uses a fixed intermediate value of 0.5 to replace the dimensionless characteristic value of that channel and reduces its confidence weight in subsequent evaluations.

[0105] If the deviation rate is abnormally amplified but caused by a single damaged sensor, the system can first remove the channel based on the equipment health status and then recalculate the deviation rate; if it is still higher than the threshold after removal, the high-risk judgment is maintained; if it drops significantly after removal, the hardware failure event of the channel is recorded and an equipment maintenance work order is generated, thereby avoiding the accidental triggering of global control degradation due to the failure of a single sensor.

[0106] For example, in a scenario where continuous pouring continues for 7 hours, after the sudden intrusion of sea fog, the ambient humidity mapping value rapidly rises to 0.92, and the raw material moisture content mapping value rises to 0.81, but the flow feedback remains at 0.34, while the valve opening repeatedly fluctuates around 0.60. The system maps these quantities to a unified space and calculates the deviation rate, which is higher than the threshold. Therefore, it not only generates a high basic entropy value but also generates a data consistency anomaly factor greater than 1. The two, together with the disturbance characteristics, cause the state assessment entropy to jump from 0.49 to 0.78. The on-site monitoring terminal simultaneously displays the increase in this value, indicating that the reliability of the current control command has significantly decreased.

[0107] The purpose of this step is to stratify and quantify ordinary fluctuations, environmental disturbances, and multi-source conflicts, thereby avoiding the use of a single error index for control judgment and achieving a more detailed expression of the degree of anomaly in control reliability.

[0108] Based on the state assessment entropy, the method for predicting the safe time window between the current control system and the preset failure boundary of the cold joint includes: obtaining the system operation history log; and extracting, from the system operation history log, the time series of critical entropy values corresponding to the failure boundary of the cold joint.

[0109] The difference between the current state assessment entropy and the previous state assessment entropy recorded in the system operation history log is calculated and divided by the sampling time interval corresponding to the multi-source sensor data stream to obtain the entropy deterioration rate. When the entropy deterioration rate is greater than the preset small fluctuation threshold, the difference between the corresponding critical entropy value in the critical entropy value time series and the current state assessment entropy is divided by the entropy deterioration rate to calculate the remaining time before the failure boundary of the cold joint is reached, which is used as the safe time window.

[0110] This embodiment provides a prediction step for a safe time window. Specifically, the state assessment entropy can be obtained in the previous scheme. If only the current state assessment entropy value is output, it is difficult to intuitively represent the remaining available time of the system before the failure boundary of the cold joint. Especially in continuous pouring tasks, the evaluation weight of the time dimension is higher than that of static indicators. Therefore, this embodiment constructs a time series of critical entropy values from historical logs and converts the current deterioration rate into the remaining controllable time.

[0111] The specific processing procedure is as follows: The system first loads the preset operation history log. For multiple past continuous pouring tasks, the log contains at least the control cycle number, the state assessment entropy, the manual intervention record, the actuator feedback, and whether the cold joint risk boundary was approached or reached. The log can be stored on the local industrial server or periodically distributed by the upper monitoring platform. Based on these past samples, the system extracts the time series of critical entropy values corresponding to the cold joint failure boundary.

[0112] The time series here is not a single fixed value, but takes into account the different tolerances for control continuity at different pouring stages; for example, the critical entropy values can differ between the early stage of pouring, the middle stage of stabilization, and the transition period of batch change. For simplification, assuming that the critical entropy values in a certain section are 0.90, 0.91, 0.92, and 0.92 respectively, the system selects the critical value according to the corresponding position in the current cycle.

[0113] The system calculates the difference between the current state assessment entropy and the previous state assessment entropy, divides it by the sampling time interval corresponding to the multi-source sensor data stream, and obtains the entropy deterioration rate. For example, if the entropy value in the previous period was 0.70, the current value is 0.76, and the sampling interval is 5 minutes, then the deterioration rate is $(0.76 - 0.70) / 5 = 0.012$ per minute. If the critical entropy value corresponding to the current section is 0.92, then the remaining tolerable entropy difference is 0.16; the system then divides 0.16 by 0.012 to get approximately 13.3 minutes, which is the current safe time window.

[0114] If the current entropy value does not deteriorate, but remains flat or decreases, the system can treat the deterioration rate as a preset positive lower limit threshold approaching zero or enter a slow-release state. For example, if the current entropy value decreases from 0.76 to 0.72, the difference is negative. In this case, direct division would yield meaningless results, so the system can change the judgment to a temporary extension of the safe time window and output a mark greater than the preset upper limit, such as 90 minutes or more. If the deterioration rate is less than the preset small fluctuation threshold, such as below 0.001, it will not be treated as infinitely large, but will be uniformly truncated to the upper limit of engineering visualization, such as 120 minutes, to avoid abnormal interface display.
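The window calculation and its edge cases can be sketched as follows. This is a simplification under stated assumptions: the function name is hypothetical, a flat or falling entropy and a rate below the small fluctuation threshold are both mapped to the same 120-minute display cap, and the sampling interval is assumed to be positive.

```python
def safe_time_window(
    current: float,
    previous: float,
    critical: float,
    interval_min: float,
    min_rate: float = 0.001,   # small fluctuation threshold from the example
    cap_min: float = 120.0,    # display upper limit from the example
) -> float:
    """Estimated minutes remaining before the critical entropy value is reached."""
    rate = (current - previous) / interval_min  # entropy deterioration rate per minute
    if rate <= min_rate:
        # Not deteriorating, or deteriorating too slowly to divide by safely.
        return cap_min
    remaining = max(0.0, critical - current)
    return min(cap_min, remaining / rate)


window = safe_time_window(current=0.76, previous=0.70, critical=0.92, interval_min=5.0)
improving = safe_time_window(current=0.72, previous=0.76, critical=0.92, interval_min=5.0)
```

The first call gives about 13.3 minutes, and the second returns the cap because the entropy is falling.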

[0115] As a fault-tolerance mechanism, if the historical log lacks the critical entropy value time series for the corresponding work section, the average value of adjacent work sections or the global default critical value is used instead; if the current period sampling interval is abnormal, such as communication delay causing the period to lengthen, the actual timestamp difference is used instead of the fixed period value to calculate the rate; if the rate fluctuation amplitude of multiple consecutive periods exceeds the preset variance threshold, the system can introduce a short window average rate, such as the average of the most recent 3 periods, to prevent frequent jumps in the safety time window due to instantaneous jitter.

[0116] For example, during the continuous pouring of offshore wind turbine foundations, the system detected a state assessment entropy of 0.68 in the 150th cycle, which rose to 0.74 in the 151st cycle, with a 5-minute interval between the two. Since the current phase is a transitional phase for batch change, the critical entropy value corresponding to this phase in the historical log is 0.90. Based on this, the system calculates a deterioration rate of 0.012 per minute and a safe time window of approximately 13.3 minutes. The monitoring terminal then displays the remaining safe time of 13 minutes as a countdown, and the control logic simultaneously shifts from aggressive optimization to risk priority, giving on-site monitoring personnel an intuitive time warning of the approaching risk of a break in continuous pouring.

[0117] The purpose of this step is to transform the abstract entropy risk into a time window indicator that engineers can understand and act upon, thereby achieving a unified scale expression between control algorithms and on-site decision-making.

[0118] The method for activating the nonlinear optimal control law to generate the first control command for adjusting the water addition or stirring rate includes: loading a preset multi-agent deep reinforcement learning model; inputting multi-source sensor data streams into the multi-agent deep reinforcement learning model; generating a nonlinear optimal solution based on the multi-agent deep reinforcement learning model with the joint optimization objectives of minimizing the moisture content fluctuation compensation delay time and minimizing the energy consumption of the field actuator, wherein the moisture content fluctuation compensation delay time is calculated based on the multi-source sensor data stream; and converting the nonlinear optimal solution into the first control command.

[0119] This embodiment provides a step for generating a nonlinear optimal control law. Specifically, in the previous layer scheme, it can be determined whether the current safe time window is sufficient to support better control. When the system is still in the controllable range, if an overly conservative fixed control method is continued, the fluctuation of raw material properties cannot be compensated in time, affecting the stability of continuous casting. Therefore, this embodiment introduces a multi-agent deep reinforcement learning model to pursue higher dynamic adjustment performance while ensuring that the risk does not exceed the limit.

[0120] The specific processing steps are as follows: The system preloads a trained multi-agent deep reinforcement learning model; this model can contain multiple cooperative agents, one of which is responsible for the water addition adjustment strategy, another for the stirring rate adjustment strategy, and an auxiliary agent can be set to coordinate energy consumption and execution smoothness; each agent receives a common environmental state and also receives its own relevant local observations; for example, the water addition agent pays more attention to the raw material moisture content, flow feedback and valve opening, while the speed agent pays more attention to the motor current, current speed and mixing load changes;

[0121] The moisture content fluctuation compensation delay time can be calculated from the multi-source sensor data stream. A simplified method is as follows: when the raw material moisture content increases or decreases significantly at a certain moment, the system records the time elapsed from that moment until the field actuator adjusts its output to bring the critical state back into the steady-state threshold. For example, if the moisture content is detected to increase from 7.0% to 8.3% in the 200th cycle, and the combined effect of water addition and rotation speed causes the overall state to return to the steady-state zone after the 203rd cycle, then the compensation delay time is 3 cycles.
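The simplified delay measurement can be sketched as follows. The function name, the per-cycle deviation trace, and the steady-state threshold of 0.2 are hypothetical values chosen to mirror the example of a disturbance in cycle 200 and recovery by cycle 203.

```python
from typing import Mapping, Optional


def compensation_delay(
    deviation_by_cycle: Mapping[int, float],
    disturbance_cycle: int,
    steady_threshold: float,
) -> Optional[int]:
    """Cycles from the disturbance until the state deviation re-enters the steady band.

    Returns None if the state has not recovered within the recorded cycles.
    """
    later_cycles = sorted(c for c in deviation_by_cycle if c > disturbance_cycle)
    for cycle in later_cycles:
        if abs(deviation_by_cycle[cycle]) <= steady_threshold:
            return cycle - disturbance_cycle
    return None


# Hypothetical deviation of the overall state from its setpoint, per cycle.
trace = {200: 1.3, 201: 0.9, 202: 0.5, 203: 0.15, 204: 0.10}
delay = compensation_delay(trace, disturbance_cycle=200, steady_threshold=0.2)
```

With this trace, `delay` is 3 cycles.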

[0122] In the training and inference of the multi-agent deep reinforcement learning model, the reward function is defined as follows:

[0123] $$R = -\left(\lambda_1 T_d + \lambda_2 E_a\right)$$

[0124] where $T_d$ denotes the moisture content fluctuation compensation delay time, $E_a$ denotes the energy consumption of the actuator, and $\lambda_1$ and $\lambda_2$ are preset weighting coefficients. This joint optimization objective guides each agent to minimize the compensation delay time while ensuring controllability, and at the same time to minimize the frequency of pump and valve actions and the motor energy consumption.
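One plausible form consistent with this description is a negative weighted sum of the two costs, so that shorter delays and lower energy use both increase the reward. The sketch below assumes that form; the function name, the weight values 0.7 and 0.3, and the energy units are illustrative assumptions rather than values from the text.

```python
def reward(delay_cycles: float, energy: float,
           w_delay: float = 0.7, w_energy: float = 0.3) -> float:
    """Negative weighted cost: higher reward means faster compensation and less energy."""
    return -(w_delay * delay_cycles + w_energy * energy)


slow = reward(delay_cycles=4, energy=1.0)   # previous batch: 4-cycle delay
fast = reward(delay_cycles=2, energy=1.1)   # current batch: 2-cycle delay, slightly more energy
```

Under these assumed weights, the faster response earns the higher reward even with a small increase in energy use, which is the trade-off the joint objective is meant to capture.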

[0125] During the model inference phase, the system inputs the current multi-source sensor data stream into each agent to obtain their respective candidate actions. For example, if the action command output by the water addition agent in the current state is to increase the valve opening by 4%, and the action command output by the speed agent is to increase the speed by 2 revolutions per minute, then the coordination logic built into the multi-agent deep reinforcement learning model synthesizes the actions according to the joint optimization objective to obtain a nonlinear optimal solution. This solution is not a single fixed formula, but a combination of actions that depends on the current environmental state, the risk margin, and the load of each actuator. The system converts this nonlinear optimal solution into a first control command, such as adjusting the target valve opening from 42% to 46% and adjusting the mixer speed from 36 revolutions per minute to 38 revolutions per minute.

[0126] As a fault-tolerance mechanism, if there is an execution conflict in the current output of the model, such as the water addition agent suggesting a control increment that exceeds the preset limit while the speed agent simultaneously suggests a reduction that exceeds the preset limit, which may lead to further instability of the on-site state, the coordination logic performs secondary trimming of the actions according to the preset safety rules.

[0127] If an input channel of a certain agent fails, the output of that agent will be replaced by the most recent stable value or the degenerate control value. However, as long as the safe time window is still greater than the danger threshold, the system can still maintain the optimal control mode within the available information range. If the model output action exceeds the physical limit of the actuator, it will be forcibly projected to the allowable range, such as limiting the valve opening to 0% to 100% and the speed to the rated range of the equipment.
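The trimming and projection steps can be sketched as two successive clamps. The function names and the per-command step limits (5% for the valve, 3 rpm for the mixer) are assumptions, and the 0 to 60 rpm range is borrowed from the normalization example rather than stated as the rated range.

```python
def clamp(value: float, lower: float, upper: float) -> float:
    """Restrict value to the closed interval [lower, upper]."""
    return max(lower, min(upper, value))


def project_command(current: float, delta: float, max_step: float,
                    lower: float, upper: float) -> float:
    """Trim the suggested increment, then project the target onto the physical range."""
    step = clamp(delta, -max_step, max_step)     # secondary trimming of the action
    return clamp(current + step, lower, upper)   # projection to actuator limits


valve_target = project_command(42.0, 4.0, max_step=5.0, lower=0.0, upper=100.0)
speed_target = project_command(36.0, 2.0, max_step=3.0, lower=0.0, upper=60.0)
near_limit = project_command(98.0, 4.0, max_step=5.0, lower=0.0, upper=100.0)
```

The first two calls reproduce the 42% to 46% and 36 rpm to 38 rpm adjustments, and the third shows a command being held at the 100% valve limit.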

[0128] For example, in the aforementioned continuous pouring task, the system detected that the moisture content of a batch of environmentally friendly lightweight solid waste raw materials fluctuated significantly within 10 minutes after feeding, but the state assessment entropy remained at around 0.55, and the safe time window was greater than 40 minutes. At this time, the system continued to use the optimal control mode. The water addition agent suggested increasing the valve opening by 3% based on the moisture content change trend, and the speed agent suggested increasing the mixer speed by 1.5 rpm. After execution, the moisture content fluctuation compensation delay time was shortened from 4 cycles in the previous batch to 2 cycles, and the average current of the motor increased only slightly, indicating that the system could balance response speed and execution energy consumption before triggering a risk crisis.

[0129] The purpose of this step is to fully utilize data-driven adaptive optimization capabilities within the limits of risk tolerance, thereby achieving rapid compensation for raw material fluctuations and coordinated control of execution resources.

[0130] Triggering the control degradation mode and activating the preset suboptimal robust control law to generate a second control command for adjusting the water addition or stirring rate includes: freezing the adaptive weights of the multi-agent deep reinforcement learning model; and extracting the conservative dead zone control parameters corresponding to the control degradation mode from the preset system configuration library.

[0131] Based on conservative dead-zone control parameters, the joint optimization objective is abandoned, and a suboptimal robust control law with minimizing state evaluation entropy as the single objective is constructed. The suboptimal robust control law is executed to generate a second control command. The second control command reduces the response frequency to fluctuations in multi-source sensor data streams.

[0132] This embodiment provides a suboptimal robust control step in a control degradation mode. Specifically, the previous-level scheme has good adaptive performance when the risk margin is sufficient. However, when sea fog reversal, strong sensor noise, and frequent manual intervention occur, continuing to rely on the highly sensitive nonlinear optimal solution may cause frequent oscillations of the valve and motor, thereby disrupting the closed-loop continuity. Therefore, this embodiment pauses the optimization calculation when the safe time window narrows to the dangerous range and automatically switches to the suboptimal robust control mode with higher priority for system stability constraints.

[0133] The specific processing procedure is as follows: Once the system determines that the safe time window is less than the danger threshold, it immediately freezes the adaptive weights of the multi-agent deep reinforcement learning model; freezing here means stopping online adaptive updates and locking the current inference weights to prevent the model from continuing to deviate under unreliable data; subsequently, the system extracts conservative dead zone control parameters that match the current degradation level from the preset configuration library; the configuration library can be divided into multiple levels according to the degree of risk, such as mild crisis, moderate crisis, and severe crisis, which correspond to different dead zone widths, minimum action intervals, and maximum adjustment slopes, respectively;

[0134] To illustrate with a simplified example, if the valve opening response threshold to moisture content deviation is ±0.2% in normal mode and ±0.6% in degraded mode, then in degraded mode the system acts only when the deviation exceeds the wider conservative dead zone rather than the normal mode threshold. Similarly, while normal mode allows the valve opening to be updated once per cycle, degraded mode allows an update only once every three cycles, with each opening change not exceeding 2%. Likewise, a lower adjustment frequency and a smoother rate of change can be set for the stirring rate; thus, even if the sensor input fluctuates significantly in the short term, the control command will not immediately reverse.
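The dead zone, minimum action interval, and step limit can be combined in a small gate function. This is a sketch with hypothetical names; the degraded mode values follow the example, while the normal mode step limit of 4% is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeadZoneParams:
    dead_zone: float     # moisture content deviation below which no action is taken
    min_interval: int    # minimum cycles between valve updates
    max_step: float      # largest allowed change in valve opening per update


NORMAL = DeadZoneParams(dead_zone=0.2, min_interval=1, max_step=4.0)   # max_step assumed
DEGRADED = DeadZoneParams(dead_zone=0.6, min_interval=3, max_step=2.0)


def valve_step(deviation: float, cycles_since_action: int,
               requested_step: float, params: DeadZoneParams) -> float:
    """Return the permitted valve opening change, or 0.0 if the action is suppressed."""
    if abs(deviation) <= params.dead_zone:
        return 0.0   # inside the dead zone: ignore the fluctuation
    if cycles_since_action < params.min_interval:
        return 0.0   # too soon after the previous action
    return max(-params.max_step, min(params.max_step, requested_step))
```

For a deviation of 0.4%, `valve_step(0.4, 3, 3.0, NORMAL)` passes the request through, whereas the same call with `DEGRADED` returns 0.0 because the deviation sits inside the wider dead zone.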

[0135] Regarding the control objective, the system abandons the joint optimization of moisture content fluctuation compensation delay and energy consumption, and instead constructs a suboptimal robust control law with minimizing state assessment entropy as the single objective. This control law takes suppressing uncertainty propagation and restoring command stability as its primary objectives, while reducing the weight of instantaneous adjustment accuracy. When generating the second control command, it can make only conservative unidirectional corrections based on the current deviation direction. For example, when the overall status indicates that the raw material is too wet, the system does not continuously fine-tune the valve. Instead, after confirming that the material has remained too wet for multiple cycles, it reduces the water addition by one preset fine-tuning step and then maintains an observation period.

[0136] As a fault-tolerance mechanism, if the system detects that the state evaluation entropy is still rising rapidly after entering the degraded mode, it can switch to a higher level of conservative parameters, such as expanding the dead zone and extending the action interval; if an actuator in the degraded mode is close to its physical limit, the action of that actuator will be constrained first, and the adjustment task will be allocated more to another actuator; if the corresponding level of parameters is missing in the configuration library, the most conservative default parameter set will be used to ensure that the system prioritizes maintaining stability rather than continuing to test.

[0137] For example, in the aforementioned scenario of sudden sea fog at night, the safe time window was further shortened from 24 minutes to 14 minutes, below the 20-minute threshold. The system immediately froze the online update of the reinforcement learning model, invoked the parameters of the moderate crisis level, expanded the valve action dead zone from ±0.2% to ±0.7%, changed the action interval from 1 cycle to 3 cycles, and limited the change in rotational speed to within 1 revolution/minute. Although the compensation speed for raw material fluctuations decreased, the valve no longer oscillated in every cycle, the actuator action stabilized, and the frequency of manual intervention decreased. This indicates that control degradation effectively suppressed the expansion of system risk.

[0138] The purpose of this step is to proactively abandon local optima under high-risk conditions in exchange for the sustainable operation of the closed-loop system and the interpretability of instructions, thereby suppressing the probability of global collapse.

[0139] The method for collecting the feedback status of the field actuator and recalculating the state assessment entropy includes: monitoring whether a manual overrun control signal is generated in the field actuator; when no manual overrun control signal is detected, the frequency of manual intervention is recorded as zero, and the mechanical execution feedback data generated by the field actuator is directly collected as the feedback status; when a manual overrun control signal is detected, the frequency of manual intervention is recorded, the control quantity contained in the manual overrun control signal is extracted, and the control quantity is weighted and concatenated with the mechanical execution feedback data generated by the field actuator to generate the feedback status; the feedback status is input into the control state assessment model, and the updated state assessment entropy is recalculated.

[0140] This embodiment provides a feedback state reconstruction step that accounts for manual overrun factors. Specifically, although command jumps have been reduced in the aforementioned degraded mode, manual jog control, manual rewriting of the valve opening, or forced holding of a certain speed still often occur on engineering sites.

[0141] If this human factor is ignored in the feedback state, the control system may misjudge a state change caused by human intervention as a response deviation of the actuator itself, thereby incorrectly amplifying the model deviation. Therefore, this embodiment incorporates the manual overrun control signal into the feedback reconstruction process.

[0142] The specific processing procedure is as follows: The system continuously monitors the control source identifier of the field actuator. This identifier can be obtained from the programmable logic controller mode bit, the manual/automatic switching relay status, or the operator console button log. When no manual overrun control signal is detected, the system records the frequency of manual intervention as zero and directly collects the mechanical execution feedback data as the feedback status, for example by reading the actual valve opening, cumulative water volume, actual motor speed, and current value, and splicing them into a feedback vector.

[0143] When a manual overrun control signal is detected, the system records the frequency of manual intervention and its control quantity. The frequency of manual intervention can be counted within a sliding window. For example, if there are 3 manual jogs in the last 10 minutes, the frequency is 3. The control quantity includes manually set valve opening changes, forced speed setpoints, or start / stop commands.
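The sliding-window count can be sketched as below. The class name `InterventionCounter` is hypothetical, timestamps are assumed to be in seconds and to arrive in non-decreasing order, and the 600-second default corresponds to the 10-minute window in the example.

```python
from collections import deque


class InterventionCounter:
    """Counts manual interventions inside a sliding time window."""

    def __init__(self, window_s=600.0):
        self.window_s = window_s
        self._events = deque()  # timestamps of manual interventions, oldest first

    def record(self, timestamp):
        """Register one manual intervention at the given time (seconds)."""
        self._events.append(timestamp)

    def frequency(self, now):
        """Drop events older than the window and return the remaining count."""
        while self._events and now - self._events[0] > self.window_s:
            self._events.popleft()
        return len(self._events)
```

When no intervention has been recorded, `frequency` returns zero, which matches the rule that the frequency is recorded as zero in the absence of a manual overrun control signal.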

[0144] The system weights and concatenates the control quantity with the mechanical execution feedback data to generate a new feedback state. The weighting reflects how much of the actual execution result comes from human intervention. For example, if the mechanical feedback shows that the valve's actual opening is 48% in a certain cycle, and the operator manually changes the target opening to 52% in that cycle, and if the human priority weight is set to 0.6 and the mechanical feedback weight is set to 0.4, then the concatenated result will be a comprehensive feedback quantity. This amount is closer to the actual control intent on site.
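One possible reading of the weighting is a convex combination of the manual target and the mechanical feedback. The sketch below assumes that reading; the function name `fuse_feedback` is hypothetical, and the text could also be read as concatenating the two weighted values into a vector rather than summing them.

```python
def fuse_feedback(manual_target, mechanical_feedback, manual_weight=0.6):
    """Blend the manual setpoint and the measured value into one quantity.

    Assumes the two weights sum to 1, as in the 0.6 / 0.4 example.
    """
    mechanical_weight = 1.0 - manual_weight
    return manual_weight * manual_target + mechanical_weight * mechanical_feedback
```

Under this assumption the example values (manual target 52%, measured opening 48%) give 0.6 × 52 + 0.4 × 48 = 50.4%, which lies closer to the operator's target than to the measured opening.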

[0145] The reconstructed feedback state is then input into the control state assessment model to recalculate the updated state assessment entropy. In this way, the state assessment model not only perceives the equipment execution results but also the human-machine hybrid control state. If human intervention is frequent, even if the mechanical feedback itself is stable, the model will include this phenomenon of closed-loop disruption in the risk assessment.

[0146] As a fault-tolerance mechanism, if the manual overrun control signal carries only a mode switching indicator without a precise control quantity, the system can embed it into the feedback state as a discrete event; for example, 1 indicates that manual takeover occurred in this cycle, and 0 indicates that it did not. If the manual control quantity and the mechanical feedback timestamps are not synchronized, they are spliced together using nearest-neighbor time alignment. If the manual signal is missing but the mechanical feedback deviates significantly from the automatic command, the system marks the cycle as suspected manual intervention and includes it in subsequent recalculations with a lower confidence level, so that human influence is not missed because of log lag or omission.
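Nearest-neighbor time alignment can be sketched as follows, assuming the mechanical feedback is held as a non-empty list of samples with ascending timestamps. The function name `nearest_sample` is hypothetical, and resolving ties toward the earlier sample is an arbitrary choice made for this illustration.

```python
import bisect


def nearest_sample(timestamps, values, t):
    """Return the value whose timestamp is closest to t.

    timestamps must be sorted ascending and non-empty; ties go to the
    earlier sample.
    """
    i = bisect.bisect_left(timestamps, t)
    if i == 0:
        return values[0]          # t is before the first sample
    if i == len(timestamps):
        return values[-1]         # t is after the last sample
    before, after = timestamps[i - 1], timestamps[i]
    return values[i] if after - t < t - before else values[i - 1]
```

A manual setpoint stamped at time `t` can then be paired with `nearest_sample(feedback_times, feedback_values, t)` before the two are weighted and spliced.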

[0147] For example, during the 8th hour of continuous pouring, the operator observed frequent valve movements and manually increased the valve opening twice via the on-site control panel, each time for approximately 20 seconds. After detecting the change in manual / automatic switching, the system recorded the frequency of manual intervention in the last 10 minutes as 2, and weighted and fused the manually set opening value with the valve position feedback. After the updated feedback state was input into the evaluation model, the state evaluation entropy did not decrease as the result of pure mechanical feedback calculation, but remained at a high level. This result is consistent with the actual situation on site, because human-machine hybrid control means that closed-loop consistency has not yet been truly restored.

[0148] The purpose of this step is to explicitly incorporate the disruption to control continuity caused by human intervention into the feedback assessment, thereby achieving a complete characterization of the actual field control state and avoiding the system's misjudgment that it has returned to stability.

[0149] If the recalculated state evaluation entropy converges to the preset safe interval, the methods to restore the nonlinear optimal control law include: comparing the updated state evaluation entropy with the upper and lower limits of the preset safe interval; if the updated state evaluation entropy is greater than the lower limit of the preset safe interval but less than the upper limit of the preset safe interval, then the system confidence is determined to be restored; unfreezing the adaptive weights of the multi-agent deep reinforcement learning model; reactivating the joint optimization objective and restoring the output of the first control command.

[0150] This embodiment provides a switching step from a degraded mode to an optimal control mode. Specifically, in the previous scheme, the system is already able to sense the real feedback state after human intervention. However, if the system immediately exits the conservative mode just because the risk has slightly decreased in a certain period, it is easy to cause repeated jitters of instability after recovery. Therefore, this embodiment sets a safe range and uses the reconvergence of the state evaluation entropy as the recovery condition.

[0151] The specific processing procedure is as follows: The system first compares the updated state assessment entropy with the upper and lower limits of the preset safety interval. The safety interval can be set based on historical statistics and field experience, for example, the lower limit is 0.35 and the upper limit is 0.65. Setting upper and lower limits instead of a single threshold is to avoid frequent switching when the system is too close to the boundary. If the updated state assessment entropy is greater than the lower limit and less than the upper limit, the system confidence is determined to have recovered. If necessary, a continuous period condition can be added, for example, recovery is only confirmed after three consecutive periods fall within the safety interval, in order to further enhance stability.

[0152] After confirming recovery, the system unfreezes the adaptive weights of the multi-agent deep reinforcement learning model and reactivates the previous joint optimization objective. This unfreezing can be done in stages, for example, first restoring the inference output and then restoring online fine-tuning within a subsequent stable window; or it can be unfrozen all at once, but with a smaller learning rate. Afterward, the system resumes outputting the first control command, so that the control logic once again pursues a balance between the speed of moisture content fluctuation compensation and the execution energy consumption.

[0153] To illustrate with an example, if the state evaluation entropy for four consecutive cycles in degraded mode is 0.72, 0.66, 0.61, and 0.58 respectively, and the safe interval is..., the system has entered the safe interval starting from the third cycle. If the recovery condition is set to two consecutive cycles within the interval, recovery can be determined in the fourth cycle. The system then removes the conservative dead zone parameter and restores the continuously adjusted control quantity output by the nonlinear optimal control law given by the reinforcement learning model.

[0154] As a fault-tolerance mechanism, if the state assessment entropy is exactly equal to the upper or lower limit of the safe range, recovery is not immediately determined, but the current mode is maintained for an additional observation period; if the entropy value enters the safe range, but the frequency of human intervention is still higher than the preset upper limit, the system can delay unfreezing and wait for the human-machine control to be unified again; if the entropy value rises significantly again in the first period after recovery, the system immediately reverts to the conservative mode to prevent secondary instability after a brief recovery.
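The recovery rule can be sketched as a check for a run of consecutive cycles strictly inside the safe interval. The function name `cycles_until_recovery` is hypothetical, and the interval (0.35, 0.65) used in the usage note is taken from the earlier example limits as an assumption; the delay on high intervention frequency and the post-recovery fallback are not modelled.

```python
def cycles_until_recovery(entropies, lower, upper, required):
    """Return the 1-based cycle at which recovery is confirmed, or None.

    A cycle counts only if the entropy is strictly inside (lower, upper);
    a value exactly on a boundary resets the streak.
    """
    streak = 0
    for cycle, entropy in enumerate(entropies, start=1):
        streak = streak + 1 if lower < entropy < upper else 0
        if streak >= required:
            return cycle
    return None
```

For the sequence 0.72, 0.66, 0.61, 0.58 with limits 0.35 and 0.65 and two required cycles, the function returns 4; with three required cycles it returns `None`, so the system would remain in degraded mode.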

[0155] For example, within 20 minutes after the sea fog subsided, the frequency of valve operation decreased, the operator stopped manual jogging, and the updated state assessment entropy gradually decreased from 0.79 to 0.63, 0.60, and 0.57. Since it fell within the safe range for several consecutive cycles, the system determined that the confidence level had recovered. Therefore, the controller canceled the conservative dead zone constraint, unfroze the reinforcement learning model, and re-output more detailed instructions for the coordinated adjustment of water supply and rotation speed. On-site observation showed that the water supply compensation speed recovered, but the previous high-frequency oscillations no longer occurred.

[0156] The purpose of this step is to provide clear and verifiable recovery conditions for the system to exit conservative control, thereby achieving a smooth transition from survival-first to performance-first priorities.

[0157] If the recalculated state evaluation entropy does not converge to the safe interval, the methods for maintaining the suboptimal robust control law and updating the control state evaluation model include: if the updated state evaluation entropy is greater than or equal to the upper limit of the preset safe interval, or less than or equal to the lower limit of the preset safe interval, then the abnormal state of system confidence is determined to continue; maintain the output of the second control command; use the product of the frequency of manual intervention and the data consistency anomaly factor as a penalty term to construct a penalty function; use the penalty function to backpropagate and correct the evaluation weights of the control state evaluation model to complete the update of the control state evaluation model;

[0158] This embodiment provides a model update step during a crisis. Specifically, in the previous solution, the system can restore normal control when the risk decreases. However, if the risk does not return to the safe range, simply maintaining a conservative output is not enough, because the original state assessment model may no longer be suitable for the current noise, latency, and human intervention conditions. Therefore, this embodiment introduces a penalty function that includes the frequency of human intervention and the intensity of abnormal data consistency during a crisis to correct the weights of the assessment model.

[0159] The specific processing procedure is as follows: The system first determines whether the updated state assessment entropy is still outside the safe range; if it is greater than or equal to the upper limit, it indicates that the risk is too high; if it is less than or equal to the lower limit, it indicates that another type of anomaly may occur, namely, the model's risk response gain is too low, the control command output is lagging, and it fails to truly reflect the on-site disturbance; both situations can be identified as a continued confidence crisis; at this time, the system continues to maintain the second control command to avoid resuming the output of the first control command before the model has been corrected.

[0160] In model updates, the system constructs a penalty function by using the product of the frequency of human intervention and the data consistency anomaly factor as a penalty term. The specific principle is as follows: a high frequency of human intervention indicates a decrease in the reliability of the current automatic control commands and a deviation from expected control effects; a high data consistency anomaly factor indicates significant conflicts between sensors; when both are high, it indicates that the model's identification and evaluation of complex anomalies are insufficient, requiring a corresponding increase in the correction magnitude of the evaluation weights. For example, if the frequency of human intervention in the last 10 minutes is 4 and the data consistency anomaly factor is 1.2, then the penalty term is 4.8; if the frequency of human intervention in another period is 1 and the data consistency anomaly factor is 0.6, then the penalty term is only 0.6. Thus, under the complex working conditions of multi-source data conflict and frequent human intervention, the system adaptively increases the correction gradient of the model weights.

[0161] The penalty function, together with the prediction error between the predicted state assessment entropy of the control state assessment model and the actual risk label, constitutes the loss term. This loss term is then corrected through backpropagation to adjust the assessment weights of the control state assessment model. Specifically, the total loss function is defined as:

[0162] $$L_{\text{total}} = L_{\text{pred}} + \lambda \cdot P$$

[0163] where $L_{\text{pred}}$ is the prediction error between the state assessment entropy predicted by the control state assessment model and the actual risk label, for example the mean square error between the two. In the online update scenario, the actual risk label of the system is generated dynamically: the fluctuation amplitude of the field actuator and the manual intervention shutdown events are collected with a lag of a preset time window, and a preset posterior calibration rule is applied. Specifically, the posterior calibration rule is as follows: extract the extreme value of the actuator fluctuation amplitude and the duration of the manual intervention shutdown within the lag time window, assign a preset risk benchmark weight to each, and linearly weight and sum the two to obtain a quantified actual risk label;

[0164] $\lambda$ is the penalty weighting coefficient, and $P$ is the penalty term. For example, if the original model outputs an entropy value of 0.58 while the on-site crisis clearly persists, incorporating the penalty term significantly increases the total loss. After the network is updated by backpropagation on this total loss, it becomes more sensitive to the risk of high misalignment and repeated human intervention, so that in similar subsequent working conditions the output entropy value rises earlier and higher, thereby triggering conservative control in advance.

[0165] As a fault-tolerance mechanism, if the frequency of manual intervention is zero, the penalty term will naturally be zero, and the model will only be updated based on the ordinary evaluation error. If the data consistency anomaly factor is abnormally large and exceeds the preset upper limit, it will be truncated to the upper limit value before participating in the calculation to avoid a single anomaly causing a sharp shift in weights. If the system is in a serious crisis and online updates are not allowed, the samples can be cached first, and the model can be updated in batches after entering a relatively stable window.
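The penalty term, its truncation, and the resulting total loss can be sketched numerically as below. This is an illustration only: the function names, the cap of 2.0, the penalty weight of 0.1, and the risk label of 0.80 in the usage note are hypothetical, and only the scalar loss value is computed, not the backpropagation step.

```python
def risk_label(max_fluctuation, shutdown_duration, w_fluctuation, w_shutdown):
    """Posterior calibration: linear weighted sum of the two lagged observations."""
    return w_fluctuation * max_fluctuation + w_shutdown * shutdown_duration


def penalty_term(intervention_freq, anomaly_factor, factor_cap):
    """Intervention frequency times the anomaly factor, truncated at factor_cap."""
    return intervention_freq * min(anomaly_factor, factor_cap)


def total_loss(predicted, labels, penalty, penalty_weight):
    """Mean squared prediction error plus the weighted penalty term."""
    mse = sum((p - y) ** 2 for p, y in zip(predicted, labels)) / len(predicted)
    return mse + penalty_weight * penalty
```

With a cap of 2.0, `penalty_term(4, 1.2, 2.0)` and `penalty_term(1, 0.6, 2.0)` reproduce the penalty values 4.8 and 0.6 from the earlier example, and a zero intervention frequency yields a zero penalty, so the loss reduces to the ordinary prediction error. For a predicted entropy of 0.58, an assumed label of 0.80, a penalty of 3.3, and an assumed weight of 0.1, the squared error of 0.0484 is dominated by the weighted penalty of 0.33.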

[0166] For example, within 30 minutes after the aforementioned sea fog reversal, although the system had switched to conservative mode, the operator manually rewrote the data three times due to the still unstable local environment. Furthermore, the multi-source data deviation rate remained consistently high, causing the updated state assessment entropy to remain above 0.71, failing to enter the safe zone. Based on this, the system determined the crisis continued and continued to output the second control command. Simultaneously, the frequency of manual intervention (3) was multiplied by the data consistency anomaly factor (1.1) to obtain a penalty term (3.3), which was used to correct the assessment model. After several rounds of updates, the model could identify similar scenarios of sudden humidity changes and delayed flow feedback earlier, enabling it to enter conservative control earlier in subsequent similar events and reducing on-site manual shutdown.

[0167] The purpose of this step is to enable the condition assessment model to self-correct its sensitivity to complex field factors during ongoing crises, thereby achieving dynamic adjustment of control assessment capabilities as operating conditions change.

[0168] This embodiment provides a specific scenario limitation method for coastal environmental disturbance characteristics. Specifically, in the aforementioned scheme, coastal environmental disturbance characteristics can cover multiple types of environmental events. In this embodiment, the disturbance is further limited to the physical property jump characteristics caused by the reversal of coastal sea fog microclimate. The reason for this is that sea fog reversal is not an ordinary environmental fluctuation, but a composite event that can change the surface state of raw materials, sensor response rhythm and local spatial humidity and heat distribution in a short period of time, and has distinct coastal engineering characteristics.

[0169] Specifically, sea fog microclimate reversal refers to a reversal of direction or an abrupt change in intensity of the temperature and humidity gradient in a local area within a short period of time. For example, the humidity outside the silo was originally lower than inside, but after sea fog enters, the humidity outside quickly becomes higher than inside, so that the environmental conditions at the silo opening, the conveying area, and the control cabinet area fall out of sync. This reversal further leads to an increase in the moisture adsorbed on the surface of the raw materials, deviation between the readings of contact and non-contact sensors, and a sudden change in the correlation between moisture content and flow feedback. The system therefore summarizes such events as physical property jump characteristics.

[0170] In terms of implementation, the system can use the following characteristics as the basis for identification: the amplitude of temperature and humidity changes in a short period of time exceeds the threshold; the direction of temperature and humidity differences in different spatial locations reverses; the change in raw material moisture content and the feedback of water addition are lagging and misaligned; the actuator is in a normal response state but the state assessment entropy is significantly increased; after these conditions are met, the system not only records ordinary environmental disturbances, but also marks them as sea fog microclimate reversal events and assigns them higher risk prior weights.

[0171] For example, if the humidity at the silo opening rises from 84% to 96%, the humidity near the control cabinet rises from 81% to 86%, the temperature drops from 28℃ to 26℃, and the moisture content rises from 7.1% to 8.0% after two cycles, while the flow feedback change remains small, the system can mark this event as a physical property jump caused by a reversal of the sea fog microclimate. Unlike the regular humidity rise, which has a low rate of change, this type of event is more likely to trigger multi-source conflicts and human distrust, so a higher disturbance quantification value can be given in subsequent state assessments.

[0172] As a fault-tolerance mechanism, if only a single location humidity increase is detected without spatial reversal characteristics, it can be classified as a general humid and thermal disturbance and not directly marked as this specific event; if the sea fog reversal characteristics are partially met but the duration is less than the preset time window limit, such as only one sampling cycle, the system first enters the pending confirmation state, and decides whether to confirm it as a physical property jump event after verification in the next cycle; if the environmental sensor data is incomplete, auxiliary judgment is allowed based on the sudden increase in raw material moisture content, surface condensation monitoring signals, or on-site sea fog records in the operation log;
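The identification and fault-tolerance rules can be sketched as a small classifier. Only two of the listed features are modelled here (the humidity change amplitude and the spatial gradient event); the function names, the label strings, and all thresholds are hypothetical.

```python
def gradient_event(prev_a, prev_b, cur_a, cur_b, intensity_threshold):
    """Detect a sign reversal or an abrupt intensity change of the
    humidity difference between two locations a and b."""
    prev_diff = prev_a - prev_b
    cur_diff = cur_a - cur_b
    reversed_sign = prev_diff * cur_diff < 0
    intensity_jump = abs(cur_diff - prev_diff) > intensity_threshold
    return reversed_sign or intensity_jump


def classify_disturbance(humidity_change, change_threshold,
                         spatial_event, cycles_met, min_cycles=2):
    """Map the observed features to one of four hypothetical labels."""
    if abs(humidity_change) <= change_threshold:
        return "none"
    if not spatial_event:
        return "general_humid_heat"   # single-location rise only
    if cycles_met < min_cycles:
        return "pending"              # confirm in the next cycle
    return "sea_fog_reversal"
```

For the earlier example (silo opening 84% to 96%, control cabinet 81% to 86%), the humidity difference grows from 3 to 10 points, so `gradient_event(84, 81, 96, 86, 5.0)` reports a gradient event under an assumed 5-point intensity threshold, and the event is confirmed once it has persisted for the required number of cycles.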

[0173] For example, during the early morning hours of continuous pouring of offshore wind turbine foundations, the sea wind direction suddenly changes and sea fog flows back into the silo along one side, so that the humidity and temperature at the silo opening change faster than the preset rate-of-change threshold within a few minutes and the surface state of the raw materials changes rapidly. After the system identifies this event as a physical property jump caused by sea fog microclimate reversal, it significantly increases the risk quantification weight of multi-source data misalignment and starts the risk time window calculation and the control degradation logic earlier, avoiding the delay in control degradation that would result from misjudging the event as ordinary environmental white noise.

[0174] The purpose of this step is to concretize the typical high-risk coastal disturbances that this system is designed for, so that the control architecture can form a specific response to the most destructive microclimate events, thereby achieving more scenario-specific risk management.

[0175] It should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions can be made to the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims

1. An automatic control system for the preparation of environmentally friendly lightweight concrete for coastal wind power, applied to a concrete preparation platform including a sensor array and field actuators, characterized in that it comprises: a multi-source heterogeneous data acquisition unit, which acquires through the sensor array multi-source sensor data streams including the moisture content of light solid waste raw materials, the operating status of the mixer, and the temperature and humidity of the coastal environment, and extracts coastal environmental disturbance features from the multi-source sensor data stream; a state assessment and risk quantification unit, which constructs a neural-network-based control state assessment model based on the multi-source sensor data stream and the coastal environmental disturbance characteristics, calculates the state assessment entropy through the control state assessment model according to the time series, and predicts, based on the state assessment entropy, the safe time window of the current control system before the preset failure boundary of the cold joint in the casting process; and a dynamic risk hedging decision unit, which compares the safe time window with a preset danger threshold; if the safe time window is greater than or equal to the danger threshold, the nonlinear optimal control law is activated to generate a first control command for adjusting the amount of water added or the stirring rate; if the safe time window is less than the danger threshold, the control degradation mode is triggered, and the preset suboptimal robust control law is activated to generate a second control command for adjusting the amount of water added or the stirring rate.
The methods for extracting coastal environmental disturbance features from the multi-source sensor data stream include: Identify temperature and humidity abrupt change signals in the multi-source sensor data stream; Acquire spatiotemporal misalignment data of heterogeneous sensors from the multi-source sensor data stream; The temperature and humidity abrupt change signal is fused with the spatiotemporal misalignment data from the heterogeneous sensor to generate the coastal environmental disturbance characteristics; The coastal environmental disturbance characteristics are physical property jumps caused by the reversal of coastal sea fog microclimate; The method for calculating the state evaluation entropy using the control state evaluation model includes: The multi-source sensor data stream is mapped to a unified dimensionless feature space to generate feature vectors, a preset reference consistency vector is obtained, and the feature vector numerical deviation rate in the dimensionless feature space is calculated based on the difference between the feature vector and the reference consistency vector. When the deviation rate of the feature vector is less than the preset deviation threshold, the data is determined to be consistent. After converting each feature value in the feature vector into a weighted distribution, the information entropy of the feature vector in the dimensionless feature space is calculated as the basic entropy value, and the data consistency anomaly factor is set to zero. When the deviation rate of the feature vector is greater than or equal to the deviation threshold, the data is determined to be contradictory. The information entropy of the feature vector in the dimensionless feature space is calculated as the basic entropy value, and the ratio of the deviation rate of the feature vector to the deviation threshold is extracted as the data consistency anomaly factor. 
The state assessment entropy is obtained by linearly weighting and summing the basic entropy value, the data consistency anomaly factor, and the quantized value obtained by quantifying the coastal environmental disturbance characteristics; wherein the state assessment entropy is used to characterize the degree of anomaly in the confidence of the control system command. The method for predicting the safe time window of the current control system before the preset failure boundary of the cold joint in the pouring process based on the state evaluation entropy includes: obtaining system operation history logs; extracting, based on the system operation history logs, the time series of critical entropy values corresponding to the failure boundary of the cold joint in the casting process; calculating the difference between the current state assessment entropy and the previous state assessment entropy recorded in the system operation history log, and dividing it by the sampling time interval corresponding to the multi-source sensor data stream to obtain the entropy deterioration rate; and, when the entropy deterioration rate is greater than a preset small fluctuation threshold, dividing the difference between the corresponding critical entropy value in the critical entropy value time series and the state evaluation entropy at the current moment by the entropy deterioration rate to calculate the remaining time to approach the failure boundary of the cold joint, which is used as the safe time window.

2. The automatic control system for the preparation of environmentally friendly lightweight concrete for coastal wind power according to claim 1, characterized in that it further includes a system confidence repair and closed-loop unit, which: sends the first control command or the second control command to the field actuator; collects the feedback status of the field actuator and recalculates the state evaluation entropy; if the recalculated state evaluation entropy converges to a preset safe interval, restores the nonlinear optimal control law; and if the recalculated state evaluation entropy does not converge to the safe interval, maintains the suboptimal robust control law and updates the control state evaluation model.

3. The automatic control system for the preparation of environmentally friendly lightweight concrete for coastal wind power according to claim 1, characterized in that, The method of activating the nonlinear optimal control law to generate the first control command for adjusting the water addition or stirring rate includes: Load a pre-defined multi-agent deep reinforcement learning model; The multi-source sensor data stream is input into the multi-agent deep reinforcement learning model; Based on the multi-agent deep reinforcement learning model, a nonlinear optimal solution is generated with the joint optimization objectives of minimizing the moisture content fluctuation compensation delay time and minimizing the energy consumption of the field actuator. The moisture content fluctuation compensation delay time refers to the period from when the multi-source sensor data stream detects a sudden change in the moisture content of the light solid waste raw material until the field actuator adjusts its output to bring the data items reflecting the moisture content of the light solid waste raw material in the multi-source sensor data stream back to the steady-state threshold band pre-calibrated based on historical stable operation data. The nonlinear optimal solution is converted into the first control command.

4. The automatic control system for the preparation of environmentally friendly lightweight concrete for coastal wind power according to claim 3, characterized in that the method of triggering the control degradation mode and activating the preset suboptimal robust control law to generate a second control command for adjusting the water addition or stirring rate includes: freezing the adaptive weights of the multi-agent deep reinforcement learning model; extracting the conservative dead zone control parameters corresponding to the control degradation mode from the preset system configuration library; abandoning the joint optimization objective and, based on the conservative dead zone control parameters, constructing a suboptimal robust control law with minimizing the state evaluation entropy as the single objective; and executing the suboptimal robust control law to generate the second control command; wherein the second control command reduces the response frequency to fluctuations in the multi-source sensor data stream.

5. The automatic control system for the preparation of environmentally friendly lightweight concrete for coastal wind power according to claim 4, characterized in that collecting the feedback state of the field actuator and recalculating the state evaluation entropy comprises: monitoring whether a manual override control signal is generated at the field actuator; when no manual override control signal is detected, recording the frequency of manual intervention as zero and directly collecting the mechanical execution feedback data generated by the field actuator as the feedback state; when a manual override control signal is detected, recording the frequency of manual intervention, extracting the control quantity contained in the manual override control signal, and weighting and concatenating the control quantity with the mechanical execution feedback data generated by the field actuator to generate the feedback state; and inputting the feedback state into the control state evaluation model to recalculate the updated state evaluation entropy.
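
For illustration only, the two branches of claim 5 can be sketched as follows. The function name, the representation of each manual override as a single control quantity, and the weights `w_override` and `w_mech` are assumptions; the claim does not specify the weighting scheme.

```python
from typing import List, Sequence, Tuple


def build_feedback_state(
    mech_feedback: Sequence[float],
    override_quantities: Sequence[float] = (),
    w_override: float = 0.5,
    w_mech: float = 0.5,
) -> Tuple[int, List[float]]:
    """Return (manual intervention frequency, feedback state).

    With no override signals, the mechanical feedback is used directly.
    Otherwise the override control quantities and the mechanical feedback
    are each scaled by a weight and concatenated into one vector."""
    frequency = len(override_quantities)
    if frequency == 0:
        return 0, list(mech_feedback)
    state = [w_override * q for q in override_quantities]
    state += [w_mech * x for x in mech_feedback]
    return frequency, state


# No manual intervention: the state is the raw mechanical feedback.
freq_a, state_a = build_feedback_state([4.0, 6.0])

# One manual override carrying a control quantity of 2.0.
freq_b, state_b = build_feedback_state([4.0, 6.0], override_quantities=[2.0])
```

In this sketch the state vector changes length between the two branches; a real evaluation model with a fixed input size would need padding or a separate indicator, which the claim does not address.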

6. The automatic control system for the preparation of environmentally friendly lightweight concrete for coastal wind power according to claim 5, characterized in that restoring the nonlinear optimal control law if the recalculated state evaluation entropy converges to the preset safe interval comprises: comparing the updated state evaluation entropy with the upper and lower limits of the preset safe interval; if the updated state evaluation entropy is greater than the lower limit and less than the upper limit of the preset safe interval, determining that the system confidence has recovered; unfreezing the adaptive weights of the multi-agent deep reinforcement learning model; and reactivating the joint optimization objective and resuming output of the first control command.
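
For illustration only, the recovery test of claim 6 can be sketched as a strict open-interval check followed by a mode switch. `ControllerMode`, its fields, and the string labels are hypothetical names, not terms from the claim.

```python
from dataclasses import dataclass


@dataclass
class ControllerMode:
    # The controller is assumed to start in the degraded mode.
    weights_frozen: bool = True
    active_law: str = "suboptimal_robust"


def update_mode(
    mode: ControllerMode, entropy: float, lower: float, upper: float
) -> ControllerMode:
    """Restore the optimal law only when lower < entropy < upper.
    Values on or outside either limit leave the mode unchanged."""
    if lower < entropy < upper:
        mode.weights_frozen = False
        mode.active_law = "nonlinear_optimal"
    return mode


recovered = update_mode(ControllerMode(), entropy=0.4, lower=0.1, upper=0.9)
still_degraded = update_mode(ControllerMode(), entropy=0.9, lower=0.1, upper=0.9)
```

The comparison is strict on both sides, so an entropy exactly equal to a limit does not count as recovered.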

7. The automatic control system for the preparation of environmentally friendly lightweight concrete for coastal wind power according to claim 6, characterized in that maintaining the suboptimal robust control law and updating the control state evaluation model if the recalculated state evaluation entropy does not converge to the preset safe interval comprises: if the updated state evaluation entropy is greater than or equal to the upper limit of the preset safe interval, or less than or equal to the lower limit of the preset safe interval, determining that the system confidence remains abnormal; maintaining output of the second control command; constructing a penalty function using the product of the frequency of manual intervention and the data consistency anomaly factor as the penalty term; forming a loss function from the penalty function together with the prediction error between the state evaluation entropy output by the control state evaluation model and the actual risk label dynamically calculated by the posterior calibration rule; and using the penalty function for backpropagation to correct the evaluation weights of the control state evaluation model, thereby completing the update of the control state evaluation model.
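
For illustration only, the loss described in claim 7 can be sketched as a prediction error plus a penalty term. The squared-error form, the additive combination, and the scaling factor `penalty_weight` are assumptions; the claim states only that the penalty term is the product of the manual intervention frequency and the data consistency anomaly factor.

```python
def penalty(intervention_frequency: int, anomaly_factor: float) -> float:
    """Penalty term: manual intervention frequency times the data
    consistency anomaly factor, as described in the claim."""
    return intervention_frequency * anomaly_factor


def evaluation_loss(
    predicted_entropy: float,
    risk_label: float,
    intervention_frequency: int,
    anomaly_factor: float,
    penalty_weight: float = 1.0,
) -> float:
    """Assumed form: squared prediction error plus the weighted penalty."""
    error = (predicted_entropy - risk_label) ** 2
    return error + penalty_weight * penalty(intervention_frequency, anomaly_factor)


# Hypothetical values: two manual interventions, anomaly factor 0.5.
loss = evaluation_loss(
    predicted_entropy=1.5,
    risk_label=1.0,
    intervention_frequency=2,
    anomaly_factor=0.5,
)
```

With zero manual interventions the penalty vanishes and the loss reduces to the prediction error alone.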