A control logic runtime verification and security recovery method for embedded real-time systems

By building a security monitoring layer and formal runtime contracts outside the operating system kernel, and combining graded fault recovery and hardware-level spatiotemporal isolation, the monitoring and recovery problems of existing embedded real-time systems are solved, achieving non-intrusive monitoring and high-precision fault recovery, and meeting functional safety standards.

CN122308328A · Pending · Publication Date: 2026-06-30 · CHINA YANGTZE POWER

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: CHINA YANGTZE POWER
Filing Date: 2026-03-31
Publication Date: 2026-06-30

AI Technical Summary

Technical Problem

Existing control logic monitoring and fault recovery solutions for embedded real-time systems suffer from resource consumption caused by intrusive monitoring, an inability to accurately verify safety invariance and timing compliance, limited fault recovery strategies, and an inability to meet functional safety standards.

Method used

A security monitoring layer is built outside the operating system kernel. Through independent monitoring, formal runtime contracts, hierarchical fault recovery, and hardware-level spatiotemporal isolation, non-intrusive monitoring, accurate verification, and progressive recovery are achieved. Combined with memory protection units and real-time scheduling strategies, system stability and security are ensured.

Benefits of technology

It achieves non-intrusive monitoring, improves the accuracy and comprehensiveness of control logic monitoring, replaces one-size-fits-all fault recovery with graded recovery strategies, meets functional safety standards, and improves the stability and reliability of the system.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN122308328A_ABST

Abstract

This invention discloses a method for runtime verification and safety recovery of control logic in embedded real-time systems, relating to the fields of embedded real-time control and functional safety technology. This invention constructs an independent safety monitoring layer outside the operating system kernel, defines runtime contracts containing safety invariance and timing logic constraints for critical control tasks, and performs real-time contract verification through periodic data collection. When a contract violation is detected, a layered recovery mechanism is initiated according to fault levels, including output clamping, task rollback and restart, and algorithm degradation switching. A memory protection unit is used to achieve spatiotemporal isolation protection for tasks. This invention achieves non-intrusive real-time monitoring and hierarchical safety recovery of critical control tasks, improving the functional safety and operational stability of embedded real-time systems. The monitoring overhead is controllable and does not affect real-time system scheduling, making it widely applicable to safety-critical scenarios such as vehicle control and industrial robots.

Description

Technical Field

[0001] This invention relates to the field of embedded real-time control system technology, specifically to a method for runtime verification and security recovery of control logic for embedded real-time systems.

Background Technology

[0002] With the deep integration of IoT technology and intelligent manufacturing, embedded real-time systems in fields such as industrial control, aerospace, and autonomous driving are showing a significant upward trend in both functional complexity and performance indicators. These systems undertake critical control decision-making and execution functions, and their operational safety, reliability, and real-time response capabilities directly affect the stability of equipment operation and even the safety of personnel and property. However, existing technologies still face many pressing technical challenges in ensuring the continuous and safe operation of such systems.

[0003] Existing control logic monitoring and fault recovery solutions for embedded real-time systems have several shortcomings. The monitoring module is integrated into the operating system kernel; this intrusive approach consumes system resources, interferes with the real-time scheduling of critical tasks, and cannot provide independent monitoring. Because they lack formal, machine-readable runtime contracts, these solutions can only apply simple threshold checks and cannot accurately verify the safety invariance and timing compliance of control logic. The fault recovery mechanism is simplistic and does not grade faults by severity: minor faults tend to trigger over-recovery, while severe faults have no effective fallback. Without hardware-based spatiotemporal isolation between tasks, memory anomalies and execution timeouts in a single task can easily propagate to the entire system and trigger cascading failures. Finally, the monitoring logic is not optimized for overhead, so the additional computational load degrades real-time performance and prevents the system from meeting functional safety standards such as ISO 26262.

[0004] To address the aforementioned shortcomings, this invention provides a method for runtime verification and security recovery of control logic in embedded real-time systems, thereby overcoming the deficiencies of existing technologies.

Summary of the Invention

[0005] The purpose of this invention is to provide a method for runtime verification and security recovery of control logic for embedded real-time systems, so as to solve the problems mentioned in the background art.

[0006] To achieve the above objectives, the present invention provides the following technical solution: a method for runtime verification and security recovery of control logic in embedded real-time systems, including the following steps.

A security monitoring layer is built outside the operating system kernel. The security monitoring layer is implemented as a high-priority task or as a dedicated monitoring coprocessor independent of the operating system kernel. It obtains the running status data of the critical control tasks in the system through inter-task communication interfaces, direct access to task control block data structures, or a dedicated bus with shared dual-port RAM, and performs independent, non-intrusive real-time monitoring of the critical control tasks.

A machine-readable runtime contract is defined for each critical control task. The runtime contract contains the predefined safety invariance conditions and timing logic constraints that the task must satisfy.

The security monitoring layer collects the running status data and output signals of the critical control tasks according to a predetermined monitoring cycle, and verifies the collected data in real time against the safety invariance conditions and timing logic constraints specified in the runtime contract to determine whether the actual behavior of each task meets the requirements of its runtime contract.

When a critical control task is detected to violate its runtime contract, the severity of the fault is assessed and classified according to predefined fault classification rules, and the corresponding level of the hierarchical recovery mechanism is activated based on the assessment result. In order of fault severity from mild to severe, the hierarchical recovery mechanism includes output clamping, task-level state rollback and restart based on security snapshots, and switching of the control algorithm to a degradation mode.

Using the built-in memory protection unit of the embedded processor, independent memory protection domains with spatiotemporal isolation are created for each critical control task. Hardware-enforced memory access permission control ensures that memory access anomalies in any task do not propagate to other tasks or affect their data integrity. Combined with real-time scheduling strategies, the execution time of each task is monitored and managed, protecting tasks in both the spatial and the temporal dimension.

[0007] Furthermore, the runtime contract is described using a formal specification language, the safety invariance condition specifies the allowed value range of the control variables and the state transition rules, and the timing logic constraint specifies the time interval requirements that must be met between key events and the event occurrence order constraint.

[0008] Furthermore, the formal description of the safety invariance condition includes: for any time t, the control output variable Y(t) satisfies the constraint $Y_{\min} \le Y(t) \le Y_{\max}$,

[0009] where $Y_{\min}$ and $Y_{\max}$ represent the preset lower and upper limits of the output, respectively; the timing logic constraints include: when a fault indication event e occurs, the system must complete the predetermined safety response actions within a specified time limit.
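The two contract elements above can be sketched as a small data structure with two checks. This is only an illustrative model in Python, not the patent's implementation (which would typically be C on the target); the class name, field names, and the numeric limits in the example instance are assumptions made here for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RuntimeContract:
    """Hypothetical machine-readable contract for one critical control task."""
    y_min: float                 # preset lower limit of the output
    y_max: float                 # preset upper limit of the output
    response_deadline_ms: float  # time limit for the safety response after event e


def invariant_holds(contract: RuntimeContract, y: float) -> bool:
    """Safety invariant: Y_min <= Y(t) <= Y_max."""
    return contract.y_min <= y <= contract.y_max


def deadline_holds(contract: RuntimeContract, event_ms: float,
                   response_ms: Optional[float], now_ms: float) -> bool:
    """Timing constraint: the safety response must complete within the deadline.

    While no response has been observed (response_ms is None), the constraint
    still holds as long as the deadline has not yet expired at now_ms.
    """
    end_ms = response_ms if response_ms is not None else now_ms
    return end_ms - event_ms <= contract.response_deadline_ms


# Illustrative values only; the limits are not taken from the source text.
example_contract = RuntimeContract(y_min=-30.0, y_max=30.0,
                                   response_deadline_ms=50.0)
```

Both checks are single comparisons, so each runs in constant time per monitoring cycle.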

[0010] Furthermore, the output clamping operation includes: when the safety monitoring layer detects that the output signal Y(t) of the critical control task exceeds the safety threshold range predefined by the runtime contract, it immediately performs boundary limiting processing on the output signal to forcibly constrain the output signal within the safety range; the output clamping operation is implemented through a hardware or software bypass mechanism inserted between the control task and the actuator interface, and the safety monitoring layer directly intercepts and modifies the control instructions sent to the actuator.
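A software bypass of this kind might look like the following sketch. It assumes the control task's output passes through a small stage owned by the monitoring layer before reaching the actuator; `BypassStage` and `actuator_write` are hypothetical names introduced here.

```python
from typing import Callable, Tuple


def clamp_output(y: float, y_min: float, y_max: float) -> Tuple[float, bool]:
    """Return (value to send to the actuator, whether clamping was applied)."""
    if y > y_max:
        return y_max, True
    if y < y_min:
        return y_min, True
    return y, False


class BypassStage:
    """Sits between the control task and the actuator interface."""

    def __init__(self, y_min: float, y_max: float,
                 actuator_write: Callable[[float], None]) -> None:
        self.y_min = y_min
        self.y_max = y_max
        self.actuator_write = actuator_write  # hypothetical actuator callback

    def forward(self, y: float) -> bool:
        """Forward a command, limiting it to the safe range if necessary."""
        safe_value, clamped = clamp_output(y, self.y_min, self.y_max)
        self.actuator_write(safe_value)
        return clamped
```

The control task is never interrupted: it keeps producing commands, and only the value that reaches the actuator changes. Non-finite inputs (for example NaN) are not handled in this sketch and would need a separate check.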

[0011] Furthermore, the task-level state rollback and restart based on security snapshots includes: during the normal operation phase of a critical control task, the security monitoring layer or the task itself periodically saves the key state data of the task to form a security snapshot according to a preset snapshot period. The security snapshot includes the task's program counter value, stack pointer, register context, key input / output variable values, and intermediate calculation state; a circular buffer or ping-pong buffer structure is used to store the state snapshot sequence within the most recent N snapshot periods; when a serious operational anomaly is detected in the task and the fault level reaches the threshold requiring rollback and recovery, the most recent security snapshot that has passed contract verification is selected from the corresponding buffer, the task's running context is completely restored to the state corresponding to the security snapshot, and then the task is rescheduled and restarted.
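The snapshot buffer described above can be modelled as a fixed-size ring that remembers, for each snapshot, whether contract verification passed when it was taken. The sketch below is a simplified Python model: the task context is represented as a plain dictionary, and the names `Snapshot` and `SnapshotRing` are assumptions, not terms from the source.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    cycle: int      # snapshot period index
    state: dict     # stand-in for PC, SP, registers, I/O values, intermediates
    verified: bool  # True if the contract check passed when this was saved


class SnapshotRing:
    """Keeps the snapshots of the most recent N snapshot periods."""

    def __init__(self, n: int) -> None:
        self._buf = deque(maxlen=n)  # the oldest entry is dropped automatically

    def save(self, cycle: int, state: dict, verified: bool) -> None:
        # Copy the state so later changes in the task do not alter the snapshot.
        self._buf.append(Snapshot(cycle, dict(state), verified))

    def latest_verified(self) -> Optional[Snapshot]:
        """Most recent snapshot that passed contract verification, if any."""
        for snap in reversed(self._buf):
            if snap.verified:
                return snap
        return None
```

On rollback, the monitoring layer would restore the task context from `latest_verified()` and reschedule the task; if the method returns `None`, no safe restore point exists and a stronger recovery level is needed.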

[0012] Furthermore, the degradation mode switching of the control algorithm includes: pre-designing a simplified safety mode algorithm as a backup control strategy for each critical control task. The simplified safety mode algorithm has lower computational complexity, more conservative control parameters, and more comprehensive formal verification coverage compared to the original control algorithm. When the security monitoring layer detects that a critical control task continues to violate the runtime contract after performing a rollback and restart, or when the nature of the contract violation is assessed as fatal, it sends a degradation command to the task through a reserved algorithm switching interface, triggering an online hot switch of the control algorithm. After receiving the degradation command, the control task saves its current input and output state, switches the control strategy from the original algorithm module to the simplified safety mode algorithm module, and continues to run in degradation mode.
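The hot switch can be pictured as a task holding two interchangeable algorithm modules and a reserved entry point for the degradation command. In this sketch, algorithms are plain functions and `ControlTask` and `on_degrade_command` are hypothetical names; real algorithm modules would carry internal state that also needs a defined hand-over.

```python
from typing import Callable, Optional, Tuple


class ControlTask:
    """Control task with a primary algorithm and a simplified safe-mode backup."""

    def __init__(self, primary: Callable[[float], float],
                 safe_mode: Callable[[float], float]) -> None:
        self._safe_mode = safe_mode
        self._active = primary
        self.degraded = False
        self.last_io: Optional[Tuple[float, float]] = None
        self.saved_io: Optional[Tuple[float, float]] = None

    def step(self, x: float) -> float:
        """Run one control cycle with whichever algorithm is active."""
        y = self._active(x)
        self.last_io = (x, y)
        return y

    def on_degrade_command(self) -> None:
        """Reserved switching interface, called by the monitoring layer."""
        self.saved_io = self.last_io     # save current input/output state
        self._active = self._safe_mode   # switch to the safe-mode algorithm
        self.degraded = True
```

After the command, `step` keeps running every cycle, so control continues without a restart, only with the more conservative strategy.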

[0013] Furthermore, the use of the memory protection unit to create independent memory protection domains with spatiotemporal isolation for each task includes: during the system initialization phase, pre-allocating an independent memory region for each critical control task and configuring the access permission attributes of this region through the memory protection unit; when any task attempts to access an unauthorized memory address during execution, the memory protection unit triggers a memory access violation hardware interrupt; after the security monitoring layer captures the interrupt signal, it identifies the task identifier where the violation occurred, isolates and suspends the task, and records the fault information; combined with a real-time scheduling strategy, a maximum continuous execution time threshold is set for each critical task.

[0014] The security monitoring layer monitors the actual execution time of each task through timer interrupts. When the continuous execution time of a task exceeds its maximum allowed threshold, it determines that the task has violated the real-time agreement in the timing contract, and actively stops and restarts the task.
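The execution-time check amounts to a per-task budget that a timer interrupt evaluates. The following sketch models it with explicit timestamps instead of a hardware timer; `ExecutionBudget` and its method names are assumptions made for illustration.

```python
from typing import Optional


class ExecutionBudget:
    """Tracks the continuous execution time of one critical task."""

    def __init__(self, max_run_ms: float) -> None:
        self.max_run_ms = max_run_ms
        self._started_ms: Optional[float] = None

    def on_dispatch(self, now_ms: float) -> None:
        """Called when the scheduler starts running the task."""
        self._started_ms = now_ms

    def on_yield(self) -> None:
        """Called when the task finishes its cycle or gives up the CPU."""
        self._started_ms = None

    def overrun(self, now_ms: float) -> bool:
        """Called from the timer tick; True means the budget was exceeded."""
        if self._started_ms is None:
            return False
        return now_ms - self._started_ms > self.max_run_ms
```

When `overrun` returns `True`, the monitoring layer would stop and restart the task as described above.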

[0015] Furthermore, the fault classification rules include: classifying faults into three levels (minor, moderate, and severe) based on the type and degree of contract violation. A minor fault is one in which a parameter exceeds the safe range by less than 5° of a preset deviation threshold and which has occurred fewer than 3 times in the last 10 monitoring cycles; a moderate fault is one in which a parameter exceeds its limit for 5 consecutive monitoring cycles, or a timing violation lasts for more than 20 milliseconds; and a severe fault is one in which the contract is violated again within 3 monitoring cycles after a task rollback and restart, or the fault is fatal. For minor faults, an output clamping operation is performed; for moderate faults, a task state rollback and restart are performed; and for severe faults, the control algorithm is switched to its degradation mode. In accordance with the fault escalation principle, multiple recovery measures are executed in succession within a short period until the system returns to a safe operating state.
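One way to read these rules is as a small classifier that is called once per monitoring cycle. The sketch below is an interpretation, not the patent's implementation: the names are hypothetical, the deviation threshold is passed in as a parameter, and the fallback for violations that match none of the stated rules (treated as moderate here) is an assumption.

```python
from collections import deque
from enum import IntEnum


class Fault(IntEnum):
    NONE = 0
    MINOR = 1      # handled by output clamping
    MODERATE = 2   # handled by rollback and restart
    SEVERE = 3     # handled by degradation-mode switching


class FaultClassifier:
    def __init__(self, deviation_threshold: float) -> None:
        self.deviation_threshold = deviation_threshold
        self.history = deque(maxlen=10)    # violation flags, last 10 cycles
        self.consecutive = 0               # consecutive violating cycles
        self.cycles_since_rollback = None  # None: no recent rollback

    def note_rollback(self) -> None:
        self.cycles_since_rollback = 0

    def classify(self, exceedance: float, timing_violation_ms: float = 0.0,
                 fatal: bool = False) -> Fault:
        violated = exceedance > 0 or timing_violation_ms > 0
        self.history.append(violated)
        self.consecutive = self.consecutive + 1 if violated else 0

        recent_rollback = False
        if self.cycles_since_rollback is not None:
            self.cycles_since_rollback += 1
            recent_rollback = self.cycles_since_rollback <= 3
            if not recent_rollback:
                self.cycles_since_rollback = None

        if fatal or (violated and recent_rollback):
            return Fault.SEVERE
        if not violated:
            return Fault.NONE
        if self.consecutive >= 5 or timing_violation_ms > 20:
            return Fault.MODERATE
        if exceedance < self.deviation_threshold and sum(self.history) < 3:
            return Fault.MINOR
        return Fault.MODERATE  # assumption: unmatched violations escalate
```

Every step is a counter update or comparison over a fixed ten-entry window, so the cost per cycle is bounded.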

[0016] Furthermore, the verification of the timing logic constraints includes: maintaining an event timestamp record table in the security monitoring layer to record the occurrence time of each critical event; and when a new critical event is detected, calculating the actual time interval between the event and its related preceding events.

[0017] The time interval is compared and verified with the timing constraints specified in the runtime contract. If the actual time interval exceeds the allowed time window range, it is determined that the timing constraint is violated. For timing constraints that require verification of the order of events, a finite state automaton is used to model the expected event sequence. The security monitoring layer drives the state automaton to perform state transitions based on the actual observed event sequence. When the state machine enters the rejection state, it is determined that the event sequence constraint is violated.
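Both timing checks can be sketched compactly: a timestamp table for interval constraints and a transition table for the finite state automaton. The class names, event names, and states used here are placeholders chosen for illustration.

```python
from typing import Dict, Tuple


class IntervalMonitor:
    """Event timestamp table with maximum allowed intervals between events."""

    def __init__(self, windows: Dict[Tuple[str, str], float]) -> None:
        self.windows = windows  # {(preceding_event, event): max_interval_ms}
        self.timestamps: Dict[str, float] = {}

    def observe(self, event: str, now_ms: float) -> bool:
        """Record the event; return False if an interval constraint is violated."""
        ok = True
        for (prev, cur), limit_ms in self.windows.items():
            if cur == event and prev in self.timestamps:
                if now_ms - self.timestamps[prev] > limit_ms:
                    ok = False
        self.timestamps[event] = now_ms
        return ok


class SequenceMonitor:
    """Finite state automaton over events; unknown transitions lead to REJECT."""

    REJECT = "REJECT"

    def __init__(self, transitions: Dict[Tuple[str, str], str],
                 start: str) -> None:
        self.transitions = transitions  # {(state, event): next_state}
        self.state = start

    def observe(self, event: str) -> bool:
        """Advance the automaton; return False once the rejection state is hit."""
        if self.state != self.REJECT:
            self.state = self.transitions.get((self.state, event), self.REJECT)
        return self.state != self.REJECT
```

The rejection state is absorbing in this sketch: once an out-of-order event is seen, the monitor keeps reporting a violation until it is re-initialised.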

[0018] Furthermore, the method also includes optimized control of runtime overhead: runtime contract monitoring is implemented only for control tasks identified as critical in the system, while no monitoring constraints are imposed on non-critical or auxiliary tasks; the runtime contract inspection logic is simplified and optimized, limiting verification operations to basic operations with constant time complexity such as threshold comparison, counter checking, and logical judgment; and worst-case analysis is performed on the execution time of the security monitoring layer itself to ensure that the additional overhead it introduces is within the acceptable range of the system and that its impact on real-time scheduling stability is predictable and controllable.

Compared with the prior art, the beneficial effects of the present invention are:

1. This invention, through its overall technical solution of independent monitoring and deployment outside the kernel, formal runtime contract definition, hierarchical fault recovery, hardware-level spatiotemporal isolation, and optimized management of runtime overhead, can specifically address the core defects of existing technologies and has the following significant beneficial effects.

[0019] 2. This invention independently constructs a security monitoring layer outside the operating system kernel, and implements non-intrusive monitoring in the form of high-priority tasks or dedicated monitoring coprocessors. It obtains the running status of critical tasks through inter-task communication, direct access to task control blocks, dedicated buses and dual-port RAM sharing, etc. It decouples the monitoring logic from the business logic at the architectural level, completely avoids the problems of system resource occupation and interference with the real-time scheduling of critical tasks caused by the monitoring module intruding into the kernel, and ensures that the native real-time performance of the embedded real-time system is not affected.

[0020] 3. This invention uses a formal specification language to define a machine-readable runtime contract, while clearly defining security invariance conditions and timing logic constraints. It can perform real-time verification of the range of control variable values, state transition rules, key event time intervals and execution order in all dimensions. This breaks through the limitations of existing technologies that can only achieve simple threshold judgments and cannot complete complex security and timing compliance verification, and greatly improves the accuracy and comprehensiveness of control logic operation monitoring.

[0021] 4. This invention classifies faults into three levels—minor, moderate, and severe—based on their severity and matches them with a progressive, layered recovery mechanism that includes output clamping, task-level state rollback and restart, and control algorithm degradation switching. This not only avoids system turbulence caused by excessive recovery due to minor faults but also provides a final safety net for fatal faults. It solves the problems of single fault recovery strategies and insufficient rationality in existing technologies, minimizing the impact of faults on system operational safety.

[0022] 5. This invention utilizes the built-in memory protection unit of the embedded processor to create an independent memory protection domain with spatiotemporal isolation for critical control tasks. Combined with real-time scheduling strategies, it achieves hardware-level control over task execution time, blocking fault propagation paths from both spatial memory access and temporal execution duration dimensions. This ensures that problems such as abnormal storage access or execution timeout of a single task will not cause a chain reaction of failures throughout the system, significantly improving the operational stability and reliability of the embedded real-time system.

[0023] 6. This invention optimizes runtime overhead by monitoring only critical control tasks, simplifying contract verification logic, and analyzing the worst-case execution time of the security monitoring layer. It keeps the additional monitoring load within an acceptable range for the system, and the impact on real-time scheduling is predictable and controllable. It can meet the requirements of functional safety standards such as ISO 26262 and is suitable for the stringent usage conditions of safety-critical scenarios.

[0024] 7. The technical solution of this invention can be flexibly adapted to single-processor embedded architectures and heterogeneous multi-processor architectures. It can be applied to the vehicle control system of unmanned vehicles and deployed in the multi-axis coordinated control system of industrial robots. It can be implemented without large-scale modification of existing embedded real-time systems and has excellent versatility, scenario adaptability, and engineering practical value.

Attached Figure Description

[0025] Figure 1 is a flowchart of the method steps for steering control of an unmanned vehicle according to Embodiment 1 of the present invention; Figure 2 is a flowchart of the method steps for multi-axis coordinated control of an industrial robot according to Embodiment 2 of the present invention; Figure 3 is a logic flowchart of the hierarchical fault recovery mechanism of the present invention.

Detailed Implementation

[0026] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0027] The method in this embodiment is executed by a terminal, which can be a mobile phone, computer, PDA, laptop or desktop computer, etc. Of course, it can also be other devices with similar functions, and this embodiment does not limit them.

[0028] Embodiment 1

Referring to Figure 1 and Figure 3, this embodiment takes the steering control task in the embedded control system of an unmanned vehicle as an example to illustrate in detail the method for runtime verification and security recovery of control logic for embedded real-time systems provided by the present invention.

[0029] I. System Architecture and Security Monitoring Layer Deployment

The embedded real-time system in this embodiment runs on an ARM Cortex-R series processor with a built-in memory protection unit (MPU). The real-time operating system (RTOS) conforms to the ISO 26262 functional safety standard. A lightweight security monitoring layer is built outside the operating system kernel. Specifically, this security monitoring layer is implemented as a high-priority (for example, the highest-priority) independent monitoring task in the operating system. This monitoring task obtains the running status data of each critical control task through inter-task communication interfaces provided by the operating system, such as message queues, and through direct access to the task control block data structure. This data includes: the current task identifier, program counter value, stack pointer, key input variables, output variables, and task execution timestamps. The scheduling period of this monitoring task is set to 10 milliseconds, and its worst-case execution time, according to static analysis, is 0.5 milliseconds, far less than the scheduling period (5% of it). Therefore, it does not have a perceptible impact on the real-time scheduling of other critical tasks.

[0030] II. Definition of Runtime Contracts

During the system design phase, a machine-readable runtime contract is defined for the steering control task. This contract is described using the Linear Temporal Logic (LTL) formal specification language and includes two main categories: safety invariance conditions and temporal logic constraints.

[0031] Safety invariance conditions stipulate: the output value of the steering angle control command, Y(t), must satisfy $Y_{\min} \le Y(t) \le Y_{\max}$,

[0032] that is, it must stay within the limit range of angles that the vehicle's physical steering mechanism can safely withstand; the rate of change of the steering angle (the steering angular velocity) must not exceed [the specified value]; and when the vehicle is in high-speed driving mode (speed > 80 km/h), the steering angle output value must not exceed [the specified value]. These conditions are stored in the read-only configuration area of the security monitoring layer in the form of a structured threshold configuration table.

[0033] Timing logic constraints stipulate that when the system detects a steering control instability indication event, degradation control measures must be initiated within 50 milliseconds of its occurrence; the time from receiving an emergency braking signal to the steering system entering a safety lock state must not exceed 100 milliseconds; and the system's switching between standby mode and operating mode must follow the event sequence "initialization self-test completed → self-test passed → allowed to enter operating mode". The above timing constraints are formally expressed using the LTL operators □ (always), ◇ (eventually), and U (until).
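As an illustration, the three timing constraints might be written as follows. The proposition names (instab, degrade, brake, locked, done, passed, operating) are placeholders chosen here, not symbols from the original text. The time-bounded form of ◇ is a metric extension, since the plain LTL operators carry no explicit deadlines, and W denotes weak until, used so that the ordering constraints do not themselves require the self-test to happen.

```latex
% Degradation must start within 50 ms of an instability indication.
\Box \left( \mathit{instab} \rightarrow \Diamond_{\le 50\,\mathrm{ms}}\, \mathit{degrade} \right)

% The safety lock must be reached within 100 ms of an emergency braking signal.
\Box \left( \mathit{brake} \rightarrow \Diamond_{\le 100\,\mathrm{ms}}\, \mathit{locked} \right)

% Ordering: self-test completed, then self-test passed, then operating mode.
\left( \neg \mathit{passed} \;\mathcal{W}\; \mathit{done} \right)
\;\wedge\;
\left( \neg \mathit{operating} \;\mathcal{W}\; \mathit{passed} \right),
\qquad
\varphi \,\mathcal{W}\, \psi \;\equiv\; (\varphi \,\mathcal{U}\, \psi) \vee \Box \varphi
```

At run time, the bounded formulas reduce to timestamp comparisons and the ordering formula to a small state machine, which keeps each check constant-time.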

[0034] III. Real-time Monitoring and Anomaly Detection

During normal vehicle operation, the safety monitoring layer acquires the input sensor values and output control commands of the steering control task by directly reading the shared memory area once per 10-millisecond monitoring cycle. After each data acquisition, the safety monitoring layer compares the real-time data with the safety thresholds predefined in the runtime contract. First, it checks whether the output value lies within the permitted range. Next, it calculates the difference between the current output and the previous output to determine whether the rate of change of the steering angle exceeds [the specified value]. Then, it reads the vehicle speed sensor and, based on the current vehicle speed, applies the additional constraints of the high-speed mode. Finally, by maintaining an event timestamp record table, it calculates the actual time interval between key events and drives a finite state automaton to verify whether the event sequence matches expectations.

[0035] If all checks pass, the task is considered to be running normally; if any check fails, the security monitoring layer immediately determines that the task has violated the contract.
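The value checks of one monitoring cycle can be sketched as a single function. The numeric limits are not available in the text, so they are passed in as a configuration table; the key names and the function name are hypothetical, and only the 80 km/h mode boundary and the 10 ms period come from this embodiment.

```python
from typing import Dict, List

HIGH_SPEED_KMH = 80.0  # high-speed mode boundary stated in the contract


def check_steering(y: float, y_prev: float, speed_kmh: float,
                   limits: Dict[str, float],
                   period_s: float = 0.010) -> List[str]:
    """Return the names of the violated checks; an empty list means a pass."""
    violations = []
    # 1. Range check on the steering angle command.
    if not (limits["y_min"] <= y <= limits["y_max"]):
        violations.append("range")
    # 2. Rate check from the difference between consecutive outputs.
    if abs(y - y_prev) / period_s > limits["max_rate"]:
        violations.append("rate")
    # 3. Stricter angle limit while in high-speed mode.
    if speed_kmh > HIGH_SPEED_KMH and abs(y) > limits["y_high_speed"]:
        violations.append("high_speed")
    return violations
```

Any non-empty result would be treated as a contract violation and handed to fault classification.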

[0036] IV. Fault Classification and Layered Recovery

When the safety monitoring layer detects a contract violation in the steering control task, it grades the fault according to the type and severity of the violation.

Minor faults: for example, the steering output command momentarily exceeds the safety threshold, but the exceedance is less than the preset deviation threshold, and the anomaly has occurred fewer than 3 times in the last 10 monitoring cycles. In this case, an output clamping operation is triggered: the safety monitoring layer uses a software bypass mechanism pre-deployed between the steering control task and the actuator interface to clip the abnormal output value to the safe range. If the output Y(t) exceeds the upper limit $Y_{\max}$, then $Y_{\max}$ is output; if Y(t) falls below the lower limit $Y_{\min}$, then $Y_{\min}$ is output. This operation does not interrupt the normal execution flow of the control task, and dangerous outputs are suppressed immediately.

[0037] Moderate faults: for example, steering output exceeding limits is detected for five consecutive monitoring cycles, or a timing constraint violation lasts for more than 20 milliseconds. In this case, task-level state rollback and restart are triggered. The safety monitoring layer first sends a suspension request to the operating system to pause the current instance of the steering control task. A circular buffer stores the state snapshot sequence of the most recent N snapshot cycles, where N is a preset positive integer (N is 5 in this embodiment). The most recent contract-verified safety snapshot is selected from the circular buffer; this snapshot holds the task's program counter value, stack pointer, register context, sensor readings, and intermediate calculation state. The task's execution context is then completely restored to the state corresponding to this snapshot, and finally the task is rescheduled and restarted so that it continues execution from the recovery point. Through this operation, errors caused by transient faults or accumulated state anomalies are effectively cleared.

[0038] Severe faults: for example, the task violates the contract again within three monitoring cycles after a rollback and restart, or the fault is assessed as fatal, such as the steering angle sensor continuously outputting invalid data. In this case, a control algorithm degradation mode switch is triggered: the safety monitoring layer sends a degradation command to the steering control task through a reserved algorithm switching interface; upon receiving the command, the task saves the current input/output state and smoothly switches the control strategy from a high-precision model predictive control algorithm to a pre-verified simplified safety mode algorithm. This algorithm employs a rule-based conservative steering strategy, reducing computational complexity by 60% and using more conservative control parameters; the system continues to operate in degradation mode until the root cause of the fault is eliminated or manual maintenance intervenes.

[0039] The aforementioned multi-level recovery measures are automatically executed following a gradual principle of "from minor to severe, escalating step by step." Specifically, when the safety monitoring layer first detects a slight deviation in steering output, only an output clamping operation is performed. If the same anomaly recurs within the next three monitoring cycles, the fault level is automatically escalated to a medium fault, and a task-level state rollback and restart are performed. If the anomaly persists for more than two monitoring cycles after the rollback and restart, it is further escalated to a severe fault, triggering a switch to a control algorithm degradation mode. Through this mechanism of continuously executing multi-level recovery measures, the system can adaptively increase the recovery intensity as the fault gradually worsens, maximizing the continuous availability of control functions and minimizing the impact of the fault on vehicle driving safety.
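The escalation sequence in this paragraph can be read as a small state machine driven once per monitoring cycle. The sketch below is one interpretation: the class and action names are hypothetical, and two behaviours are assumptions not stated in the text, namely that a clean cycle after a rollback returns the system to normal and that no further action is issued while waiting to see whether a rollback helped.

```python
from typing import Optional


class RecoveryEscalator:
    """Step-by-step escalation: clamp, then rollback/restart, then degrade."""

    def __init__(self) -> None:
        self.level = 0               # 0 normal, 1 clamped, 2 rolled back, 3 degraded
        self.cycles = 0              # cycles since the last recovery action
        self.bad_after_rollback = 0  # violating cycles seen after the rollback

    def on_cycle(self, violated: bool) -> Optional[str]:
        """Return the recovery action to run this cycle, or None."""
        if self.level == 3:
            return None              # remains degraded until maintenance
        if self.level == 0:
            if violated:
                self.level, self.cycles = 1, 0
                return "clamp"
            return None
        self.cycles += 1
        if self.level == 1:
            if violated:             # recurrence within the next three cycles
                self.level, self.cycles = 2, 0
                self.bad_after_rollback = 0
                return "rollback_restart"
            if self.cycles >= 3:     # three clean cycles: back to normal
                self.level = 0
            return None
        # level 2: watch whether the anomaly persists after the rollback
        if not violated:
            self.level = 0           # assumption: a clean cycle ends the episode
            return None
        self.bad_after_rollback += 1
        if self.bad_after_rollback > 2:  # persisted for more than two cycles
            self.level = 3
            return "degrade"
        return None
```

For example, the violation pattern True, False, True, True, True, True yields the actions clamp, none, rollback_restart, none, none, degrade.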

[0040] V. Memory Protection and Timing Isolation

This embodiment utilizes the MPU built into the ARM Cortex-R series processor to achieve spatiotemporal isolation between tasks. During system initialization, independent memory regions are allocated for tasks such as steering control, braking control, and sensor acquisition, and the read, write, and execute permissions of each region are configured through the MPU. For example, the steering control task can only access its private data and code segments and cannot access the memory space of the braking control task. When the steering control task attempts to write to an unauthorized address due to a pointer error, the MPU immediately triggers a memory access violation hardware interrupt. After the safety monitoring layer captures the interrupt, it identifies the faulty task, suspends and isolates the task, records the fault context, and then performs the corresponding recovery measures according to the severity of the fault.

[0041] Meanwhile, the security monitoring layer, combined with real-time scheduling strategies, sets a maximum continuous execution time threshold for each critical task. For the steering control task, this threshold is set to 2 ms. The safety monitoring layer monitors the actual execution time of tasks through hardware timer interrupts: if the continuous execution time of the steering control task exceeds 2 ms, it is determined to have violated the real-time agreement in the timing contract, and the task is actively terminated and restarted to prevent it from occupying processor resources for a long time and affecting the scheduling of other tasks.

[0042] VI. Operating Overhead Control

This embodiment only implements runtime contract monitoring for critical tasks at the ASIL D level, such as steering control and braking control; no monitoring constraints are imposed on non-critical infotainment tasks. The contract check logic uses integer comparisons, logical judgments, and counter operations, each with constant time complexity, O(1). Actual measurements show that the average overhead of each execution of the security monitoring layer is 0.35 milliseconds and the maximum overhead is 0.5 milliseconds; with a 10-millisecond monitoring period, this is about 3.5% of processor capacity on average and at most 5%, which is within the acceptable range of the system.

[0043] In an optional embodiment, an industrial robot multi-axis coordinated control system is taken as an example and, with reference to Figures 2 and 3, another implementation of the method of the present invention under a heterogeneous multiprocessor architecture is described. It differs from Embodiment 1 in the implementation of the security monitoring layer, the specific content of the runtime contract, and the triggering conditions of the recovery mechanism.

[0044] I. Coprocessor Implementation of the Security Monitoring Layer. Unlike Embodiment 1, where the safety monitoring layer is implemented as a high-priority task, in this embodiment the safety monitoring layer is deployed on a dedicated monitoring coprocessor, a TI Hercules RM57Lx with built-in lockstep cores, which is separate from the main ARM Cortex-A series processor. The main processor runs a real-time operating system and multiple robot joint control tasks (joints 1 to 6). The main processor and the monitoring coprocessor are connected via a dedicated SPI bus and share a dual-port RAM area. At the end of each control cycle, each control task on the main processor actively writes its key operating status, including joint angle setpoints, actual feedback values, torque output, and task execution timestamps, into the dual-port RAM. The monitoring coprocessor reads the data from this area at a fixed interval of 1 millisecond, which achieves physical-level monitoring independence and avoids common-cause failures between the monitoring logic and the business logic.
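As an illustration of the status exchange through the dual-port RAM, the following sketch packs one joint's status into a fixed-size binary record and lets the monitoring side detect a record that was not refreshed. The record layout, the field widths, the function names, and the 1000 µs staleness limit are all assumptions made for this example; the source does not specify a memory layout.

```python
import struct

# Hypothetical record layout (little-endian, no padding):
# joint id (uint8), setpoint, feedback, torque (float32 each), timestamp in us (uint32)
RECORD = struct.Struct("<B3fI")


def pack_status(joint_id, setpoint, feedback, torque, timestamp_us):
    """Main-processor side: serialize one joint's status at the end of a control cycle."""
    return RECORD.pack(joint_id, setpoint, feedback, torque, timestamp_us)


def unpack_status(raw):
    """Coprocessor side: decode a record read from the shared area."""
    joint_id, setpoint, feedback, torque, timestamp_us = RECORD.unpack(raw)
    return {
        "joint_id": joint_id,
        "setpoint": setpoint,
        "feedback": feedback,
        "torque": torque,
        "timestamp_us": timestamp_us,
    }


def is_stale(record, now_us, max_age_us=1000):
    """True if the control task did not refresh its record within max_age_us."""
    return now_us - record["timestamp_us"] > max_age_us
```

A fixed-size record keeps the coprocessor's read cost constant per joint. The timestamp lets the monitor tell a task that has stopped writing from one that keeps reporting the same values.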

[0045] II. Extended Definition of Runtime Contracts. In addition to the safety invariance conditions and timing logic constraints of Embodiment 1, the runtime contract of this embodiment also includes joint cooperation constraints and energy consumption constraints. The safety invariance conditions stipulate that the torque output of each joint must not exceed 120% of the motor's rated torque, and that the absolute value of the angular difference between adjacent joints, such as joint 1 and joint 2, must not exceed ..., to prevent mechanical collisions. The timing logic constraints stipulate that the total time from the main controller issuing the grasping command to the end effector completing the grasping action must not exceed 200 milliseconds, and that the joints must start moving sequentially in the order joint 1 → joint 2 → joint 3; simultaneous starts are not allowed.
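The contract items in this paragraph can be sketched as simple checks. The 120% torque factor and the 200 ms grasp deadline come from the text; the adjacent-joint angle limit is not given in the source, so it is passed in as a hypothetical parameter `max_adjacent_diff_deg`. All type and function names are illustrative.

```python
from dataclasses import dataclass
from typing import List, Sequence

RATED_TORQUE_FACTOR = 1.2   # from the text: at most 120% of rated torque
GRASP_DEADLINE_MS = 200.0   # from the text: command-to-grasp time limit


@dataclass(frozen=True)
class JointSample:
    torque: float
    rated_torque: float
    angle_deg: float


def check_invariants(joints: Sequence[JointSample],
                     max_adjacent_diff_deg: float) -> List[str]:
    """Return a list of violated safety invariants (empty if none)."""
    violations = []
    for i, j in enumerate(joints, start=1):
        if abs(j.torque) > RATED_TORQUE_FACTOR * j.rated_torque:
            violations.append(f"joint {i}: torque limit")
    for i in range(len(joints) - 1):
        diff = abs(joints[i].angle_deg - joints[i + 1].angle_deg)
        if diff > max_adjacent_diff_deg:
            violations.append(f"joints {i + 1}/{i + 2}: angle difference")
    return violations


def check_grasp_deadline(command_ms: float, done_ms: float) -> bool:
    """True if the grasp completed within the deadline."""
    return done_ms - command_ms <= GRASP_DEADLINE_MS


def check_start_order(start_times_ms: Sequence[float]) -> bool:
    """True if joints started strictly one after another (no simultaneous starts)."""
    return all(a < b for a, b in zip(start_times_ms, start_times_ms[1:]))
```

Each check is a fixed number of comparisons per joint, so the cost per monitoring cycle is bounded by the joint count.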

[0046] Two versions of the algorithm are designed for each joint control task. High-performance version: a trajectory planning algorithm based on model predictive control, whose inputs are the joint target pose, real-time state feedback, and motion constraints, and whose outputs are high-precision trajectory and torque commands. Simplified safety version: a conservative PID-based position control algorithm, whose inputs are the target safe position and position feedback, and whose output is a constrained control quantity. It has three built-in mechanical safety constraints (torque, angle, and joint difference), and the output is automatically clamped when any constraint is activated.

[0047] This simplified safety version is formally verified to ensure that the output does not violate mechanical safety limits under any input; when the safety monitoring layer triggers a degradation, the task switches to this simplified algorithm without interruption.
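A minimal sketch of what such a simplified safety version could look like is given below: a PID position controller whose target is clamped to the angle range and to a band around the neighboring joint, and whose torque output is clamped to a limit. The class name, gains, and limits are hypothetical, and the sketch is not formally verified; it only illustrates the clamping structure.

```python
from dataclasses import dataclass


def clamp(value: float, low: float, high: float) -> float:
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


@dataclass
class SafePid:
    kp: float
    ki: float
    kd: float
    torque_limit: float        # constraint 1: |output| <= torque_limit
    angle_min: float           # constraint 2: target angle range
    angle_max: float
    max_neighbor_diff: float   # constraint 3: distance to the adjacent joint
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, target: float, measured: float,
             neighbor_angle: float, dt: float) -> float:
        # Clamp the target before computing the error, so the controller
        # never tries to reach a pose outside the mechanical limits.
        target = clamp(target, self.angle_min, self.angle_max)
        target = clamp(target,
                       neighbor_angle - self.max_neighbor_diff,
                       neighbor_angle + self.max_neighbor_diff)
        error = target - measured
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        raw = self.kp * error + self.ki * self.integral + self.kd * derivative
        # Final clamp: the returned torque command is always within the limit.
        return clamp(raw, -self.torque_limit, self.torque_limit)
```

Because the last operation is a clamp, the bound on the returned torque holds for any input, which is the kind of property that is straightforward to establish for a controller of this shape.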

[0048] III. Differentiated Implementation of Security Snapshots and Rollbacks. Unlike the circular buffer in Embodiment 1, which stores the most recent N snapshots, this embodiment uses a ping-pong buffer structure that stores two safety snapshots. In addition to the context information described in Embodiment 1, each snapshot also stores the robot's current end effector pose matrix and grasping state. The saving of safety snapshots is triggered by the monitoring coprocessor: when the monitoring coprocessor verifies that the current task state meets the contract requirements, it sends a snapshot-ready signal to the main processor. The main processor saves the current context to the inactive buffer and then switches the buffer pointer. When a fault occurs, the monitoring coprocessor directly instructs the main processor to restore from the most recently verified safety snapshot. This design avoids the overhead of circular buffer index management and suits industrial real-time control scenarios with stricter determinism requirements.
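The ping-pong scheme described above can be sketched as follows. The class and method names are hypothetical, and a Python dictionary stands in for the saved task context; the point is the ordering: write to the inactive slot first, then switch the pointer.

```python
from typing import List, Optional


class PingPongSnapshots:
    """Two-slot snapshot store; the active slot always holds the most
    recently verified snapshot."""

    def __init__(self) -> None:
        self._slots: List[Optional[dict]] = [None, None]
        self._active = 0

    def save_verified(self, context: dict) -> None:
        """Called after the monitor signals that the current state passed
        contract verification."""
        inactive = 1 - self._active
        self._slots[inactive] = dict(context)  # copy into the inactive slot
        self._active = inactive                # switch only after the write completes

    def restore(self) -> Optional[dict]:
        """Return a copy of the most recently verified snapshot, or None."""
        snap = self._slots[self._active]
        return dict(snap) if snap is not None else None
```

Because the pointer is switched only after the copy completes, a fault that interrupts a save leaves the previously verified snapshot intact in the other slot.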

[0049] IV. Special Handling of Fault Recovery. This embodiment adds a special recovery process for a potential "jamming" fault in robots, in which the deviation between the actual joint position and the commanded position increases continuously. When the monitoring coprocessor detects that the joint position deviation exceeds the safety threshold and persists for three control cycles, it first executes output clamping to force the torque command to zero, and then attempts to roll back to the last safe state before the fault. If the jamming persists after the rollback, it is judged to be a mechanical fault, which triggers the highest level of recovery: switching to the simplified safety mode, simultaneously sending an emergency stop request to the external safety controller, and recording the fault context for subsequent maintenance analysis. In this recovery process, the output clamping and rollback operations are controlled directly by the monitoring coprocessor without intervention from the main processor, which further improves the real-time performance of the recovery.
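The jamming recovery flow can be sketched as a small state machine evaluated once per control cycle. The three-cycle persistence comes from the text. Treating "the jamming persists after the rollback" as another three consecutive over-threshold cycles is an assumption of this sketch, as are all names.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"
    CLAMP_AND_ROLLBACK = "clamp_and_rollback"
    SAFE_MODE_AND_ESTOP = "safe_mode_and_estop"


class JamMonitor:
    PERSIST_CYCLES = 3  # from the text: deviation must persist for three cycles

    def __init__(self, deviation_threshold: float) -> None:
        self.threshold = deviation_threshold
        self.over_count = 0       # consecutive cycles above the threshold
        self.rolled_back = False  # True once a rollback has been attempted

    def update(self, commanded: float, actual: float) -> Action:
        """Evaluate one control cycle and return the recovery action to take."""
        if abs(commanded - actual) <= self.threshold:
            # Deviation is back within limits: reset the escalation state.
            self.over_count = 0
            self.rolled_back = False
            return Action.NONE
        self.over_count += 1
        if self.over_count < self.PERSIST_CYCLES:
            return Action.NONE
        self.over_count = 0
        if not self.rolled_back:
            self.rolled_back = True
            return Action.CLAMP_AND_ROLLBACK   # zero the torque, then roll back
        return Action.SAFE_MODE_AND_ESTOP      # jam persisted after rollback
```

With a deviation that never clears, the monitor returns `CLAMP_AND_ROLLBACK` on the third cycle and `SAFE_MODE_AND_ESTOP` on the sixth. A single in-range cycle resets the escalation.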

[0050] V. MPU Isolation and Multi-core Timing Monitoring. In this embodiment, the main processor is a quad-core Cortex-A72, with different control tasks running on each core. During system initialization, ARM TrustZone and the MPU cooperate to create an independent execution environment and memory protection domain for each task on each core. Tasks on different cores cannot directly access each other's memory areas; they can exchange data only through a shared buffer authorized by the security monitoring layer. The monitoring coprocessor is also responsible for monitoring the actual execution time of tasks on each core: when the continuous execution time of a task on a core exceeds a preset threshold (for example, 1.5 milliseconds for joint control tasks), the monitoring coprocessor sends a non-maskable interrupt directly to that core, forcing the core to pause its current task and execute a recovery process, thereby achieving cross-core time isolation.

[0051] VI. Implementation Results. With the solution of this embodiment, the industrial robot successfully detected and recovered from 12 transient control anomalies caused by electromagnetic interference during a continuous 72-hour anti-interference test, without any mechanical collision or loss-of-control event. Introducing the safety monitoring layer raised the system's functional safety level from SIL 2 to SIL 3, while the monitoring overhead remained small: the main processor load increased by less than 3% and the coprocessor load by less than 15%, which meets the real-time control requirements of industrial sites.

[0052] The above two embodiments fully demonstrate the implementation of the technical solution of the present invention in different application scenarios. Those skilled in the art should understand that, without departing from the concept of the present invention, equivalent substitutions or modifications can be made to the parameters and implementation details in the above embodiments, and all such substitutions fall within the protection scope of the present invention. Those skilled in the art will recognize that the modules and method steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this invention.

[0053] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the specific working processes of the devices, equipment, and modules described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be repeated here.

[0054] In addition, the functional modules in the various embodiments of the present invention can be integrated into one processing module, or each module can exist physically separately, or two or more modules can be integrated into one module.

[0055] If the aforementioned functions are implemented as software functional modules and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this invention, or the part that contributes to the prior art, or a portion of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this invention. The aforementioned storage medium includes various media capable of storing program instructions, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0056] Furthermore, it should be noted that the combination of the various technical features in this case is not limited to the combination methods described in the claims of this case or the combination methods described in the specific embodiments. All technical features described in this case can be freely combined or combined in any way, unless they contradict each other.

[0057] It should be noted that the above examples are merely specific embodiments of the present invention, and the present invention is obviously not limited to the above embodiments, with many similar variations. All modifications that can be directly derived or conceived by those skilled in the art from the content disclosed in this invention should fall within the protection scope of this invention.

[0058] The above are merely preferred embodiments of the present invention and are not intended to limit the scope of protection of the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.

Claims

1. A method for runtime verification and security recovery of control logic in embedded real-time systems, characterized in that it includes the following steps:
A security monitoring layer is built outside the operating system kernel. The security monitoring layer is implemented as a high-priority task or a dedicated monitoring coprocessor independent of the operating system kernel. It obtains the running status data of key control tasks in the system through inter-task communication interfaces, direct access to task control block data structures, dedicated bus and dual-port RAM sharing, and performs independent and non-intrusive real-time monitoring of key control tasks;
A machine-readable runtime contract is defined for each critical control task. The runtime contract contains predefined safety invariant conditions and timing logic constraints that the task must satisfy;
The safety monitoring layer collects the operational status data and output signals of key control tasks according to a predetermined monitoring cycle, and compares and verifies the collected data in real time against the safety invariance conditions and timing logic constraints specified in the runtime contract to determine whether the actual operating behavior of the task meets the requirements of the runtime contract;
When a critical control task is detected to violate the runtime contract, the severity of the fault is assessed and classified according to the pre-defined fault classification rules. Based on the assessment results, the corresponding level of the hierarchical recovery mechanism is activated. The hierarchical recovery mechanism includes, in order of fault severity from mild to severe, output clamping, task-level state rollback and restart based on security snapshots, and switching of the control algorithm's degradation mode;
By utilizing the built-in memory protection unit of the embedded processor, independent memory protection domains with spatiotemporal isolation are created for each critical control task. Hardware-enforced storage access permission control ensures that storage access anomalies in any task do not propagate and affect the data integrity of other tasks. Combined with real-time scheduling strategies, the execution time of each task is monitored and managed, achieving dual protection of tasks in both the spatial and temporal dimensions.

2. The method for runtime verification and security recovery of control logic for embedded real-time systems according to claim 1, characterized in that the runtime contract is described using a formal specification language. The security invariance condition specifies the allowed range of values for control variables and the state transition rules. The timing logic constraints specify the time interval requirements that must be met between critical events and the event occurrence order constraints.

3. The method for runtime verification and security recovery of control logic for embedded real-time systems according to claim 2, characterized in that the formal description of the security invariance condition includes: for any time t, the control output variable Y(t) satisfies the constraint condition $Y_{\min} \le Y(t) \le Y_{\max}$, in which $Y_{\min}$ and $Y_{\max}$ represent the preset lower and upper limits of the output, respectively; the timing logic constraints include: when a fault indication event e occurs, the system must complete the predetermined safety response actions within a specified time limit.
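For illustration, the two contract forms in this claim can be written in a temporal-logic style. The notation is an assumption of this sketch: the symbols $Y_{\min}$, $Y_{\max}$, $T_{\max}$ (the response time limit) and $r$ (the event "safety response completed") are names chosen here, and the bounded "eventually" operator is borrowed from metric temporal logic rather than taken from the source.

```latex
% Safety invariant: the output stays within its preset bounds at all times.
\forall t:\quad Y_{\min} \le Y(t) \le Y_{\max}

% Bounded response: whenever fault event e occurs, the safety response r
% completes within at most T_max time units.
\Box \bigl( e \rightarrow \Diamond_{\le T_{\max}}\, r \bigr)
```

The first formula is checked by a pair of threshold comparisons per sample. The second is checked by recording the timestamp of e and comparing the elapsed time when r is observed.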

4. The method for runtime verification and security recovery of control logic for embedded real-time systems according to claim 1, characterized in that, The output clamping operation includes: when the safety monitoring layer detects that the output signal Y(t) of the critical control task exceeds the safety threshold range predefined by the runtime contract, it immediately performs boundary limiting processing on the output signal to forcibly constrain the output signal within the safety range; the output clamping operation is implemented through a hardware or software bypass mechanism inserted between the control task and the actuator interface, and the safety monitoring layer directly intercepts and modifies the control instructions sent to the actuator.

5. The method for runtime verification and security recovery of control logic for embedded real-time systems according to claim 1, characterized in that, The task-level state rollback and restart based on security snapshots includes: during the normal operation phase of a critical control task, the security monitoring layer or the task itself periodically saves the key state data of the task to form a security snapshot according to a preset snapshot period. The security snapshot includes the task's program counter value, stack pointer, register context, key input / output variable values, and intermediate calculation state; a circular buffer or ping-pong buffer structure is used to store the state snapshot sequence within the most recent N snapshot periods; when a serious operational anomaly is detected in the task and the fault level reaches the threshold requiring rollback and recovery, the most recent security snapshot that has passed contract verification is selected from the corresponding buffer, the task's running context is completely restored to the state corresponding to the security snapshot, and then the task is rescheduled and restarted.
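The circular-buffer variant of snapshot selection in this claim can be sketched as follows. The class name and the dictionary used as task state are hypothetical. The sketch keeps the last N snapshots together with their verification result and returns the newest one that passed.

```python
from collections import deque
from typing import Optional


class SnapshotRing:
    """Keeps the most recent `capacity` snapshots and their verification result."""

    def __init__(self, capacity: int) -> None:
        # A deque with maxlen silently discards the oldest entry when full.
        self._ring = deque(maxlen=capacity)

    def save(self, state: dict, contract_ok: bool) -> None:
        """Store a copy of the task state and whether it passed contract verification."""
        self._ring.append((dict(state), contract_ok))

    def latest_verified(self) -> Optional[dict]:
        """Return the newest snapshot that passed verification, or None."""
        for state, ok in reversed(self._ring):
            if ok:
                return dict(state)
        return None
```

If every retained snapshot failed verification, `latest_verified` returns `None`, and a caller would have to escalate to the next recovery level instead of rolling back.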

6. The method for runtime verification and security recovery of control logic for embedded real-time systems according to claim 1, characterized in that, The degradation mode switching of the control algorithm includes: pre-designing a simplified safety mode algorithm as a backup control strategy for each critical control task. Compared with the original control algorithm, the simplified safety mode algorithm has lower computational complexity, more conservative control parameters, and more comprehensive formal verification coverage. When the security monitoring layer detects that a critical control task continues to violate the runtime contract after performing a rollback and restart, or when the nature of the contract violation is assessed as fatal, it sends a degradation command to the task through a reserved algorithm switching interface, triggering an online hot switch of the control algorithm. After receiving the degradation command, the control task saves its current input and output state, switches the control strategy from the original algorithm module to the simplified safety mode algorithm module, and continues to run in degradation mode.

7. The method for runtime verification and security recovery of control logic for embedded real-time systems according to claim 1, characterized in that, Creating independent memory protection domains for each task using the memory protection unit includes: during system initialization, pre-allocating an independent memory region for each critical control task and configuring the access permission attributes of this region through the memory protection unit; when any task attempts to access an unauthorized memory address during execution, the memory protection unit triggers a memory access violation hardware interrupt; after the security monitoring layer captures the interrupt signal, it identifies the task that committed the violation, isolates and suspends the task, and records the fault information; and, in conjunction with a real-time scheduling strategy, setting a maximum continuous execution time threshold for each critical task. The security monitoring layer monitors the actual execution time of each task through timer interrupts. When the continuous execution time of a task exceeds its maximum allowed threshold, it determines that the task has violated the real-time agreement in the timing contract, and actively stops and restarts the task.

8. The method for runtime verification and security recovery of control logic for embedded real-time systems according to claim 1, characterized in that the fault classification rules include: classifying faults into three levels (minor, moderate, and severe) based on the type and severity of the contract violation. A minor fault occurs when the parameter exceeds the safe range by less than 5% of a preset deviation threshold and this occurs fewer than 3 times in the last 10 monitoring cycles. A moderate fault occurs when parameter out-of-bounds occurrences or timing violations last for more than 20 milliseconds in 5 consecutive monitoring cycles. A severe fault occurs when the contract is repeatedly violated within 3 monitoring cycles after a task rollback and restart, or when the fault is fatal. For minor faults, an output clamping operation is performed; for moderate faults, a task state rollback and restart are performed; and for severe faults, a control algorithm degradation switch is performed. In accordance with the fault escalation principle, multiple levels of recovery measures are executed in succession within a short period of time until the system returns to a safe operating state.
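One possible reading of these classification rules is sketched below. Several points are interpretations rather than statements of the source: the minor-fault margin is read as a fraction (0.05) of the preset deviation threshold, the moderate condition is read as "five consecutive violating cycles or a violation lasting more than 20 ms", and violations that match no rule are conservatively treated as moderate. All names are hypothetical.

```python
from enum import IntEnum


class Level(IntEnum):
    NONE = 0
    MINOR = 1
    MODERATE = 2
    SEVERE = 3


# Recovery measure per level, as listed in the claim.
RECOVERY = {
    Level.MINOR: "output_clamp",
    Level.MODERATE: "rollback_restart",
    Level.SEVERE: "degrade_algorithm",
}


def classify(overshoot_ratio: float, violations_last_10: int,
             consecutive_violations: int, violation_duration_ms: float,
             violated_after_rollback: bool = False,
             fatal: bool = False) -> Level:
    """Map monitoring counters to a fault level (interpretive sketch)."""
    if fatal or violated_after_rollback:
        return Level.SEVERE
    if consecutive_violations >= 5 or violation_duration_ms > 20.0:
        return Level.MODERATE
    if violations_last_10 == 0:
        return Level.NONE
    if overshoot_ratio < 0.05 and violations_last_10 < 3:
        return Level.MINOR
    # Assumption: a violation that fits no rule is handled conservatively.
    return Level.MODERATE
```

Since `Level` is an `IntEnum`, escalation can be expressed as moving to the next higher value when a recovery measure does not restore a safe state.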

9. The method for runtime verification and security recovery of control logic for embedded real-time systems according to claim 2, characterized in that, The verification of the timing logic constraints includes: maintaining an event timestamp record table in the security monitoring layer to record the occurrence time of each critical event; when a new critical event is detected, calculating the actual time interval between the event and its related preceding events. The time interval is compared and verified with the timing constraints specified in the runtime contract. If the actual time interval exceeds the allowed time window range, it is determined that the timing constraint is violated. For timing constraints that require verification of the order of events, a finite state automaton is used to model the expected event sequence. The security monitoring layer drives the state automaton to perform state transitions based on the actual observed event sequence. When the state machine enters the rejection state, it is determined that the event sequence constraint is violated.
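The two verification mechanisms in this claim, the event timestamp table and the finite state automaton for event order, can be sketched together. The class name, the event names used in the usage note, and the representation of interval constraints as a dictionary keyed by event pairs are assumptions of this example.

```python
from typing import Dict, List, Sequence, Tuple


class TimingMonitor:
    def __init__(self, max_interval_ms: Dict[Tuple[str, str], float],
                 expected_order: Sequence[str]) -> None:
        self.max_interval_ms = dict(max_interval_ms)  # (earlier, later) -> limit
        self.expected_order = list(expected_order)    # linear automaton over events
        self.timestamps: Dict[str, float] = {}        # event timestamp record table
        self.state = 0          # index of the next expected ordered event
        self.rejected = False   # rejection state; absorbing once entered

    def observe(self, event: str, t_ms: float) -> List[str]:
        """Record one event and return the constraints it violates."""
        violations = []
        # Interval check against every constraint that ends with this event.
        for (prev, cur), limit in self.max_interval_ms.items():
            if cur == event and prev in self.timestamps:
                if t_ms - self.timestamps[prev] > limit:
                    violations.append(f"interval {prev}->{cur}")
        self.timestamps[event] = t_ms
        # Order check: advance the automaton or enter the rejection state.
        if event in self.expected_order:
            in_range = self.state < len(self.expected_order)
            if (not self.rejected and in_range
                    and event == self.expected_order[self.state]):
                self.state += 1
            else:
                self.rejected = True
                violations.append("order")
        return violations
```

For example, a monitor built with `{("grasp_cmd", "grasp_done"): 200.0}` and the order `["j1_start", "j2_start", "j3_start"]` reports an interval violation if `grasp_done` arrives more than 200 ms after `grasp_cmd`. This sketch detects a missed deadline only when the later event is observed; detecting a response that never arrives would need an additional timeout check on each monitoring cycle.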

10. The method for runtime verification and security recovery of control logic for embedded real-time systems according to claim 1, characterized in that, The method also includes optimized control of runtime overhead: runtime contract monitoring is implemented only for control tasks identified as critical in the system, and no monitoring constraints are imposed on non-critical or auxiliary tasks; the runtime contract inspection logic is simplified and optimized, limiting the verification operation to basic operations with constant time complexity such as threshold comparison, counter check, and logical judgment, and worst-case analysis is performed on the execution time of the security monitoring layer itself to ensure that the additional overhead it introduces is within the acceptable range of the system and that its impact on real-time scheduling stability is predictable and controllable.