A communication anti-interference processing method and device for a space radiation environment, a product and a medium
By combining a multi-mode redundancy structure with a delayed rewrite mechanism gated by a protocol timing stage tracking module, the method addresses the communication instability caused by single-event upsets in the space orbit environment, preserving the bus synchronization accuracy and the real-time efficiency of the communication system.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SHANGHAI JINGJI COMM TECH CO LTD
- Filing Date
- 2026-04-30
- Publication Date
- 2026-06-19
AI Technical Summary
In the space orbit environment, single-event upsets caused by high-energy particles lead to disorder in the internal state machine logic of the CAN protocol controller. The hardware clock-driven forced rewrite mechanism of existing technology is out of sync with the timing of the upper-layer protocol, resulting in a decrease in the real-time efficiency and security of the communication system.
The communication node device adopts a multi-mode redundancy structure. By counting the cumulative number of error correction events, it switches to delayed rewriting mode when the threshold is exceeded. It also uses the protocol timing stage tracking module to generate timing stage identifiers, suspends the rewriting operation of abnormal copies, and repairs them after the bus is idle or after a frame interval, ensuring the synchronization accuracy of the bus bits.
This avoids register jump jitter during bus bit sampling or arbitration during rewriting operations under high-frequency irradiation, prevents global bus communication paralysis caused by logical errors, and achieves stability and real-time performance of communication node devices in strong irradiation environments.
Figure CN122248002A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of anti-interference processing, and in particular to a communication anti-interference processing method, device, product and medium for space irradiation environments.
Background Technology
[0002] Controller Area Network (CAN) buses are widely used for interconnecting communication between devices inside spacecraft due to their high reliability. However, in the space orbit environment, when high-energy particles (such as protons and heavy ions) penetrate the spacecraft's shielding layer and strike the internal communication control chip, they can induce single-event upsets (SEUs) or multiple-bit upsets (MBUs), causing the internal state machine logic of the communication controller to become disordered or the cached data to change abruptly.
[0003] Currently, to address the logic corruption caused by single-event upsets, existing technologies typically employ triple modular redundancy (TMR) based on hardware state machines and forced overwrite techniques. This technology provides three physical backups for key logic modules within the CAN protocol controller, such as the arbitration state machine and bit timing state machine. In each hardware clock cycle, a voting circuit performs a majority vote on the three outputs. If an inconsistency is detected in any output, the erroneous copy is forcibly overwritten using the voting result on the next rising clock edge. Since this repair action is entirely driven by the underlying hardware clock, its timing is not constrained by the current timing state of the upper-layer CAN communication protocol. The repair operation is performed in the same manner at different timing nodes, such as during bus idle periods and message arbitration segments.
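The vote-then-overwrite behaviour described in this paragraph can be sketched in software as follows. This is a minimal illustrative model, not the hardware design: the function names `majority_vote` and `immediate_rewrite`, and the choice to represent each replica's state as an integer word, are assumptions made here for illustration.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority over three replica state words."""
    return (a & b) | (a & c) | (b & c)


def immediate_rewrite(replicas: list) -> tuple:
    """Model one clock cycle of vote followed by forced overwrite.

    Returns the voted value and the indices of the replicas that were
    overwritten. The overwrite ignores the protocol stage entirely, which
    is the property the background section criticises.
    """
    voted = majority_vote(*replicas)
    repaired = [i for i, r in enumerate(replicas) if r != voted]
    for i in repaired:
        replicas[i] = voted  # stands in for the overwrite on the next clock edge
    return voted, repaired


states = [0b1010, 0b1010, 0b1110]  # replica 2 has one flipped bit
voted, repaired = immediate_rewrite(states)
```

In this model the repair always happens in the same cycle as detection, regardless of whether the bus is idle or in arbitration.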
[0004] In dynamic space radiation environments (such as spacecraft traversing radiation belts or encountering solar flares), the increased frequency of particle impacts causes the underlying hardware to frequently trigger the aforementioned forced rewrite operation. Because this repair mechanism is disconnected from the timing of the upper-layer protocol, when a forced rewrite occurs during CAN protocol bit sampling or bus contention arbitration, the latency and jitter introduced by the transitions in the underlying register states can disrupt the synchronization accuracy of the CAN node with the bus bit time, leading to misjudgments of the bus level state by the node. This, in turn, triggers error frame broadcasting and global bus retransmission, reducing the real-time efficiency and operational security of the communication system.
Summary of the Invention
[0005] This application provides a communication anti-interference processing method, device, product, and medium for space irradiation environments, which can improve the real-time efficiency and operational safety of communication systems.
[0006] Firstly, this application provides a communication anti-interference processing method for space irradiation environments, applied to communication node equipment. The internal state machine of the CAN protocol controller in the communication node equipment adopts a multi-mode redundancy structure. The method includes: counting the cumulative number of error correction events resulting in voting inconsistencies in the multi-mode redundancy structure within a preset time window; when the cumulative number of error correction events exceeds a preset threshold, switching the rewrite repair triggering method of the multi-mode redundancy structure from immediate forced rewrite driven by a hardware clock to delayed rewrite based on protocol timing stage determination; continuously acquiring the level transitions of the bus physical layer received signals through a protocol timing stage tracking module deployed independently of the CAN protocol controller, and generating the protocol timing stage identifier of the current CAN communication protocol based on the level transitions; the protocol timing stage tracking module is a finite state machine that maintains the transition relationship between each communication stage of the CAN protocol; in the delayed rewrite mode, when an inconsistency is detected between the output replicas of the multi-mode redundancy structure, reading the protocol timing stage identifier currently output by the protocol timing stage tracking module; when there is a normal replica with consistent voting, and the protocol timing stage identifier indicates that a frame transmission is currently in progress, suspending the rewrite operation of the abnormal replica with voting errors; and after the protocol timing stage identifier jumps to the frame interval or bus idle stage, rewriting and repairing the abnormal replica using the status data of the normal replica.
[0007] By adopting the above technical solution, the communication node device counts the cumulative number of error correction events, quantifying the intensity of the current radiation environment impact. When the cumulative number exceeds the limit, it switches to a delayed rewrite mode and uses an independently deployed protocol timing phase tracking module to generate the current timing identifier. When inconsistency is detected in the replica output of the multi-mode redundancy structure, the communication node device reads this identifier. If a normal replica exists and is in the frame transmission process, the rewrite operation on the abnormal replica is suspended, waiting until the frame interval or a bus idle phase before using the status data of the normal replica for repair. This avoids register jump jitter introduced by the rewrite operation at sensitive timing nodes, prevents damage to bus bit synchronization accuracy, and improves communication stability under high-frequency radiation environments.
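The suspend-then-repair behaviour of the delayed rewrite mode can be sketched as below. This is a simplified model under stated assumptions: the `Stage` enum collapses all in-frame stages into one value, `DelayedRewriter` and its return strings are hypothetical names, three replicas are assumed, and the case where no majority exists is deliberately left out.

```python
from enum import Enum, auto


class Stage(Enum):
    BUS_IDLE = auto()
    FRAME_IN_PROGRESS = auto()  # all in-frame stages collapsed for brevity
    FRAME_INTERVAL = auto()


SAFE_STAGES = {Stage.BUS_IDLE, Stage.FRAME_INTERVAL}


class DelayedRewriter:
    """Holds back the repair of an abnormal replica until a safe stage."""

    def __init__(self, replicas):
        self.replicas = list(replicas)  # three integer state words (assumption)
        self.pending = None             # index of the replica awaiting repair

    def step(self, stage):
        """Run once per cycle with the tracker's current stage identifier."""
        a, b, c = self.replicas
        voted = (a & b) | (a & c) | (b & c)
        bad = [i for i, r in enumerate(self.replicas) if r != voted]
        if len(bad) == 1:
            self.pending = bad[0]
        if self.pending is not None and stage in SAFE_STAGES:
            self.replicas[self.pending] = voted  # repair from the normal replicas
            self.pending = None
            return "repaired"
        return "suspended" if self.pending is not None else "ok"
```

For example, a replica that diverges while `Stage.FRAME_IN_PROGRESS` is reported stays untouched until the stage identifier changes to `Stage.FRAME_INTERVAL` or `Stage.BUS_IDLE`.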
[0008] In conjunction with some embodiments of the first aspect, in some embodiments, after reading the protocol timing stage identifier currently output by the protocol timing stage tracking module, the method further includes: when a replica concurrency error occurs in the multi-mode redundancy structure, causing voting inconsistency, masking the bus output of the CAN protocol controller; writing the protocol timing stage identifier currently maintained by the protocol timing stage tracking module into the status register of all replicas in the multi-mode redundancy structure, so that the CAN protocol controller is restored to a protocol stage consistent with the current bus timing.
[0009] By adopting the above technical solution, when a replication concurrency error causes voting to fail, the communication node device immediately blocks the bus output of the CAN protocol controller to prevent uncontrollable erroneous logic from sending interference signals to the bus. Subsequently, the communication node device writes the timing identifier maintained by the independently deployed protocol timing stage tracking module into the status register of all replicas, forcing the internal state machine to align with the actual bus timing stage. In the extreme case where multiple flips cause the redundant voting mechanism to fail, this prevents the logical disorder of the node from causing a global communication failure on the bus, and achieves rapid resynchronization between the node's internal logic and the actual communication state of the bus.
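A rough sketch of this fallback path is given below. It assumes, purely for illustration, that replica states and the tracker's stage identifier share one integer encoding; the function name `recover_from_concurrency_error` is hypothetical.

```python
def recover_from_concurrency_error(replicas, tracker_stage_id):
    """Return (new_replicas, output_masked).

    If at least two replicas agree, voting still works and nothing is
    changed. Otherwise the bus output is masked and every replica is
    loaded with the stage identifier held by the independent tracker.
    """
    has_majority = any(replicas.count(r) >= 2 for r in replicas)
    if has_majority:
        return list(replicas), False
    return [tracker_stage_id] * len(replicas), True
```

With replicas `[1, 2, 4]` and a tracker value of `6`, the sketch masks the output and returns `[6, 6, 6]`.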
[0010] In conjunction with some embodiments of the first aspect, in some embodiments, before writing the protocol timing phase identifier currently maintained by the protocol timing phase tracking module into the state registers of all copies in the multi-mode redundancy structure, the method further includes: acquiring the level feature sequence of the bus physical layer received signal within a preset observation period; comparing the level feature sequence with the expected signal feature corresponding to the protocol timing phase identifier; performing the write operation when the level feature sequence matches the expected signal feature; and, when the level feature sequence does not match the expected signal feature, maintaining the bus output masking state of the CAN protocol controller and continuously monitoring the bus physical layer signal until a bus idle state is detected, and then synchronously resetting the internal state machines of the protocol timing phase tracking module and the CAN protocol controller to the initial state corresponding to the bus idle state.
[0011] By adopting the above technical solution, before performing a state write operation, the communication node device first collects the level characteristic sequence of the bus physical layer and compares it with the expected signal characteristics corresponding to the protocol timing stage identifier. When the characteristics match, the state write is performed, ensuring the reliability of the recovery reference. When the characteristics do not match, the communication node device maintains the bus output masked state, continuously monitors until the bus is idle, and then synchronously resets the tracking module and the internal state machine. This verifies the accuracy of the timing identifier currently maintained by the tracking module, avoids incorrect recovery based on erroneous identifiers, and ensures the safety and accuracy of the state machine synchronous reset operation.
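The check-before-write decision might look like the sketch below. The application does not specify what the expected signal features are, so the two predicates in `EXPECTED_FEATURE` are hypothetical stand-ins, as are the stage names and the returned action strings.

```python
RECESSIVE, DOMINANT = 1, 0

# Hypothetical expected features: one predicate per stage identifier,
# evaluated over the levels sampled during the observation period.
EXPECTED_FEATURE = {
    "BUS_IDLE": lambda levels: all(v == RECESSIVE for v in levels),
    "FRAME_IN_PROGRESS": lambda levels: DOMINANT in levels,
}


def verify_then_recover(stage_id, observed_levels):
    """Decide whether the tracker's identifier may be written to the replicas."""
    if EXPECTED_FEATURE[stage_id](observed_levels):
        return "write_stage_id_to_all_replicas"
    # Mismatch: the identifier itself may be corrupted, so stay masked.
    return "keep_masked_until_idle_then_reset"
```

A tracker that claims `"BUS_IDLE"` while dominant levels are observed on the bus is therefore not trusted, and the node waits for a real idle period instead.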
[0012] In conjunction with some embodiments of the first aspect, in some embodiments, after synchronously resetting the protocol timing phase tracking module and the internal state machine of the CAN protocol controller to the initial state corresponding to bus idle, the method further includes: in the initial state, when the protocol timing phase tracking module detects a level transition corresponding to the frame start flag in the bus physical layer received signal, synchronously driving the protocol timing phase tracking module and the internal state machine of the CAN protocol controller to the frame receiving state; in the frame receiving state, tracking the complete protocol field sequence of the currently received frame, and when the frame end flag is successfully received and no error detection condition is triggered, releasing the bus output masking state of the CAN protocol controller.
[0013] By adopting the above technical solution, after the communication node device synchronously resets to its initial state, it detects the frame start transition of the bus physical layer signal to synchronize the protocol timing stage tracking module and the internal state machine to the frame receiving state. In the receiving state, the communication node device tracks the complete protocol field sequence of the current frame bit by bit, and removes the bus mask when the frame end flag is successfully received and no error condition is triggered. Utilizing the newly arrived complete data frame on the bus as a natural synchronization alignment benchmark ensures that the internal state machine achieves completely accurate alignment with the bus timing at the frame level, avoiding the risk of communication conflicts caused by nodes blindly resuming transmission when not fully synchronized.
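The resynchronization sequence after a synchronous reset can be sketched as a small state machine. The event strings (`"SOF"`, `"FIELD"`, `"ERROR"`, `"EOF"`) are a hypothetical encoding of what the tracker observes, and returning to the masked idle state on an error is an assumption, since the text does not state what happens in that case.

```python
def resync(events):
    """Consume bus events after the reset; return True once the mask is released."""
    state = "IDLE_MASKED"
    for ev in events:
        if state == "IDLE_MASKED":
            if ev == "SOF":
                state = "RECEIVING"  # tracker and controller move together
        elif state == "RECEIVING":
            if ev == "ERROR":
                state = "IDLE_MASKED"  # assumed: discard and wait for the next frame
            elif ev == "EOF":
                return True  # one clean frame tracked end to end
    return False
```

The mask is only released after one complete, error-free frame has been followed from start flag to end flag.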
[0014] In conjunction with some embodiments of the first aspect, in some embodiments, after suspending the rewrite operation of the abnormal copy in which the voting error occurred, the method further includes: masking the abnormal copy from the voting output logic of the multi-mode redundancy structure, so that the output of the CAN protocol controller is determined by the remaining normal copies through mutual comparison; during the masking period, if the outputs of the remaining normal copies are inconsistent, it is determined that a copy concurrency error has occurred.
[0015] By adopting the above technical solution, after suspending the write operation of the abnormal copy, the communication node device shields it from the voting logic of the multi-mode redundancy structure, relying on the remaining normal copies to determine and maintain the bus output of the CAN controller through mutual comparison. Simultaneously, the communication node device continuously monitors the remaining normal copies; if inconsistencies are detected in the outputs of the remaining copies during the shielding period, a concurrent error is determined to have occurred. This prevents the abnormal copy awaiting repair from interfering with subsequent voting results during the waiting period and promptly captures new single-event upset events during degraded operation, ensuring the correctness of the node's external communication data during the delayed repair waiting period.
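Degraded operation with one copy masked out can be sketched as follows. The function name `degraded_output` and the use of `None` as the output when no agreement exists are illustrative assumptions.

```python
def degraded_output(replicas, masked_index):
    """Output from the remaining copies while one copy awaits repair.

    Returns (output, concurrency_error). With three copies this reduces
    to a two-way comparison, which can detect a new upset but cannot
    tell which of the two remaining copies is wrong.
    """
    remaining = [r for i, r in enumerate(replicas) if i != masked_index]
    if all(r == remaining[0] for r in remaining):
        return remaining[0], False
    return None, True
```

For instance, `degraded_output([5, 4, 7], 2)` reports a concurrency error because the two unmasked copies disagree.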
[0016] In conjunction with some embodiments of the first aspect, in some embodiments, after performing write-back repair on the abnormal copy using the state data of the normal copy, the method further includes: placing the repaired abnormal copy into a verification observation state, allowing it to run independently but not participate in the voting output of the multi-mode redundancy structure; during the verification observation period, comparing the output of the abnormal copy with the output of the normal copy participating in the voting cycle by cycle; when no output inconsistency is detected during the verification observation period, restoring the abnormal copy to access the voting output logic of the multi-mode redundancy structure; when output inconsistency is detected during the verification observation period, performing write-back repair on the abnormal copy again and re-entering the verification observation state.
[0017] By adopting the above technical solution, after the communication node device completes the rewriting and repair of the abnormal copy using the status data of the normal copy, it places the abnormal copy in a verification and observation state, allowing it to run independently for cycle-by-cycle consistency comparison and verification. Only when no output inconsistency is detected during the verification and observation period will the communication node device restore it to the voting logic; otherwise, it will perform another rewriting. This avoids the re-accession of copies with residual errors due to incomplete single rewriting or double flipping, preventing the spread of potential logical errors in the multi-mode redundancy structure and ensuring the overall voting reliability of the triple-mode redundancy system after repair.
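A minimal sketch of the verification observation step is shown below. The length of the observation period (`observe_cycles`) and the returned action strings are assumptions; the application does not give concrete values.

```python
def observe_repaired_copy(repaired_outputs, voted_outputs, observe_cycles):
    """Compare the repaired copy with the voted output cycle by cycle.

    Returns "rejoin" if every observed cycle matches, otherwise
    "rewrite_again", after which the copy would re-enter observation.
    """
    pairs = zip(repaired_outputs[:observe_cycles], voted_outputs[:observe_cycles])
    for repaired, voted in pairs:
        if repaired != voted:
            return "rewrite_again"
    return "rejoin"
```

Only the first `observe_cycles` samples are examined, so a mismatch after the observation period does not affect the decision in this model.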
[0018] In conjunction with some embodiments of the first aspect, in some embodiments, after counting the cumulative number of error correction events in which voting inconsistencies occur in the multi-mode redundancy structure within a preset time window, the method further includes: recording the event interval duration between two adjacent error correction events within the preset time window; when the durations of multiple consecutive event intervals show a decreasing trend, and the current event interval duration is lower than a preset interval threshold, switching the rewrite repair triggering method in advance to delayed rewrite based on protocol timing stage determination, even if the cumulative number of error correction events has not yet reached the preset threshold.
[0019] By adopting the above technical solution, the communication node device records the event interval between adjacent error correction events while counting the number of error correction events within the statistical time window. When multiple consecutive intervals show a decreasing trend and are below a preset threshold, the communication node device switches the rewriting mode to delayed rewriting in advance, before the cumulative number of events reaches the threshold. This extracts the dynamic trend of irradiation impact density, enabling advanced prediction and response to the deterioration of the space irradiation environment. It fills the protection lag blind spot existing in fixed-time-window statistics when encountering sudden high-energy particle bursts, and improves the adaptability of communication nodes to dynamic extreme irradiation environments.
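One way to express the early-switch condition is sketched below. The number of consecutive intervals examined (`run_length`), the microsecond unit, and the requirement that the decrease be strict are all assumptions chosen for illustration.

```python
def should_switch_early(timestamps_us, interval_threshold_us, run_length=3):
    """True if the last `run_length` intervals strictly decrease and the
    most recent interval is below the threshold."""
    intervals = [b - a for a, b in zip(timestamps_us, timestamps_us[1:])]
    if len(intervals) < run_length:
        return False
    recent = intervals[-run_length:]
    decreasing = all(x > y for x, y in zip(recent, recent[1:]))
    return decreasing and recent[-1] < interval_threshold_us
```

With events at 0, 40 000, 65 000 and 75 000 µs the intervals are 40 000, 25 000 and 10 000 µs, so a 15 000 µs threshold triggers the early switch after only four events.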
[0020] Secondly, this application provides a communication node device, which includes a programmable logic device. The programmable logic device is configured by loading configuration data to implement the method described in the first aspect and any possible implementation of the first aspect. The programmable logic device internally instantiates a CAN protocol controller with a multi-mode redundancy structure, an independently deployed protocol timing stage tracking module, and a rewrite control module.
[0021] Thirdly, this application provides a computer-readable storage medium storing hardware configuration data, which, when loaded onto a programmable logic device, causes the programmable logic device to perform the method described in the first aspect and any possible implementation thereof.
[0022] Fourthly, this application provides a computer program product, including a computer program comprising hardware description language code or a bitstream file compiled therefrom, which, when configured on a programmable logic device, causes the programmable logic device to execute the method described in the first aspect and any possible implementation thereof.
[0023] It is understood that the communication node device provided in the second aspect, the computer-readable storage medium provided in the third aspect, and the computer program product provided in the fourth aspect are all used to execute the methods provided in the embodiments of this application. Therefore, the beneficial effects they can achieve can be referred to the beneficial effects in the corresponding methods, and will not be repeated here.
[0024] One or more technical solutions provided in the embodiments of this application have at least the following technical effects or advantages:
[0025] 1. By adopting a method that switches the rewrite repair triggering mode of the multi-mode redundancy structure to delayed rewrite based on protocol timing stage determination when the cumulative number of error correction events exceeds a preset threshold, and suspending the rewrite operation on the abnormal copy when a normal copy exists and the protocol timing stage indicator indicates that the frame transmission is in progress, the rewrite repair is performed after jumping to the frame interval or bus idle stage. Therefore, it avoids the register jump jitter introduced by the repair operation at the bus bit sampling or arbitration time under high-frequency irradiation, effectively solving the problem of reduced real-time efficiency and operational security of the communication system due to the disconnect between the rewrite mechanism and the upper-layer protocol timing in related technologies, and thus realizing the protection of the critical timing of the bus of communication node equipment in strong irradiation environment.
[0026] 2. By employing a method that masks the CAN protocol controller's bus output when a replica concurrency error occurs in the multi-mode redundancy structure, leading to voting inconsistencies, and by writing the protocol timing stage identifier currently maintained by the protocol timing stage tracking module into the status registers of all replicas, the interference of nodes with serious logic errors on the bus is isolated. Furthermore, by forcibly aligning the internal state machine through independently tracked reliable timing identifiers, the problem of controller logic confusion easily causing global bus communication paralysis when encountering multi-bit flips in related technologies is effectively solved. This enables rapid resynchronization of the internal state of the communication node device with the actual bus communication timing.
[0027] 3. By acquiring the level characteristic sequence of the bus physical layer received signal before performing the write operation, comparing it with the expected signal characteristics corresponding to the protocol timing stage identifier, and continuously monitoring the bus until it reaches an idle state for synchronous reset when there is a discrepancy, the reliability of the timing identifier of the tracking module is strictly verified before it is used for recovery. When the identifier is unreliable, the benchmark is rebuilt through a safe idle stage. This effectively solves the problem of synchronization failure caused by blindly relying on potentially erroneous identifiers for state recovery in related technologies, thereby achieving the safety and accuracy of the state machine synchronous reset operation.
Attached Figure Description
[0028] Figure 1 is a schematic diagram of a communication node device in one embodiment of this application;
[0029] Figure 2 is a flowchart illustrating a communication anti-interference processing method for space irradiation environments in an embodiment of this application;
[0030] Figure 3 is another flowchart illustrating a communication anti-interference processing method for space irradiation environments in an embodiment of this application;
[0031] Figure 4 is a schematic diagram of a physical hardware architecture of a programmable logic device in an embodiment of this application.
Detailed Implementation
[0032] The terminology used in the following embodiments of this application is for the purpose of describing particular embodiments only and is not intended to be limiting of this application. As used in the specification of this application, the singular expressions “a,” “an,” “the,” and “this” are intended to include the plural expressions as well, unless the context clearly indicates otherwise. It should also be understood that the term “and/or” as used in this application refers to any or all possible combinations including one or more of the listed items.
[0033] Hereinafter, the terms "first" and "second" are used for descriptive purposes only and should not be construed as implying or suggesting relative importance or implicitly indicating the number of indicated technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature, and in the description of the embodiments of this application, unless otherwise stated, "multiple" means two or more.
[0034] This application provides a communication anti-interference processing method, device, product, and medium for space irradiation environments. The following introduces, in conjunction with Figure 1, a communication node device provided in the embodiments of this application.
[0035] Please see Figure 1, which is a schematic diagram of a communication node device in an embodiment of this application. The communication node device mainly includes a CAN protocol controller, a rewrite control module, a protocol timing stage tracking module, and a bus transceiver.
[0036] The CAN protocol controller is the core logic unit for nodes to communicate on the CAN bus. To protect against single-event upsets caused by space radiation, its internal state machine adopts a multi-mode redundancy structure, containing multiple physically backed-up copies of the state machine (state machine copy 1 to state machine copy N) and a voter.
[0037] In this system, the state machine replicas operate in parallel, and their outputs are all connected to the voter. The voter performs a majority vote on the outputs of the replicas, generating a consistent controller output, and sends the output signal to the bus transceiver to drive the CAN bus. Simultaneously, when the voter detects inconsistencies between the replica outputs, it sends an error correction event signal to the rewrite control module. The CAN protocol controller also includes a state data memory and a control and configuration interface. The state data memory receives rewrite trigger control signals from the rewrite control module, which are used to perform state overwrite repair on abnormal replicas using data from normal replicas at appropriate times.
[0038] The protocol timing stage tracking module is deployed independently of the main CAN protocol controller. It is mainly used to deduce and reflect the current communication timing stage of the CAN bus in real time at the underlying hardware level. Even if the main controller has a logic error, this module can still provide an accurate timing reference.
[0039] The protocol timing phase tracking module internally comprises a level transition acquisition unit, a timing phase identification unit, and a phase tracking state machine. The level transition acquisition unit directly obtains the physical layer received signal from the bus transceiver, capturing the transition edges of the bus level; the timing phase identification unit parses this information; and the phase tracking state machine maintains the transition relationships between various communication phases of the CAN protocol (such as frame start, arbitration segment, bus idle, etc.). This module ultimately generates the current protocol timing phase identifier and outputs it to the rewrite control module.
[0040] The rewrite control module is responsible for dynamically assessing the current impact intensity of the space irradiation environment and determining the repair strategy for the abnormal state machine copy accordingly (immediate forced rewrite or delayed rewrite).
[0041] The rewrite control module internally includes an error correction event statistics unit, a threshold comparison unit, and a rewrite strategy control unit. The error correction event statistics unit receives error correction event signals from the CAN protocol controller and performs cumulative statistics within a window period; the threshold comparison unit compares the cumulative count with a preset threshold; the rewrite strategy control unit combines the comparison results with the protocol timing stage identifier from the protocol timing stage tracking module to determine the rewrite timing. If a delayed rewrite is decided upon, a rewrite trigger control signal is sent to the CAN protocol controller when it is determined that the current stage is safe (e.g., bus idle).
[0042] The bus transceiver is the interface component for node devices to interact with the external physical network. It connects bidirectionally to the CAN bus and receives the controller output from the CAN protocol controller, converts it into physical level signals, and sends them to the bus. At the same time, it converts the level signals received from the bus into physical layer receive signals of digital logic, which are then sent to the CAN protocol controller (for regular receive control) and the protocol timing stage tracking module (for independent timing deduction), respectively.
[0043] The method provided in this embodiment is described in detail below. Please refer to Figure 2, which is a flowchart illustrating a communication anti-interference processing method for space irradiation environments in an embodiment of this application.
[0044] S201. Count the cumulative number of error correction events in which voting inconsistencies occur in the multi-mode redundancy structure within a preset time window.
[0045] This step is a fundamental monitoring process that runs continuously after the communication node equipment is powered on, spanning the entire system operation cycle, and providing real-time quantitative data on radiation intensity for subsequent determination of whether a switch to a repair strategy is needed.
[0046] Specifically, the preset time window refers to the time interval used to statistically analyze the frequency of error correction events. Its function is to aggregate randomly dispersed single-event flip events into a cumulative frequency within a unit of time, thereby distinguishing between sporadic irradiation backgrounds and continuous strong irradiation environments. The hardware counter in the communication node device, in conjunction with a timer, implements the periodic refresh of this window. The window length can be set based on the spacecraft's orbital irradiation belt traversal time and the CAN bus communication frame rate; for example, it can be set to 100ms. The multi-mode redundancy structure refers to the redundant design of multiple physical backups of key logic modules such as the arbitration state machine and bit timing state machine within the CAN protocol controller. A voting circuit performs a majority vote on the outputs of multiple copies to detect and correct logic errors caused by single-event flips.
[0047] An error correction event refers to a complete event in which the voting circuit of a multi-mode redundancy structure detects a difference between output replicas in a certain clock cycle and triggers the repair action for the erroneous replica. The cumulative number of error correction events refers to the total number of error correction events that occur within a preset time window, used to quantify the impact frequency of the current irradiation environment on the controller logic. The communication node device uses an internal hardware counter to cumulatively count error correction events within each preset time window. After the window ends, the count value is output for subsequent strategy judgment, and the counter is reset to zero to start a new round of statistics.
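The counter-plus-timer behaviour of S201 can be modelled as below. The class name `WindowCounter` and the choice to measure the window in clock ticks are assumptions; a 100 ms window would correspond to however many ticks the hardware clock produces in that time.

```python
class WindowCounter:
    """Counts error correction events per fixed-length window."""

    def __init__(self, window_ticks):
        self.window_ticks = window_ticks
        self.elapsed = 0
        self.count = 0

    def tick(self, event):
        """Advance one clock tick.

        Returns the window total when a window closes, else None.
        The counter is cleared as soon as the total is output.
        """
        if event:
            self.count += 1
        self.elapsed += 1
        if self.elapsed == self.window_ticks:
            total, self.count, self.elapsed = self.count, 0, 0
            return total
        return None


counter = WindowCounter(window_ticks=4)
totals = [counter.tick(e) for e in [1, 0, 1, 0, 0, 0, 1, 0]]
```

Here `totals` carries a value only on the fourth and eighth ticks, which is when the strategy judgment of S202 would run.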
[0048] S202. When the cumulative number of error correction events exceeds the preset threshold, the rewrite repair triggering method of the multi-mode redundancy structure is switched from the hardware clock-driven instant forced rewrite to the delayed rewrite based on the protocol timing stage.
[0049] This step is triggered when the cumulative number of error correction events counted in step S201 exceeds a preset threshold. Specifically, the preset threshold refers to the threshold for the cumulative number of error correction events that triggers the switching of the repair strategy. Its function is to distinguish between normal irradiation background (occasional flips) and high-frequency irradiation environment (continuous strong irradiation) to determine whether to enable protocol timing-aware delay repair protection. This threshold can be comprehensively calibrated based on the spacecraft's historical orbital irradiation data and the CAN bus bit rate. For example, 5 error correction events occurring within a 100ms window can be set as the switching threshold. Hardware clock-driven instantaneous forced overwrite refers to the repair mechanism that, after the voting circuit of the multi-mode redundancy structure detects a copy inconsistency, the underlying hardware clock directly triggers the overwrite of the abnormal copy on the next rising edge of the clock, without being aware of the current timing stage of the CAN protocol. Under high-frequency irradiation, this poses a risk of introducing register transition jitter during bit sampling or arbitration. The delayed rewriting based on protocol timing stage determination refers to the repair mechanism introduced in this solution. That is, after detecting inconsistency between the copies, the current timing stage of the CAN protocol is read first, and the rewriting is delayed during sensitive communication stages such as frame transmission. The repair is then performed after entering a safe stage such as frame interval or bus idle.
[0050] When the communication node device detects that the cumulative number of error correction events exceeds the preset threshold, the rewrite repair triggering method of the multi-mode redundancy structure is switched from the hardware clock-driven instant forced rewrite to the delayed rewrite based on the protocol timing stage, so as to avoid high-frequency repair operations from interfering with the critical timing of the CAN bus.
[0051] Optionally, in some embodiments, the communication node device can also set multiple threshold levels to adopt different granular delay protection ranges under different irradiation intensity levels. For example, under medium irradiation, only the arbitration segment and bit sampling time are suspended for protection, while under high irradiation, the entire frame transmission stage is suspended, thereby achieving a more refined balance between repair time and communication protection.
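The switching rule of step S202 can be illustrated with a minimal sketch. It uses the example values from the text (5 events within a 100 ms window). The class name `RewriteModeSelector`, the sliding-window bookkeeping, and the choice to switch as soon as the count reaches the threshold are assumptions made for illustration; the embodiment performs this counting in hardware.

```python
from collections import deque

IMMEDIATE, DELAYED = "immediate", "delayed"


class RewriteModeSelector:
    """Counts error correction events in a time window and selects the rewrite mode.

    Hypothetical helper: names and the sliding-window policy are assumptions.
    """

    def __init__(self, window_ms=100, threshold=5):
        self.window_ms = window_ms  # example window length from the text
        self.threshold = threshold  # example switching threshold from the text
        self.events = deque()       # timestamps (ms) of recent correction events
        self.mode = IMMEDIATE

    def record_event(self, now_ms):
        """Register one error correction event and return the resulting mode."""
        self.events.append(now_ms)
        # Discard events that have fallen out of the window.
        while now_ms - self.events[0] >= self.window_ms:
            self.events.popleft()
        # Assumption: reaching the threshold count triggers the switch.
        if len(self.events) >= self.threshold:
            self.mode = DELAYED
        return self.mode
```

In this sketch the selector never switches back to the immediate mode; when and how to return is left open here, as it depends on the hysteresis policy chosen for the device.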
[0052] S203. Through a protocol timing stage tracking module deployed independently of the CAN protocol controller, the level transitions of the bus physical layer received signals are continuously collected, and the protocol timing stage identifier of the current CAN communication protocol is generated based on the level transitions; the protocol timing stage tracking module is a finite state machine that maintains the transition relationship between each communication stage of the CAN protocol.
[0053] Step S203 runs independently and continuously after the communication node device is powered on, and is executed in parallel with steps S201 and S202. This step is independent of the main CAN protocol controller, ensuring that the current timing state of the bus can still be correctly reflected when the main controller is subjected to irradiation flip.
[0054] Specifically, the protocol timing phase tracking module is a hardware functional module deployed within the communication node device independently of the CAN protocol controller. It is dedicated to deducing the current communication phase of the CAN bus in real time, and is connected directly to the RX pin of the CAN bus transceiver to acquire the bus physical layer received signals. A level transition of the bus physical layer received signals refers to a switch between the dominant (low) and recessive (high) level on the RX pin, which is the basic information from which CAN protocol frame boundaries and communication phase transitions are deduced. The protocol timing phase identifier is a status mark that the protocol timing phase tracking module deduces from the level transition patterns and outputs. It characterizes the specific communication phase the CAN bus is currently in, such as frame start, arbitration segment, control segment, data segment, CRC segment, ACK segment, frame interval, or bus idle, and is the direct basis on which S204 and S205 determine whether the bus is in a sensitive phase or a safe phase. The finite state machine is one implementation of the protocol timing phase tracking module. Each state corresponds to a communication phase of the CAN protocol, and state transitions are driven by level transition patterns (such as transitioning to the frame start state upon detecting a falling edge to the dominant level, or transitioning to the bus idle state after 11 consecutive recessive levels) together with internal bit counters. Phase deduction can be completed without parsing the complete frame content.
[0055] The communication node device continuously collects level transitions through the protocol timing phase tracking module, drives the finite state machine to flow between each communication phase, and outputs the protocol timing phase identifier in real time for use in delay rewriting decisions.
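As a rough illustration of phase deduction from level patterns alone, the following sketch tracks only three coarse phases. It is a simplification and not the tracker of the embodiment: all phases from frame start through the end of the frame are lumped into `FRAME`, the individual segments are not distinguished, and error or unacknowledged frames are ignored. The rule that 8 consecutive recessive bits mark the end of a frame (one ACK delimiter bit plus a 7-bit end-of-frame field) is an assumption of this sketch; the 11-bit idle rule follows the text. All names are hypothetical.

```python
from enum import Enum

DOMINANT, RECESSIVE = 0, 1


class Phase(Enum):
    FRAME = "frame"                # frame start through end of frame (sensitive)
    INTERMISSION = "intermission"  # frame interval (safe)
    IDLE = "idle"                  # bus idle (safe)


class CoarsePhaseTracker:
    """Deduces a coarse bus phase from one sampled level per bit time."""

    def __init__(self):
        # Conservative start: treat the bus as busy until it proves otherwise.
        self.phase = Phase.FRAME
        self.recessive_run = 0

    def on_bit(self, level):
        """Consume one sampled bit and return the phase established after it."""
        if level == DOMINANT:
            # Any dominant bit means a frame (or error/overload flag) is active.
            self.recessive_run = 0
            self.phase = Phase.FRAME
        else:
            self.recessive_run += 1
            if self.recessive_run >= 11:
                self.phase = Phase.IDLE
            elif self.recessive_run >= 8:
                # Assumption: ACK delimiter + 7 EOF bits have elapsed.
                self.phase = Phase.INTERMISSION
        return self.phase
```

A full tracker would additionally carry bit counters for each field and account for bit stuffing so that the arbitration, control, data, CRC, and ACK segments can be told apart.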
[0056] S204. In the delayed rewrite mode, when an inconsistency is detected between the output replicas of the multi-mode redundancy structure, the protocol timing stage identifier currently output by the protocol timing stage tracking module is read.
[0057] Step S204 runs as continuous monitoring once the communication node device has switched to the delayed rewrite mode. It is triggered immediately when the voting circuit of the multi-mode redundancy structure detects an inconsistency between replicas in any clock cycle, and it is the entry step of the delayed rewrite process.
[0058] Specifically, the delayed rewriting method has been explained in step S202 and will not be repeated here. Inconsistency between output replicas refers to at least one replica in the multi-mode redundancy structure whose output result in the current clock cycle does not match the majority voting result, detected by the comparison logic of the voting circuit. Under the delayed rewriting method, once the multi-mode redundancy structure of the communication node device detects an inconsistency between output replicas, it immediately reads the protocol timing stage identifier currently output by the protocol timing stage tracking module. This reading action should be completed within the same control cycle in which the inconsistency event is detected, ensuring that the read protocol timing stage identifier strictly corresponds to the time of the inconsistency event, avoiding timing deviations in stage judgment due to reading delays.
[0059] Optionally, in some embodiments, the communication node device may also simultaneously record the inconsistent copy number and its register error state snapshot while reading the protocol timing phase identifier, so as to directly locate the target to be repaired during the subsequent delay waiting period, avoid repeatedly performing voting detection during the suspension period, and reduce the logical resource consumption during the repair waiting period.
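The detection and capture described for step S204 can be sketched as follows. This is a software illustration under stated assumptions: replica outputs are modelled as plain comparable values, and the function names `vote` and `capture_inconsistency` and the record layout are hypothetical.

```python
from collections import Counter


def vote(outputs):
    """Majority-vote over replica outputs.

    Returns (majority_value, dissenting_indices). majority_value is None
    when no strict majority exists, in which case every replica is suspect.
    """
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        return None, list(range(len(outputs)))
    dissenters = [i for i, out in enumerate(outputs) if out != value]
    return value, dissenters


def capture_inconsistency(outputs, phase_id):
    """Return a repair record taken in the same cycle as the mismatch, or None.

    The record pairs the phase identifier read at detection time with the
    dissenting replica numbers and a snapshot of their erroneous outputs.
    """
    value, dissenters = vote(outputs)
    if value is None or not dissenters:
        return None  # either all replicas agree, or no voting benchmark exists
    return {
        "phase": phase_id,
        "replicas": dissenters,
        "snapshot": [outputs[i] for i in dissenters],
    }
```

Keeping the record lets the later repair address the target replica directly, without voting again during the suspension period.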
[0060] S205. When there is a normal copy with consistent voting, and the protocol timing phase identifier indicates that the frame transmission process is currently underway, the rewriting operation of the abnormal copy with voting errors is suspended. After the protocol timing phase identifier jumps to the frame interval or bus idle phase, the abnormal copy is rewritten and repaired using the status data of the normal copy.
[0061] Step S205 is executed after inconsistency detection and protocol timing stage identifier reading are completed in step S204. By coordinating the protocol timing status with the voting results, it constrains the rewrite repair operation to a safe time window that does not affect communication.
[0062] Specifically, when a normal copy exists and the protocol timing stage identifier indicates that the frame is currently in the process of transmission, the communication node device suspends the rewriting operation of the abnormal copy and continuously polls the protocol timing stage identifier; once the identifier jumps to the frame interval or bus idle stage, it immediately performs rewriting and repair on the abnormal copy through the status data of the normal copy.
[0063] A normal copy with consistent voting results refers to a copy in a multi-mode redundancy structure whose output result is consistent with the majority vote. It is the data source upon which overwrite repair is performed, and its existence is a necessary prerequisite for safe suspension. If a normal copy does not exist, a higher-level fault handling procedure must be initiated. Frame transmission refers to all communication stages in the CAN communication protocol from the start of frame (SOF) to the completion of the ACK segment (including the arbitration segment, control segment, data segment, CRC segment, and ACK segment). These stages are highly sensitive to bit timing and bus level states; register state transitions will disrupt bit synchronization accuracy or cause node misjudgments.
[0064] The frame interval or bus idle phase refers to the interval between frames in the CAN protocol (a frame interval segment with at least 3 recessive levels) and the bus idle phase (entered after 11 consecutive recessive levels). These phases do not involve bit sampling or arbitration contention and are safe time windows for performing rewrite repair.
[0065] The status data of the normal copy refers to the complete contents of all current status registers of the normal copy (including the current status register of the state machine, configuration registers, etc.). It is used to overwrite the corresponding storage units of the abnormal copy register by register, so that the abnormal copy is restored to the correct state that is completely consistent with the normal copy.
[0066] Optionally, in some embodiments, the communication node device can also set a timeout limit for the suspend waiting operation. When the waiting time exceeds the preset maximum frame transmission duration, even if the frame transmission stage is still in progress, the overwrite repair is forcibly performed to prevent the abnormal copy from accumulating a higher risk of secondary flipping due to continuous bus busyness.
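The decision logic of step S205, including the optional timeout, can be summarised in a small sketch. The phase labels, the action names, and the function `rewrite_action` are assumptions for illustration only.

```python
SAFE_PHASES = {"frame_interval", "bus_idle"}


def rewrite_action(phase_id, has_normal_copy, waited_us, max_wait_us):
    """Decide what to do with an abnormal copy in delayed rewrite mode.

    phase_id        -- identifier currently output by the phase tracker
    has_normal_copy -- True if a copy agreeing with the majority vote exists
    waited_us       -- how long the rewrite has already been suspended
    max_wait_us     -- preset maximum frame transmission duration
    """
    if not has_normal_copy:
        # No data source for repair: hand over to higher-level fault handling.
        return "escalate"
    if phase_id in SAFE_PHASES:
        return "rewrite"
    if waited_us > max_wait_us:
        # Optional timeout: repair even though a frame is still in progress.
        return "forced_rewrite"
    return "suspend"
```

For example, `rewrite_action("data", True, 20, 200)` yields `"suspend"`, while the same call with `phase_id="bus_idle"` yields `"rewrite"`.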
[0067] Optionally, after suspending the rewrite operation of the abnormal copy in which the voting error occurred, the communication node device can also shield the abnormal copy from the voting output logic of the multi-mode redundancy structure, so that the output of the CAN protocol controller is determined by the remaining normal copies through mutual comparison; during the shielding period, if the outputs of the remaining normal copies are inconsistent, it is determined that a copy concurrency error has occurred.
[0068] While waiting for a safe repair window, the system prevents the abnormal copy from interfering with the voting output and continuously monitors the remaining normal copies, ensuring the reliability of the communication output during the suspension period.
[0069] Specifically, the communication node device masks the abnormal replicas from the voting output logic of the multi-mode redundancy structure, allowing the output of the CAN protocol controller to be determined by the remaining normal replicas through mutual comparison. During the masking period, if the outputs of the remaining normal replicas are inconsistent, a replica concurrency error is determined to have occurred. The mutual comparison method refers to the cycle-by-cycle consistency comparison of the remaining two normal replicas after masking one of the three-mode redundancy. If the two are consistent, the output is normal, and the system maintains communication in a downgraded dual-mode mode. A replica concurrency error refers to the extreme state where the remaining normal replicas suffer from a single-event upset again during the suspension waiting period, resulting in inconsistent outputs. In this case, the system can no longer determine the correct output through comparison and should immediately trigger a soft reset of the CAN protocol controller and report a serious error to the upper-level fault management system for handling.
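The degraded dual-mode comparison can be sketched as below, assuming a triple-mode structure with exactly one replica masked. The function name and the returned tuple are hypothetical; in the device this comparison is combinational logic evaluated every clock cycle.

```python
def dual_mode_output(outputs, masked_index):
    """Compare the two replicas that remain after masking one of three.

    Returns (output, concurrency_error). When the remaining replicas
    disagree, no trustworthy output exists and output is None.
    """
    remaining = [out for i, out in enumerate(outputs) if i != masked_index]
    if len(remaining) != 2:
        raise ValueError("sketch assumes three replicas with one masked")
    if remaining[0] == remaining[1]:
        return remaining[0], False
    # Replica concurrency error: trigger soft reset and report upward.
    return None, True
```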
[0070] Optionally, after performing rewrite repair on the abnormal replica using the status data of the normal replica, the communication node device can also place the repaired abnormal replica in the verification and observation state, so that it runs independently but does not participate in the voting output of the multi-mode redundancy structure; during the verification and observation period, the output of the abnormal replica is compared with the output of the normal replica participating in the voting cycle by cycle.
[0071] If no output inconsistency is detected during the verification observation period, the abnormal copy will be restored to the voting output logic of the multi-mode redundancy structure;
[0072] If an inconsistency in the output is detected during the verification observation period, the abnormal copy is rewritten and repaired, and the system re-enters the verification observation state.
[0073] In this way, the state of the repaired replica is dynamically verified before it rejoins voting, which prevents a replica that is still erroneous, because of incomplete repair or a secondary upset, from being reconnected to the voting logic and compromising the overall reliability of the multi-mode redundancy structure.
[0074] Here, the verification observation state refers to the transitional isolation state of a copy after it has been repaired: the copy performs logical operations normally, but its output serves only as a comparison reference and does not affect the system's external communication output. Cycle-by-cycle comparison refers to the bit-by-bit consistency check, within each hardware clock cycle, of all register outputs of the repaired copy against those of the normal copy. The verification observation period refers to the continuous period from the completion of the rewrite repair to the decision to restore access. The check can be considered passed if no error occurs for a set number of consecutive clock cycles; the specific number of cycles should be chosen according to the system clock frequency and the acceptable recovery delay. Restoring access to the voting output logic of the multi-mode redundancy structure refers to reconnecting the repaired copy that has passed verification to the voting circuit, so that the system returns from the degraded dual-mode state to complete triple-mode redundancy operation.
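The verification observation logic can be sketched as a counter of consecutive clean cycles. The class name `ObservationGate` and the returned action strings are assumptions; the required number of clean cycles is a design parameter that the text leaves to the implementer.

```python
class ObservationGate:
    """Tracks a repaired copy until it has matched the reference long enough."""

    def __init__(self, required_clean_cycles):
        self.required = required_clean_cycles
        self.clean = 0  # consecutive cycles without a mismatch

    def on_cycle(self, repaired_out, reference_out):
        """Compare one cycle's outputs and return the resulting action."""
        if repaired_out != reference_out:
            # Mismatch: repair again and restart the observation period.
            self.clean = 0
            return "rewrite_again"
        self.clean += 1
        if self.clean >= self.required:
            return "readmit"  # reconnect the copy to the voting circuit
        return "observing"
```

With `required_clean_cycles=3`, three matching cycles in a row return `"observing"`, `"observing"`, `"readmit"`; a mismatch at any point restarts the count.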
[0075] In this embodiment, the rewrite repair triggering method is switched to delayed rewrite based on protocol timing stage determination when the cumulative number of error correction events exceeds a preset threshold. During delayed rewrite, the rewrite operation on the abnormal copy is suspended when the frame transmission is in progress, and the rewrite repair is performed after jumping to the frame interval or bus idle stage. Therefore, register jump jitter caused by repair during sensitive communication stages is avoided. This effectively solves the problem in related technologies where the underlying hardware forced rewrite is out of sync with the upper-layer protocol timing, resulting in damage to bus synchronization accuracy and misjudgment retransmission. Thus, high reliability and high real-time efficiency of communication anti-interference processing are achieved in space irradiation environment.
[0076] The above embodiments describe the basic process of counting the cumulative number of error correction events and switching to delayed rewriting when the count exceeds a preset threshold. Combined with the protocol timing phase tracking module, this constrains the rewriting operation to the bus idle or frame interval phases, thereby avoiding the interference with communication timing that forced rewriting causes under high-frequency irradiation. In practical applications, however, the irradiation intensity of the space orbit environment may change drastically and dynamically, so relying solely on the cumulative count threshold may cause the protection to lag. Furthermore, in extreme cases, the multi-mode redundancy structure may encounter concurrent errors, leading to a complete loss of the voting benchmark.
[0077] Based on the above embodiments, the method provided in this embodiment is described in further detail below. Please refer to Figure 3, which is another flowchart illustrating a communication anti-interference processing method for space irradiation environments in an embodiment of this application.
[0078] S301. Count the cumulative number of error correction events in which voting inconsistencies occur in a multi-mode redundancy structure within a preset time window.
[0079] S302. When the cumulative number of error correction events exceeds the preset threshold, the rewrite repair triggering method of the multi-mode redundancy structure is switched from the hardware clock-driven instant forced rewrite to the delayed rewrite based on the protocol timing stage.
[0080] Steps S301 and S302 are similar to those described in the above embodiments as steps S201 and S202, and will not be repeated here.
[0081] S303. Record the event interval between two adjacent error correction events within the preset time window.
[0082] The event interval duration refers to the time difference between the occurrence times of two adjacent error correction events within a preset time window. It is used to quantify the distribution density of error correction events in the time dimension and is a fine-grained indicator for evaluating the irradiation shock rhythm. For example, if the $n$-th error correction event occurs at time $t_n$ and the $(n+1)$-th occurs at time $t_{n+1}$, the corresponding event interval is $\Delta t_n = t_{n+1} - t_n$.
[0083] Step S303 is executed continuously and in parallel with step S301, without additional triggering conditions; it is the data source step for the trend prediction in S304. Specifically, when the communication node device registers each error correction event with the hardware counter, it synchronously records the timestamp of the event through the system clock counter, calculates the difference between this timestamp and the timestamp of the immediately preceding error correction event to obtain the event interval duration, and stores the result in a first-in-first-out interval duration recording buffer. The buffer retains the interval durations of the most recent events in a sliding window manner. When the preset time window ends, the buffer and counter are refreshed synchronously, and a new round of recording begins.
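The interval recording of step S303 can be sketched with a bounded first-in-first-out buffer. The class name `IntervalRecorder`, the buffer depth of 4, and the use of plain numeric timestamps are assumptions for illustration.

```python
from collections import deque


class IntervalRecorder:
    """Stores the durations between adjacent error correction events."""

    def __init__(self, depth=4):
        # A bounded deque drops the oldest interval automatically (FIFO).
        self.intervals = deque(maxlen=depth)
        self.last_ts = None

    def on_event(self, ts):
        """Record the interval between this event and the previous one."""
        if self.last_ts is not None:
            self.intervals.append(ts - self.last_ts)
        self.last_ts = ts

    def reset(self):
        """Refresh the buffer at the end of the preset time window."""
        self.intervals.clear()
        self.last_ts = None
```

Feeding events at times 0, 40, 70, and 90 leaves the intervals 40, 30, and 20 in the buffer, in that order.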
[0084] S304. When the interval between multiple consecutive events shows a decreasing trend and the current event interval is lower than the preset interval threshold, the rewrite repair triggering method is switched to delayed rewrite based on the protocol timing stage determination in advance, provided that the cumulative number of error correction events has not reached the preset number threshold.
[0085] Among them, the decreasing trend of the interval between multiple consecutive events means that the interval values of the most recent consecutive events (e.g., three consecutive events) in the buffer of step S303 are monotonically shortened one after another, indicating that the frequency of error correction events is continuously accelerating and the intensity of radiation impact is deteriorating; the preset interval threshold is the lower limit of the event interval when it is determined that the accelerated deterioration of radiation has reached a dangerous level and the strategy needs to be switched in advance. Its function is to intercept the trend of radiation deterioration in advance when the cumulative number of error correction events has not yet triggered the preset number threshold. It can be determined by combining the single frame transmission period of CAN bus and the duration of bit timing sensitive window, for example, it can be set to 50% of the single frame transmission time.
[0086] Step S304 runs continuously while the interval duration buffer of step S303 is being updated. It addresses scenarios in which the irradiation intensity rises rapidly but the cumulative number of events has not yet exceeded the threshold, and is a proactive protection step based on trend prediction. Specifically, the communication node device performs monotonicity analysis on the most recent consecutive duration values in the interval duration buffer. When the decreasing trend condition is met and the latest event interval duration is lower than the preset interval threshold, the communication node device immediately switches the rewrite repair triggering method from the hardware clock-driven instant forced rewrite to the delayed rewrite based on the protocol timing stage, even if the cumulative number of error correction events has not exceeded the preset number threshold, thus initiating protocol-timing-aware protection in advance.
[0087] Optionally, in some embodiments, the communication node device may also dynamically lower the preset number threshold synchronously when the early switching is triggered in step S304, so that it is equal to the current cumulative number of error correction events, to ensure that the subsequent policy will not accidentally switch back to immediate forced overwrite due to the temporary drop in the cumulative number, thereby increasing the hysteresis stability of the protection state and preventing the policy from frequently switching and jittering near the threshold.
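The trend condition of step S304 can be written as a short predicate. The function name and the default run length of three intervals (taken from the example in the text) are illustrative assumptions.

```python
def should_switch_early(intervals, interval_threshold, run_length=3):
    """Return True when the early switch to delayed rewrite should fire.

    intervals          -- recorded event intervals, oldest first
    interval_threshold -- preset lower limit for the latest interval
    run_length         -- how many most recent intervals must shrink in turn
    """
    recent = list(intervals)[-run_length:]
    if len(recent) < run_length:
        return False  # not enough history to judge a trend
    # Strictly decreasing: each interval is shorter than the one before it.
    decreasing = all(a > b for a, b in zip(recent, recent[1:]))
    return decreasing and recent[-1] < interval_threshold
```

For instance, intervals of 40, 30, and 20 with a threshold of 25 satisfy both conditions, whereas the same intervals with a threshold of 15 do not, because the latest interval is still above the limit.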
[0088] S305. Through a protocol timing stage tracking module deployed independently of the CAN protocol controller, the level transitions of the received signals at the bus physical layer are continuously collected, and a protocol timing stage identifier of the current CAN communication protocol is generated based on the level transitions; the protocol timing stage tracking module is a finite state machine that maintains the transition relationship between each communication stage of the CAN protocol.
[0089] S306. In the delayed rewrite mode, when an inconsistency is detected between the output replicas of the multi-mode redundancy structure, the protocol timing stage identifier currently output by the protocol timing stage tracking module is read.
[0090] S307. When there is a normal copy with consistent voting, and the protocol timing phase identifier indicates that the frame transmission process is currently underway, the rewrite operation of the abnormal copy with voting errors is suspended. After the protocol timing phase identifier jumps to the frame interval or bus idle phase, the abnormal copy is rewritten and repaired using the status data of the normal copy.
[0091] Steps S305 to S307 are similar to those described in steps S203 to S205 in the above embodiments, and will not be repeated here.
[0092] S308. When a copy concurrency error occurs in the multi-mode redundancy structure, leading to voting inconsistency, the bus output of the CAN protocol controller is disabled.
[0093] Bus output masking refers to the communication node device cutting off the signal path from the CAN protocol controller to the bus physical layer driver circuit, so that the communication node device stops driving any level onto the CAN bus (usually by placing the bus driver in a forced recessive or high-impedance state). This prevents controller logic in an unknown state from outputting erroneous signals to the bus and interfering with the communication of other normal nodes on the bus.
[0094] Step S308 is triggered immediately when the multi-mode redundancy structure of the communication node device detects a replica concurrency error, and is the primary isolation and protection action in the replica concurrency error scenario. Unlike the optional steps following step S205, which only block a single replica from participating in voting, step S308 blocks the entire bus output of the CAN protocol controller. Specifically, when the voting circuit detects a replica concurrency error and cannot generate a reliable voting result, the communication node device blocks the bus output of the CAN protocol controller through hardware control signals within the same control cycle, causing the communication node device to exit the bus active transmission state; at the same time, the communication node device retains the ability to acquire signals received at the bus physical layer, continuously providing a data foundation for subsequent processes.
[0095] Optionally, in some embodiments, the communication node device may also send a CAN error frame (beginning with an active error flag of 6 consecutive dominant bits) to the bus before performing bus output masking, in order to actively terminate the currently ongoing message transmission, notify other nodes on the bus to abandon the current message, and avoid communication timeouts caused by other nodes waiting for the ACK response from this communication node device.
[0096] S309. During the preset observation period, collect the level characteristic sequence of the signal received by the bus physical layer.
[0097] The preset observation period refers to the time interval, following bus output masking, during which bus signals are collected for reliability verification. It must be long enough to collect a level sequence that fully covers the signal characteristics of the current protocol timing stage, and can be determined from the CAN bus bit rate and the number of bits in the longest field of the current protocol stage. For example, at a bit rate of 1 Mbps one bit lasts 1 µs, so the period can be set to no fewer microseconds than the number of bits in the longest field of the current protocol timing stage. The level feature sequence refers to the time-ordered digital sequence of dominant / recessive levels obtained by continuously sampling the RX pin of the bus physical layer transceiver at the bit clock frequency within the preset observation period. It is the measured input data that is compared with the expected signal characteristics in step S310.
[0098] Step S309 is executed immediately after the bus output masking is completed in step S308. Specifically, in the bus output masking state, the communication node device continuously samples the RX pin signal at a sampling frequency synchronized with the CAN bit clock, and stores the level value of each sampling point (0 for dominant and 1 for recessive) into the sampling buffer in a time sequence; after the preset observation period ends, the communication node device outputs the contents of the buffer as a level feature sequence to step S310 for comparison processing.
[0099] S310. Compare the level feature sequence with the expected signal features corresponding to the protocol timing stage identifier.
[0100] Among them, the expected signal characteristics refer to the theoretically expected level sequence pattern determined by the protocol timing stage identifier currently output by the protocol timing stage tracking module and the definition of the bus level of the stage according to the CAN protocol specification (for example, the expected signal characteristics of the bus idle stage are fully recessive levels and the frame interval stage is three consecutive recessive levels). These characteristics are pre-stored in the protocol feature library inside the communication node device and are retrieved using the protocol timing stage identifier as an index.
[0101] Step S310, executed immediately after the acquisition of the level feature sequence in step S309, is a core verification step for determining whether the protocol timing phase identifier currently maintained by the protocol timing phase tracking module is still reliable. Its result determines whether the subsequent path proceeds to S311 (consistent) or S312 (inconsistent). Specifically, the communication node device uses the protocol timing phase identifier currently output by the protocol timing phase tracking module as an index to retrieve the corresponding expected signal feature from the protocol feature library, compares it bit by bit with the level feature sequence, and outputs a "consistent" or "inconsistent" judgment result, triggering the processing branch S311 or S312 respectively.
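The lookup-and-compare of step S310 can be sketched as follows. The contents of `FEATURE_LIBRARY` are assumptions: the frame interval entry follows the example in the text (three consecutive recessive levels), while representing the fully recessive bus idle pattern by 11 samples is a choice made only for this sketch. Levels use 0 for dominant and 1 for recessive, as in step S309.

```python
RECESSIVE = 1

# Hypothetical protocol feature library, indexed by phase identifier.
FEATURE_LIBRARY = {
    "frame_interval": [RECESSIVE] * 3,
    "bus_idle": [RECESSIVE] * 11,  # assumed sample count for "fully recessive"
}


def matches_expected(phase_id, levels):
    """Compare the sampled level sequence bit by bit with the expected feature.

    Returns True for "consistent" and False for "inconsistent". An unknown
    phase identifier or a too-short sample is treated as inconsistent.
    """
    expected = FEATURE_LIBRARY.get(phase_id)
    if expected is None or len(levels) < len(expected):
        return False
    return list(levels[: len(expected)]) == expected
```

A "consistent" result corresponds to branch S311 and an "inconsistent" result to branch S312.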
[0102] S311. When the level characteristic sequence is consistent with the expected signal characteristics, the protocol timing stage identifier currently maintained by the protocol timing stage tracking module is written into the status register of all copies in the multi-mode redundancy structure, so that the CAN protocol controller is restored to the protocol stage consistent with the current bus timing.
[0103] Among them, the status register of all replicas refers to the register group inside each replica in the multi-mode redundancy structure used to store the current stage of the CAN protocol state machine and related context information (such as the current field bit counter, bit stuffing counter, etc.). Writing a unified protocol timing stage identifier to the status register of all replicas can force the internal state machine of all replicas to be aligned to the specified protocol stage. Restoring to the protocol stage consistent with the current bus timing means eliminating the timing offset caused by the state machine stagnation during replica concurrency errors, so that the internal logic state of the CAN protocol controller is resynchronized with the actual communication timing on the bus.
[0104] This step is executed when the comparison result in step S310 is "consistent," indicating that the protocol timing stage identifier currently maintained by the protocol timing stage tracking module accurately reflects the actual state of the bus and can serve as a reliable benchmark for restoring the state of all replicas. Specifically, the communication node device writes the protocol timing stage identifier currently maintained by the protocol timing stage tracking module (including stage codes and corresponding auxiliary fields such as the current value of the bit counter) field by field into the state register of all replicas in the multi-mode redundancy structure, forcing the internal state machines of all replicas to align to the current actual protocol stage of the bus, completing the resynchronization of the controller's internal state with the bus timing, and then directly jumps to step S314 to continue tracking the remaining protocol field sequence of the currently received frame.
[0105] S312. When the level characteristic sequence is inconsistent with the expected signal characteristics, maintain the bus output masking state of the CAN protocol controller and continuously monitor the bus physical layer signal until the bus idle state is detected. Then, synchronously reset the protocol timing stage tracking module and the internal state machine of the CAN protocol controller to the initial state corresponding to the bus idle state.
[0106] The internal state machine of the CAN protocol controller refers to the finite state machine that implements the CAN communication protocol logic within each copy of the multi-mode redundancy structure of the CAN protocol controller. It is responsible for maintaining the current protocol communication stage and the transmission and reception context (such as the current frame field position, CRC calculation intermediate value, etc.) and is the core logic unit with which the controller executes CAN frame transmission and reception. The bus idle state, as mentioned above, refers to the state in which 11 or more consecutive recessive bits appear on the bus. It is the most deterministic state and the safest time for a synchronous state reset. Synchronous reset to the initial state corresponding to the bus idle state means that the finite state machine of the protocol timing stage tracking module and the internal state machine of the CAN protocol controller are simultaneously cleared and reset to the reference initial state of the bus idle stage (the internal bit counter is cleared and the status register is set to the idle state encoding), so that the two are completely realigned at the state level.
[0107] This step is executed when the comparison result in step S310 is "inconsistent," indicating that the current state of the protocol timing phase tracking module is unreliable and cannot be directly used as a recovery benchmark. A reliable state benchmark needs to be rebuilt through synchronous reset after the bus enters an idle state. Specifically, the communication node device maintains the bus output mask unchanged and continuously accumulates the number of consecutive recessive levels on the RX pin. When the accumulated value reaches 11 bits, it is determined that the bus has entered an idle state. The finite state machine of the protocol timing phase tracking module and the internal state machine of the CAN protocol controller are immediately synchronously reset to the initial state corresponding to the bus idle state, entering the waiting frame start trigger state in S313.
[0108] Optionally, in some embodiments, the communication node device can also set a timeout counter for the process of waiting for the bus to be idle. If 11 consecutive recessive levels are not detected after a continuous timeout, it is determined that there is a persistent physical layer fault on the bus, a fault alarm is reported, and the recovery process is terminated to prevent indefinite waiting from consuming system resources.
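The wait-for-idle loop of step S312, together with the optional timeout, can be sketched as below. The function name, the result strings, and the timeout expressed as a number of sampled bits are assumptions for illustration.

```python
RECESSIVE = 1


def wait_for_idle(samples, idle_bits=11, timeout_bits=1000):
    """Scan sampled RX levels until the bus is idle or the timeout expires.

    Returns (result, bits_seen), where result is "reset_to_idle" once
    idle_bits consecutive recessive levels are seen, "fault_alarm" when
    timeout_bits elapse first, or "waiting" if the samples run out.
    """
    run = 0   # current run of consecutive recessive levels
    seen = 0  # total number of bits examined
    for level in samples:
        seen += 1
        run = run + 1 if level == RECESSIVE else 0
        if run >= idle_bits:
            return "reset_to_idle", seen
        if seen >= timeout_bits:
            return "fault_alarm", seen
    return "waiting", seen
```

For the input `[0, 1, 0]` followed by eleven recessive samples, the function returns `("reset_to_idle", 14)`, since the run of recessive levels only starts counting after the last dominant bit.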
[0109] S313. In the initial state, when the protocol timing stage tracking module detects, in the bus physical layer received signal, the level transition corresponding to the frame start flag, the protocol timing stage tracking module and the internal state machine of the CAN protocol controller are synchronously driven to the frame receiving state.
[0110] The initial state refers to the baseline starting state, corresponding to the bus idle phase, that the finite state machine of the protocol timing stage tracking module and the internal state machine of the CAN protocol controller occupy after the synchronous reset in step S312. The level transition corresponding to the frame start flag refers to the bus level change of the start of frame (SOF) field in the CAN protocol, that is, the falling edge on which the bus goes from the recessive level to the dominant level, indicating that a node on the bus has started sending a message. The frame receiving state refers to the working state into which the finite state machine of the protocol timing stage tracking module and the internal state machine of the CAN protocol controller are synchronously switched in order to track, bit by bit and in receive mode, each field of the CAN data frame currently being transmitted on the bus.
[0111] Step S313 is executed after the communication node device has completed the synchronous reset in step S312 and entered the initial state, where it waits for a trigger; the step uses the arrival of the next frame to establish frame-level synchronization. Specifically, the communication node device continuously monitors the RX pin signal in the initial state. Once the level transition corresponding to the frame start flag (a falling edge from the recessive level to the dominant level) is detected on the bus, the finite state machine of the protocol timing stage tracking module and the internal state machine of the CAN protocol controller are immediately and synchronously driven to the frame receiving state. Both then track the current frame bit by bit, starting from the SOF bit, while the bus output masking state is kept unchanged.
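As a non-limiting illustration, the trigger of step S313 can be modeled as follows. The class `LockstepFsms`, the enumeration `Stage`, and the attribute names are hypothetical and introduced only for this sketch; it assumes recessive is logic 1 and dominant is logic 0, and it models only the idle and frame receiving states.

```python
from enum import Enum

RECESSIVE, DOMINANT = 1, 0


class Stage(Enum):
    IDLE = "idle"          # initial state after the synchronous reset of S312
    FRAME_RX = "frame_rx"  # frame receiving state entered on SOF


class LockstepFsms:
    """Behavioral model: tracker FSM and controller FSM move together on SOF."""

    def __init__(self):
        self.tracker = Stage.IDLE       # protocol timing stage tracking module
        self.controller = Stage.IDLE    # internal state machine of the controller
        self.prev = RECESSIVE           # previous RX sample (bus idle is recessive)
        self.output_masked = True       # bus output masking stays on during S313

    def sample(self, bit):
        """Feed one RX sample; return True if it was taken as start of frame."""
        sof = (
            self.tracker is Stage.IDLE
            and self.prev == RECESSIVE
            and bit == DOMINANT          # falling edge: recessive to dominant
        )
        if sof:
            # Both state machines are driven in the same step, so they
            # enter the frame receiving state aligned to the same bit.
            self.tracker = Stage.FRAME_RX
            self.controller = Stage.FRAME_RX
        self.prev = bit
        return sof
```

Only the first falling edge seen in the initial state is treated as SOF in this model; later edges belong to the frame body and would be handled by the field tracking of step S314.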
[0112] S314. In frame receiving state, track the complete protocol field sequence of the currently received frame. When the frame end flag is successfully received and no error detection condition is triggered, release the bus output masking state of the CAN protocol controller.
[0113] The complete protocol field sequence refers to the complete sequence of all protocol fields (arbitration field, control field, data field, CRC field, ACK field, and frame end field) contained in the CAN frame from the start of frame (SOF) to the end of frame (EOF). By tracking the complete protocol field sequence field by field, the internal state machine context (such as the bit counters and the CRC calculation registers) can be updated synchronously, achieving precise bit-by-bit alignment with the bus timing. The frame end flag refers to the 7 consecutive recessive bits of the CAN protocol end of frame (EOF) field, indicating that the transmission of the current frame is complete. Error detection conditions refer to the validity check failures that can be triggered during frame reception, including CRC check errors, bit stuffing rule violations (Stuff Error), and frame format errors (Form Error).
[0114] Step S314 is executed after the communication node device enters the frame receiving state (e.g., triggered by step S313) or recovers to a protocol stage consistent with the current bus timing (e.g., triggered by step S311). It is the final verification step that determines whether the internal state machine has been fully aligned with the bus timing and can safely resume bus transmission. Specifically, in the frame receiving state, the communication node device tracks the complete protocol field sequence of the current frame bit by bit and performs the CRC calculation and format verification at the same time. If the frame end flag is successfully tracked and no error detection condition is triggered throughout the process, the internal state machine is determined to be fully aligned with the bus timing, the bus output masking state of the CAN protocol controller is immediately released, and the communication node device resumes normal transmission and reception. If any error detection condition is triggered during frame reception, the frame synchronization is determined to have failed, and the process returns to step S312 to wait for the bus to become idle again and start a new round of synchronous reset and recovery.
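As a non-limiting illustration, the decision made in step S314 can be sketched with the three error detection conditions named above. The function names and the result labels `"release_mask"` and `"S312"` are hypothetical. The sketch assumes that a hypothetical front end has already separated the raw stuffed bit region, the destuffed bits from SOF through the data field, the received CRC value, and the seven EOF bits; it uses the CAN CRC-15 generator polynomial 0x4599 and the CAN rule that six consecutive identical bits in the stuffed region constitute a stuff error.

```python
RECESSIVE, DOMINANT = 1, 0
CAN_CRC15_POLY = 0x4599  # CAN CRC-15 generator polynomial


def crc15(bits):
    """Bitwise CAN CRC-15 over destuffed bits (SOF through data field)."""
    crc = 0
    for bit in bits:
        feedback = bit ^ ((crc >> 14) & 1)  # incoming bit XOR register MSB
        crc = (crc << 1) & 0x7FFF           # shift left, keep 15 bits
        if feedback:
            crc ^= CAN_CRC15_POLY
    return crc


def has_stuff_error(raw_bits):
    """True if six consecutive identical bits occur in the stuffed region."""
    run, prev = 0, None
    for bit in raw_bits:
        run = run + 1 if bit == prev else 1
        prev = bit
        if run >= 6:
            return True
    return False


def s314_next_step(raw_stuffed, destuffed, rx_crc, eof_bits):
    """Return "release_mask" if the frame verified cleanly, else "S312"."""
    if has_stuff_error(raw_stuffed):
        return "S312"                        # Stuff Error
    if crc15(destuffed) != rx_crc:
        return "S312"                        # CRC check error
    if list(eof_bits) != [RECESSIVE] * 7:
        return "S312"                        # Form Error in the EOF field
    return "release_mask"                    # aligned: masking may be released
```

Any single failed check sends the model back to the idle wait of step S312, which mirrors the rule that masking is released only after one complete frame has been tracked without error.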
[0115] Optionally, in some embodiments, the communication node device may also send a test remote frame during the first bus idle period after the bus output masking is released, and verify the integrity of the physical layer signal it outputs by monitoring whether other nodes on the bus return a valid ACK response. If no valid ACK response is received, the device re-enters the bus output masking state and reports a physical layer fault alarm, so that degradation of physical layer signal quality caused by cumulative irradiation damage does not affect the overall communication reliability of the bus.
[0116] In this embodiment, when the intervals between error correction events show a decreasing trend and the current interval falls below the preset interval threshold, the device switches to delayed rewriting in advance, and it masks the bus output when a replica concurrency error occurs. It verifies the protocol timing stage identifier by comparing the collected level feature sequence with the expected signal features and, according to the comparison result, either writes the state directly or performs a synchronous reset in the bus idle state. It then releases the masking only after the complete protocol field sequence has been aligned in the frame receiving state. The embodiment thereby achieves early prediction of the irradiation deterioration trend, safe isolation under concurrent errors, and accurate state recovery based on the real bus timing. This addresses the problems in the related art of lagging protection and of the system being unable to recover communication automatically after multiple upsets under sudden strong irradiation, and enables robust anti-interference operation and self-healing recovery of the communication node device in a dynamic, high-frequency space irradiation environment.
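As a non-limiting illustration, the early switch to delayed rewriting can be sketched as a check on the timestamps of recent error correction events. The function name `should_switch_early` and the parameter `trend_len` (how many consecutive intervals must be decreasing) are assumptions made for this sketch; the text only requires "multiple consecutive" intervals and does not fix a number. Timestamps are in arbitrary time units.

```python
def should_switch_early(event_times, interval_threshold, trend_len=3):
    """Decide whether to switch to delayed rewriting ahead of the count threshold.

    event_times: ascending timestamps of error correction events in the window.
    interval_threshold: preset interval threshold, same unit as event_times.
    trend_len: hypothetical number of most recent intervals that must be
               strictly decreasing (must be at least 2 to express a trend).
    """
    # Durations between consecutive error correction events.
    intervals = [later - earlier
                 for earlier, later in zip(event_times, event_times[1:])]
    if len(intervals) < trend_len:
        return False  # not enough history to judge a trend
    recent = intervals[-trend_len:]
    decreasing = all(a > b for a, b in zip(recent, recent[1:]))
    # Both conditions from the text: decreasing trend AND current interval
    # below the preset interval threshold.
    return decreasing and recent[-1] < interval_threshold
```

For example, events at times 0, 100, 160 and 190 give intervals 100, 60 and 30; with an interval threshold of 50 the sketch requests the early switch, while evenly spaced events do not.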
[0117] The following describes, from the perspective of the underlying hardware logic implementation, the programmable logic device (such as an FPGA or CPLD) used in the communication node device of this application. Please refer to Figure 4, which is a schematic diagram of a physical hardware architecture of a programmable logic device in an embodiment of this application.
[0118] It should be noted that the programmable logic device architecture shown in Figure 4 is merely an example and should not impose any limitation on the functionality and scope of use of the embodiments of this application.
[0119] As shown in Figure 4, the programmable logic device mainly includes a configuration controller 401, a programmable logic array 402, on-chip memory resources 403, a clock management unit 404, and an input/output interface (I/O block) 405. The programmable logic array 402 contains a large number of programmable logic blocks (such as lookup tables, LUTs) and flip-flops (registers); the input/output interface 405 is used to connect to an external physical network (such as a CAN bus transceiver) to obtain the bus physical layer signals.
[0120] Unlike the serial execution of software instructions in a general-purpose processor, the method provided in this application embodiment is implemented through the parallel logic of hardware gate-level circuits. Specifically, this application embodiment also provides a computer program product, which includes RTL-level code written in a hardware description language (such as Verilog HDL or VHDL), or hardware configuration data (i.e., a bitstream file) generated from that code through logic synthesis, placement and routing.
[0121] The aforementioned hardware configuration data is stored in an external non-volatile storage medium (such as a configuration Flash device). When the communication node device is powered on or reset, the configuration controller 401 reads the hardware configuration data from the external storage medium and loads it into the configuration SRAM inside the programmable logic device.
[0122] After the hardware configuration data is loaded, the logic gate connections within the programmable logic array 402 and the on-chip memory resources 403 are configured accordingly, thereby instantiating on the physical circuit the functional hardware modules described in the preceding embodiments. These include a CAN protocol controller employing a multi-mode redundancy structure, an independently deployed protocol timing stage tracking module, and a rewrite control module. The clock management unit 404 then provides the underlying hardware clock drive, enabling the instantiated hardware modules to operate in parallel and to execute, purely in hardware logic, the steps of the communication anti-interference processing method for space irradiation environments described in the embodiments of Figure 2 or Figure 3, for example, hardware-level multi-mode redundancy voting, delayed rewriting of register data, and state transitions of the finite state machines.
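As a non-limiting illustration, the interaction between multi-mode redundancy voting and delayed rewriting mentioned above can be modeled in software. The actual modules are instantiated as hardware logic; the sketch below is only a behavioral model, and the names `vote`, `DelayedRewriteTmr`, and `in_frame` are hypothetical. It assumes three copies by default and treats the absence of a strict majority as a replica concurrency error.

```python
from collections import Counter


def vote(copies):
    """Majority vote over replica states.

    Returns (value, bad_indices). If no strict majority exists, returns
    (None, all indices), which the model treats as a concurrency error.
    """
    value, count = Counter(copies).most_common(1)[0]
    if count * 2 <= len(copies):
        return None, list(range(len(copies)))
    return value, [i for i, c in enumerate(copies) if c != value]


class DelayedRewriteTmr:
    """Behavioral model of delayed rewriting for a redundant state register."""

    def __init__(self, state, n=3):
        self.copies = [state] * n
        self.pending = set()  # abnormal copies whose rewrite is suspended

    def step(self, in_frame):
        """Vote once; repair abnormal copies only when no frame is in progress.

        in_frame stands in for the protocol timing stage identifier:
        True during frame transmission, False in frame interval or bus idle.
        """
        value, bad = vote(self.copies)
        if value is None:
            return None  # concurrency error: the caller would mask bus output
        self.pending.update(bad)
        if not in_frame and self.pending:
            for i in self.pending:
                self.copies[i] = value  # rewrite from the voted normal state
            self.pending.clear()
        return value
```

In this model an upset copy keeps its wrong value for as long as `in_frame` is true, while the voted output stays correct; the repair is applied on the first step taken outside a frame.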
[0123] In another aspect, this application also provides a computer-readable storage medium, which may be the aforementioned non-volatile configuration memory or a portable storage medium (such as a USB flash drive, optical disc, etc.) for burning or distributing the hardware configuration data. The storage medium stores hardware configuration data (bitstream file), and when the hardware configuration data is loaded into a programmable logic device, the programmable logic device executes a communication anti-interference processing method for space irradiation environments provided in the above embodiments of this application.
[0124] The above-described embodiments are only used to illustrate the technical solutions of this application, and are not intended to limit it. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of this application.
[0125] As used in the above embodiments, depending on the context, the term "when..." can be interpreted as meaning "if...", "after...", "in response to determining...", or "in response to detecting...". Similarly, depending on the context, the phrase "when determining..." or "if (the stated condition or event) is detected" can be interpreted as meaning "if determining...", "in response to determining...", "when (the stated condition or event) is detected", or "in response to detecting (the stated condition or event)".
[0126] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. This program can be stored in a computer-readable storage medium, and when executed, it can include the processes described in the above method embodiments. The aforementioned storage medium includes various media capable of storing program code, such as read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
Claims
1. A communication anti-interference processing method for space irradiation environments, applied to communication node equipment, characterized in that, The internal state machine of the CAN protocol controller in the communication node device adopts a multi-mode redundancy structure, and the method includes: The cumulative number of error correction events resulting in voting inconsistencies in the multi-mode redundancy structure within a preset time window is counted. When the cumulative number of error correction events exceeds a preset threshold, the rewrite repair triggering method of the multi-mode redundancy structure is switched from hardware clock-driven instant forced rewrite to delayed rewrite based on protocol timing stage determination. The protocol timing stage tracking module, deployed independently of the CAN protocol controller, continuously collects the level transitions of the bus physical layer received signals and generates a protocol timing stage identifier for the current CAN communication protocol based on the level transitions; the protocol timing stage tracking module is a finite state machine that maintains the transition relationships between the various communication stages of the CAN protocol. In the delayed rewriting mode, when an inconsistency is detected between the output replicas of the multi-mode redundancy structure, the protocol timing stage identifier currently output by the protocol timing stage tracking module is read. When there is a normal copy with consistent voting, and the protocol timing stage identifier indicates that the frame transmission process is currently underway, the rewriting operation of the abnormal copy with voting errors is suspended. After the protocol timing stage identifier jumps to the frame interval or bus idle phase, the abnormal copy is rewritten and repaired using the status data of the normal copy.
2. The method according to claim 1, characterized in that, After reading the protocol timing phase identifier currently output by the protocol timing phase tracking module, the method further includes: When a replica concurrency error occurs in the multi-mode redundancy structure, leading to voting inconsistency, the bus output of the CAN protocol controller is disabled. The protocol timing stage identifier currently maintained by the protocol timing stage tracking module is written into the status register of all replicas in the multi-mode redundancy structure, so that the CAN protocol controller is restored to the protocol stage consistent with the current bus timing.
3. The method according to claim 2, characterized in that, Before writing the protocol timing phase identifier currently maintained by the protocol timing phase tracking module into the status registers of all replicas in the multi-mode redundancy structure, the method further includes: Within a preset observation period, the level characteristic sequence of the signals received by the bus physical layer is collected; The level feature sequence is compared with the expected signal features corresponding to the protocol timing stage identifier; The write operation is performed when the level feature sequence matches the expected signal feature. When the level characteristic sequence is inconsistent with the expected signal characteristics, the bus output shielding state of the CAN protocol controller is maintained, and the bus physical layer signal is continuously monitored until the bus idle state is detected. The protocol timing stage tracking module and the internal state machine of the CAN protocol controller are synchronously reset to the initial state corresponding to the bus idle state.
4. The method according to claim 3, characterized in that, After synchronizing and resetting the protocol timing phase tracking module and the internal state machine of the CAN protocol controller to the initial state corresponding to bus idle, the method further includes: In the initial state, when the protocol timing phase tracking module detects a level transition corresponding to the frame start flag in the bus physical layer received signal, it synchronizes and drives the internal state machine of the protocol timing phase tracking module and the CAN protocol controller to the frame receiving state. In the frame receiving state, the complete protocol field sequence of the currently received frame is tracked. When the frame end flag is successfully received and no error detection condition is triggered, the bus output masking state of the CAN protocol controller is released.
5. The method according to claim 1, characterized in that, After suspending the rewriting operation of the abnormal copy with voting errors, the method further includes: The abnormal copy is shielded from the voting output logic of the multi-mode redundancy structure, so that the output of the CAN protocol controller is determined by the remaining normal copies through mutual comparison. During the shielding period, if inconsistencies occur in the output among the remaining normal copies, a replica concurrency error is determined to have occurred.
6. The method according to claim 5, characterized in that, After the abnormal copy is rewritten and repaired using the status data of the normal copy, the method further includes: The repaired abnormal copy is placed in the verification observation state, allowing it to run independently but not participate in the voting output of the multi-mode redundancy structure; During the verification observation period, the output of the abnormal copy is compared with the output of the normal copy participating in the voting on a cycle-by-cycle basis. If no output inconsistency is detected during the verification observation period, the abnormal copy will be restored to the voting output logic of the multi-mode redundancy structure; If an output inconsistency is detected during the verification observation period, the abnormal copy is rewritten and repaired again, and the verification observation state is re-entered.
7. The method according to claim 1, characterized in that, After counting the cumulative number of error correction events resulting in voting inconsistencies in the multi-mode redundancy structure within the preset time window, the method further includes: The duration of the event interval between two consecutive error correction events within the preset time window is recorded; When the duration of multiple consecutive event intervals shows a decreasing trend, and the current event interval duration is lower than the preset interval threshold, the rewrite repair triggering method is switched to delayed rewrite based on protocol timing stage determination in advance, provided that the cumulative number of error correction events has not reached the preset threshold.
8. A communication node device, characterized in that, The communication node device includes a programmable logic device, which is configured to implement the method as described in any one of claims 1-7 by loading configuration data; the programmable logic device internally instantiates a CAN protocol controller with a multi-mode redundancy structure, an independently deployed protocol timing stage tracking module, and a rewrite control module.
9. A computer-readable storage medium storing hardware configuration data, characterized in that, When the hardware configuration data is loaded into the programmable logic device, the programmable logic device performs the method as described in any one of claims 1-7.
10. A computer program product, comprising a computer program, characterized in that, The computer program includes hardware description language code or a bitstream file compiled therefrom, and when the computer program is configured on a programmable logic device, the programmable logic device performs the method as described in any one of claims 1-7.