Soft error real-time detection and recovery method and system for online parallel processing

A parallel processing and recovery technology, applied to data error detection, error detection/correction, and redundant-code error detection (redundancy in operation). It addresses problems such as hidden equipment failures or abnormal protection and control functions, and improves robustness, stability, and functionality.

Active Publication Date: 2020-12-08
NARI TECH CO LTD +1

AI Technical Summary

Problems solved by technology

[0012] The purpose of the present invention is to overcome the deficiencies in the prior art, and provide an online parallel processing soft error real-time error detection and recovery method and sy


Examples

Embodiment 1

[0086] The online parallel-processing method of the present invention for real-time soft error detection and recovery includes the following steps:

[0087] Divide the protected RAM space into multiple protected areas;

[0088] Assign every protected area to one of one or more levels. Areas at the highest level complete one error detection and recovery pass in every interrupt cycle; areas at the other levels complete one pass over multiple interrupt cycles;

[0089] Register the protected areas of each level to generate one linked list per level; the linked lists are located in the protected RAM space. Back up each linked list, and the protected areas it contains, to other RAM spaces in at least two copies, A and B. Each linked list records the position and length of every protected area of its level, together with the positions of the A and B backups;

[0090] Parallel processing of error de...
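The registration steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the names `ProtectedArea`, `Registry`, and `register` are hypothetical, a Python list per level stands in for each linked list, and the backup of the list itself (which the source also requires) is omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class ProtectedArea:
    """One list entry: where the area lives and where its backups are."""
    position: int   # offset of the area inside the protected RAM
    length: int     # size of the area in bytes
    backup_a: int   # offset of backup copy A inside the backup RAM
    backup_b: int   # offset of backup copy B inside the backup RAM


@dataclass
class Registry:
    """Hypothetical registry: one list of protected areas per level."""
    ram: bytearray                               # protected RAM (assumed model)
    backup: bytearray                            # separate RAM holding the copies
    levels: dict = field(default_factory=dict)   # level -> list of ProtectedArea
    next_free: int = 0                           # next unused offset in backup RAM

    def register(self, level: int, position: int, length: int) -> ProtectedArea:
        if position < 0 or length <= 0 or position + length > len(self.ram):
            raise ValueError("area lies outside the protected RAM")
        if self.next_free + 2 * length > len(self.backup):
            raise ValueError("not enough backup RAM for copies A and B")
        a = self.next_free
        b = a + length
        self.next_free = b + length
        data = self.ram[position:position + length]
        self.backup[a:a + length] = data   # write backup copy A
        self.backup[b:b + length] = data   # write backup copy B
        area = ProtectedArea(position, length, a, b)
        self.levels.setdefault(level, []).append(area)
        return area
```

Calling `Registry(ram, backup).register(level=0, position=8, length=4)` would copy four bytes starting at offset 8 into two consecutive slots of the backup memory and append the entry to the level-0 list.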

Embodiment 2

[0095] To realize the parallel data processing function of the parallel processing modules (such as FPGA processing modules, hereinafter collectively referred to as FPGA parallel processing modules), the FPGA parallel processing modules in the system need to access the RAM of the CPU system through a high-speed interface such as PCIe, SRIO, or HyperLink. At the same time, to minimize the impact on the CPU system while the FPGA parallel processing module assists in processing, the FPGA parallel processing module needs its own separately controlled memory for backup data, such as a separate DDR, SRAM, or the internal RAM of the FPGA parallel processing module. If the CPU's RAM space and the high-speed interface bandwidth are both sufficient, a separate region of the CPU RAM can instead be set aside for the FPGA parallel processing module to store backup data.

[0096] On the hardware modu...

Embodiment 3

[0186] The present invention also provides an online parallel-processing device for real-time soft error detection and recovery, including a linked list management module and an error detection and recovery module, wherein:

[0187] The linked list management module is used to divide the protected RAM space into multiple protected areas; to assign every protected area to one of one or more levels, where areas at the highest level complete one error detection and recovery pass in every interrupt cycle and areas at the other levels complete one pass over multiple interrupt cycles; to register the protected areas of each level to generate one linked list per level, the linked lists being located in the protected RAM space; and to back up at least two copies of each linked list and of the protected areas in the linked list to other RAM space; the linked list content includes the position of each protected area of the same level, the length and the position of each ...
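The level rule described above (highest level checked in every interrupt cycle, other levels finished over several cycles) can be illustrated with a small scheduling sketch. The source does not say how the lower-level work is spread across cycles; the round-robin slicing below, the function name `areas_due`, and the `periods` mapping are all assumptions made for illustration.

```python
def areas_due(levels, periods, cycle):
    """Return the protected areas to check in the given interrupt cycle.

    levels  -- mapping level -> list of areas (any objects)
    periods -- mapping level -> number of interrupt cycles allowed for one
               full pass; 1 means "every cycle" (the highest level)
    cycle   -- running interrupt-cycle counter, starting at 0

    Assumed policy: area number i of a level with period p is checked in
    the cycles where i % p == cycle % p, so each area of that level is
    visited exactly once in every window of p consecutive cycles.
    """
    due = []
    for level, areas in levels.items():
        p = periods[level]
        due.extend(a for i, a in enumerate(areas) if i % p == cycle % p)
    return due
```

With a period of 1 the condition is always true, so every highest-level area is returned in every cycle; a larger period trades detection latency for less work per interrupt.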

Abstract

The invention discloses a soft error real-time error detection and recovery method and system for online parallel processing. The method comprises the following steps: dividing a protected RAM space into a plurality of protected areas; assigning all protected areas to one or more levels, wherein the highest level completes one error detection and recovery pass in every interrupt cycle and the other levels complete one pass over multiple interrupt cycles; registering the protected areas of each level to generate one linked list per level, and backing up at least two copies of each linked list and of the protected areas in the linked list to other RAM spaces; and processing the linked lists of all levels in parallel, performing error detection and recovery on each protected area in each linked list. With this method, data of a high importance level can be verified, judged, corrected, and recovered within a single interrupt cycle of a control system in critical scenarios. The method does not depend on the CPU and can run as real-time parallel processing, and it can detect and correct errors online in real time even when bits at multiple positions of the CPU's RAM flip abnormally at the same time.
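One common way to detect and repair an area that has an original and two backups is a bitwise two-out-of-three vote. The visible text does not state which comparison rule the invention uses, so the sketch below is an assumed technique shown only to make the idea concrete; the function name `vote_and_repair` and its parameters are hypothetical, and in the described system this work would run in the parallel processing module rather than in Python on the CPU.

```python
def vote_and_repair(ram, backup, position, length, backup_a, backup_b):
    """Repair one protected area by bitwise 2-of-3 majority voting.

    ram, backup        -- bytearrays modelling protected RAM and backup RAM
    position, length   -- location of the protected area in ram
    backup_a, backup_b -- offsets of copies A and B in backup

    Returns the number of byte positions where at least one of the three
    copies disagreed with the voted value. A flipped bit is corrected as
    long as the same bit is not flipped in two of the three copies.
    """
    repaired = 0
    for i in range(length):
        p = ram[position + i]
        a = backup[backup_a + i]
        b = backup[backup_b + i]
        voted = (p & a) | (p & b) | (a & b)   # per-bit majority of p, a, b
        if p != voted or a != voted or b != voted:
            repaired += 1
            ram[position + i] = voted         # restore the original
            backup[backup_a + i] = voted      # restore copy A
            backup[backup_b + i] = voted      # restore copy B
    return repaired
```

Because the vote is taken per bit, flips at several positions in the same pass are handled, provided no single bit is corrupted in more than one copy.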

Description

Technical field

[0001] The invention belongs to the technical field of error detection for abnormal bit flips in RAM, and in particular relates to a soft error real-time error detection and recovery method and system for online parallel processing.

Background

[0002] As microprocessor technology develops toward low power consumption, low voltage, and high integration, the impact of abnormal bit flips in RAM (Random Access Memory), also called soft errors or single event effects, on system security and stability cannot be ignored. The main causes of RAM anomalies are: (1) Alpha particle radiation. Alpha particles emitted by the packaging material used by the processor can flip bits in the chip's storage area; (2) Increasing integrated-circuit scale. The size of the transistor is getting smaller and the frequency is getting higher and higher, but the limit voltage of the transistor is getting lower and...

Application Information

IPC(8): G11C29/44, G06F11/10, G06F11/14
CPC: G11C29/44, G06F11/1004, G06F11/1469, G06F11/106, G06F2201/83, G06F11/1448, G11C2029/0411, G11C2029/0409, G11C29/52, G11C29/24, G11C29/74, G06F11/004
Inventor: 周华良, 郑玉平, 徐广辉, 李友军, 刘拯, 邹志杨, 姜雷, 高诗航, 汪世平, 张家森
Owner NARI TECH CO LTD