Fast parallel recomputing method aiming at data parallel computing fault tolerance

A data-parallel computing technology, applied in the field of error response, that addresses wasted computing resources and reduced parallel efficiency, and improves processing efficiency.

Active Publication Date: 2015-11-04
苏州徽之源人力科技有限公司

AI Technical Summary

Problems solved by technology

Current mainstream software fault-tolerance strategies rely on time redundancy: a failed node must recover and rerun its task, and the recovery time is greater than the interval between the previous checkpoint and the moment of failure. Tasks that depend on the failed task therefore wait a long time, which reduces parallel efficiency and wastes computing resources.



Examples


Embodiment

[0064] An embodiment of the present invention provides a fast parallel recomputing method based on a redundant computing mechanism. As shown in Figure 6, it includes the following steps:

[0065] Step 101: Read in and distribute the data. First, the master node process reads in the data and starts the corresponding threads according to the data division strategy. Each thread then distributes its data block to two slave node processes according to the secondary redundancy computing strategy, so that the slope algorithm of digital terrain analysis is computed redundantly on both. Finally, each thread waits for the computation results.
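The distribution in Step 101 can be illustrated with a minimal Python sketch. The function name assign_redundant and the round-robin placement are assumptions made for illustration; the text only states that each data block goes to two slave node processes.

```python
def assign_redundant(num_blocks, num_workers):
    """Map each block index to two distinct worker ranks.

    Hypothetical placement: round-robin primary, next rank as replica.
    The source does not specify how the two slave processes are chosen.
    """
    if num_workers < 2:
        raise ValueError("redundant computation needs at least two workers")
    plan = {}
    for block in range(num_blocks):
        primary = block % num_workers
        replica = (block + 1) % num_workers  # differs from primary when num_workers >= 2
        plan[block] = (primary, replica)
    return plan


if __name__ == "__main__":
    for block, workers in assign_redundant(4, 3).items():
        print(block, workers)
```

Any placement works as long as the two copies of a block land on different processes, so that a fault in one process cannot corrupt both results.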

[0066] Step 102: Redundant computation. Each slave node process starts computing as soon as it receives its data block, which is logically divided into 4 logical sub-blocks. After a logical sub-block has been computed, the process sends that sub-block's result to...
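The logical division in Step 102 might look like the following Python sketch. The names split_into_subblocks and compute_subblocks, the row-wise split, and the kernel argument are all assumptions; the source text is truncated before it says where each sub-block result is sent, so the sketch simply yields results one sub-block at a time.

```python
def split_into_subblocks(rows, parts=4):
    """Logically divide a block (a list of rows) into `parts` contiguous sub-blocks."""
    base, extra = divmod(len(rows), parts)
    subblocks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first sub-blocks
        subblocks.append(rows[start:start + size])
        start += size
    return subblocks


def compute_subblocks(rows, kernel, parts=4):
    """Yield (sub-block index, result) as soon as each sub-block is computed.

    `kernel` stands in for the real computation (e.g. a slope algorithm).
    """
    for index, sub in enumerate(split_into_subblocks(rows, parts)):
        yield index, kernel(sub)
```

Yielding per sub-block rather than per block means a receiver can begin checking results before the whole block has finished.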



Abstract

The invention belongs to the technical field of fault tolerance for parallel systems. It relates to a parallel recomputing method that uses redundant computation to detect and correct errors in computing tasks, and in particular provides a fast parallel recomputing method that logically divides and reschedules the data block corresponding to an erroneous task. The method comprises the following steps: performing error detection on the computed result of each data block based on a redundant computing strategy; and performing recomputation based on multi-threaded error detection and process-level recomputation for error correction. The method is applicable to fault-tolerant processing in high-performance parallel digital terrain analysis of large-scale data, such as terrain factor extraction by regular-grid parallel interpolation, parallel slope and aspect computation, and parallel depression filling and leveling. It can be applied to high-performance computing for geographic information processing, as well as to spatial decision analysis and data mining based on geographic information, and it improves processing efficiency.
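The detect-then-recompute idea in the abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: the names detect_errors, recompute and correct are hypothetical, errors are detected by plain equality between the two redundant results, the recomputed value is accepted as correct, and a thread pool stands in for the slave processes that the patent reschedules.

```python
from concurrent.futures import ThreadPoolExecutor


def detect_errors(results_a, results_b):
    """Return indices of sub-blocks whose two redundant results disagree."""
    return [i for i, (a, b) in enumerate(zip(results_a, results_b)) if a != b]


def recompute(subblocks, bad, kernel, max_workers=4):
    """Recompute only the disagreeing sub-blocks, in parallel."""
    if not bad:
        return {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        fresh = list(pool.map(kernel, [subblocks[i] for i in bad]))
    return dict(zip(bad, fresh))


def correct(subblocks, results_a, results_b, kernel):
    """Keep agreed results; replace disagreeing ones with recomputed values."""
    fixed = recompute(subblocks, detect_errors(results_a, results_b), kernel)
    return [fixed.get(i, results_a[i]) for i in range(len(subblocks))]
```

Because only the mismatched sub-blocks are recomputed, the cost of correcting an error is a fraction of rerunning the whole data block.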

Description

Technical field

[0001] The invention belongs to the technical field of fault tolerance for parallel computing systems. It relates to using redundancy strategies to quickly correct errors in computing tasks, and in particular proposes a fast parallel recomputing method for fault tolerance in data-parallel computing.

Background

[0002] Fault tolerance in parallel computer systems is a problem that cannot be ignored. A parallel system is fault-tolerant if its programs still run correctly and produce correct results even when logical failures occur.

[0003] In recent years, with the increasing complexity of computer system structure, the development of semiconductor manufacturing processes, the reduction of line width and the improvement of integration, power consumption and reliability issues, from user desktop systems to distributed computing environments and even large-scale parallel computer systems, are becoming increasingly pr...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F11/07
Inventor: 窦万峰, 苗守帅
Owner: 苏州徽之源人力科技有限公司