A Checkpoint Placement Method Based on Node Failure Correlation

A checkpoint placement technology based on failure correlation, applied in the direction of complex mathematical operations, informatics, instruments, etc. It addresses the problem that failure information has not been analyzed effectively or applied to fault-tolerance techniques, and it achieves a more reasonable placement strategy, optimized placement locations, and reduced completion time.

Active Publication Date: 2018-08-07
南京臻融科技有限公司

AI Technical Summary

Problems solved by technology

Failure data collected in a system is often used for system back-testing and failure analysis, but it has not been analyzed effectively or applied to fault-tolerance techniques.


Examples

Embodiment Construction

[0027] The present invention is further illustrated below with reference to the accompanying drawings and specific embodiments. It should be understood that these embodiments only illustrate the invention and are not intended to limit its scope. After reading the present invention, modifications in equivalent forms made by those skilled in the art all fall within the scope defined by the appended claims of this application.

[0028] In the present invention, the checkpoint placement method based on node failure correlation (CPNFC, Checkpoint Placement based on Node Failure Correlation) executes a checkpoint before a node fails, which can effectively shield the system from correlated failures. Correlation between nodes means: if a failure event occurs on node P1, then the nodes associated with P1 will soon fail as well.
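The notion of correlation in [0028] can be made concrete with a small sketch. The patent's own correlation formula is not shown in this excerpt, so the measure below is only an assumption for illustration: it counts the fraction of failures on one node that are followed by a failure on another node within a time window. The function name `failure_correlation`, the `window` parameter, and the sample timestamps are all hypothetical.

```python
from bisect import bisect_right


def failure_correlation(failures_a, failures_b, window):
    """Illustrative measure (an assumption, not the patent's formula):
    the fraction of failures on node A that are followed by a failure
    on node B within `window` time units."""
    if not failures_a:
        return 0.0
    b_sorted = sorted(failures_b)
    followed = 0
    for t in failures_a:
        # Index of the first failure on B that occurs strictly after t.
        i = bisect_right(b_sorted, t)
        if i < len(b_sorted) and b_sorted[i] - t <= window:
            followed += 1
    return followed / len(failures_a)


# Hypothetical failure timestamps for two nodes.
p1_failures = [10.0, 50.0, 90.0]
p2_failures = [12.0, 53.0, 200.0]
corr = failure_correlation(p1_failures, p2_failures, window=5.0)
print(corr)
```

With these sample timestamps, two of the three failures on P1 are followed by a failure on P2 within the window, so a high value would suggest checkpointing P2 as soon as P1 fails.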

[0029] The following first introduces the model of the CPNFC met...

Abstract

The invention discloses a checkpoint placement method based on node failure correlation. The method comprises the following steps: first, an expression function for the system's wasted time is established; failure data in the system is obtained, and the correlation between any two nodes is derived from the correlation between all failure events on those nodes; the nodes are grouped according to how the inter-node correlation compares with a preset threshold, yielding node correlation groups; the events of the nodes in each node correlation group are grouped according to the correlation between events, yielding correlation group events and, from them, system events; finally, the checkpoint periods are calculated from the probability distribution of the time intervals between failure events. By analyzing the correlation between node failures, the method predicts when failure events will occur, and a failure-event-driven checkpoint strategy reduces the wasted time of distributed systems.
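The node grouping step in the abstract can be sketched as follows. The exact grouping rule is not given in this excerpt, so this is a minimal sketch under one assumption: two nodes belong to the same group whenever their pairwise correlation meets the preset threshold, and group membership is transitive. The function `group_nodes`, the correlation values, and the threshold are hypothetical.

```python
def group_nodes(nodes, correlation, threshold):
    """Group nodes whose pairwise correlation is at least `threshold`.

    Assumption: grouping is transitive, so the groups are the connected
    components of the graph with an edge for every qualifying pair.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for (a, b), c in correlation.items():
        if c >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical pairwise correlations between four nodes.
nodes = ["P1", "P2", "P3", "P4"]
correlation = {("P1", "P2"): 0.8, ("P2", "P3"): 0.7, ("P3", "P4"): 0.1}
groups = group_nodes(nodes, correlation, threshold=0.5)
print(groups)
```

Here P1, P2, and P3 end up in one group and P4 stays alone. A stricter rule, such as requiring every pair inside a group to exceed the threshold, would produce different groups; the source does not say which rule applies.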

Description

Technical field

[0001] The invention belongs to the field of distributed computing and relates to fault-tolerant computing and checkpoint technology for distributed systems, in particular to a method for placing checkpoints based on node failure correlation.

Background technique

[0002] To meet the demand for computing power in scientific computing and the simulation of large-scale problems, computer systems have moved toward massively parallel platforms over the past decade. As distributed systems grow in scale and the interactions between their components become more complex, their failure frequency increases significantly. In large-scale scientific computing or simulation, especially in the era of big data, information must be mined from massive amounts of data, and the mining computation may need to run for days or even months on the computing platform. Frequent failures lead to the loss of previo...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/11
CPC: G16Z99/00
Inventors: 黄志勇, 汪芸
Owner: 南京臻融科技有限公司