Supercomputer-based optimization method for application-level multi-layer checkpoints

A supercomputer-based optimization method, applied in the fields of data error detection, computing, and multiprogramming arrangements. It addresses the lack of generality of existing methods, with the effects of avoiding global communication overhead, reducing time overhead, and avoiding shared-memory data failure.

Active Publication Date: 2020-04-10
XI AN JIAOTONG UNIV

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to overcome the disadvantage that existing multi-layer checkpoint optimization methods depend on the hardware characteristics of the supercomputer and therefore often lack generality, by providing a supercomputer-based application-level multi-layer checkpoint optimization method.



Embodiment Construction

[0052] To enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

[0053] It should be noted that the terms "first" and "second" in the description, the claims, and the above drawings are used to distinguish similar objects, and not necessarily to describe a specific order or sequence. It is to be understood that the data so used are interchangeable under appropriate ...


Abstract

The invention discloses an optimization method for application-level multi-layer checkpoints based on a supercomputer, belonging to the field of computer system architecture and high-performance computing. The method comprises the following steps: 1) replacing faulty nodes; 2) determining the placement time sequence of each checkpoint layer; 3) dividing the hardware nodes and the processes running on them into groups; 4) determining the type of fault that has occurred in a process group using a fault-type judgment algorithm; 5) applying the recovery strategy corresponding to that fault type within the process group; 6) when a time point in a layer's checkpoint placement time sequence is reached, applying the corresponding checkpoint strategy to store the intermediate state data; and 8) continuing to run: if the run has not finished, returning to step 6), otherwise ending checkpoint placement. The method overcomes the defect that existing multi-layer checkpoint optimization methods depend on the hardware characteristics of a particular supercomputer and therefore often lack generality.
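The scheduling and grouping steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented algorithm: the excerpt does not say how the placement time sequence is computed, so the sketch assumes a fixed period per layer, and the names `Level`, `placement_sequence`, `divide_into_groups`, and the layer labels `local`, `partner`, and `global` are all hypothetical. Fault-type judgment and recovery (steps 4 and 5) are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    """One checkpoint layer (hypothetical model, not from the patent)."""
    name: str      # illustrative label, e.g. "local" or "global"
    interval: int  # assumed fixed period, in iterations, between checkpoints


def placement_sequence(levels, total_steps):
    """Return {step: layer name} for every step at which a checkpoint is due.

    `levels` must be ordered from cheapest to most durable. When several
    layers are due at the same step, only the most durable one is taken.
    """
    schedule = {}
    for step in range(1, total_steps + 1):
        due = [lv for lv in levels if step % lv.interval == 0]
        if due:
            schedule[step] = due[-1].name
    return schedule


def divide_into_groups(node_ids, group_size):
    """Split node ids into consecutive groups of at most `group_size`."""
    return [node_ids[i:i + group_size]
            for i in range(0, len(node_ids), group_size)]


def run(total_steps, levels, save):
    """Compute loop that calls `save(layer, step, state)` at scheduled steps."""
    schedule = placement_sequence(levels, total_steps)
    state = 0
    for step in range(1, total_steps + 1):
        state += step  # stand-in for one iteration of real computation
        if step in schedule:
            save(schedule[step], step, state)
    return state


if __name__ == "__main__":
    layers = [Level("local", 2), Level("partner", 4), Level("global", 8)]
    print(placement_sequence(layers, 8))
    print(divide_into_groups(list(range(5)), 2))
```

Taking only the most durable layer when several coincide is one possible design choice; it avoids writing the same state twice at a shared time point.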

Description

Technical field

[0001] The invention belongs to the field of computer system architecture and high-performance computing, and in particular relates to a supercomputer-based optimization method for application-level multi-layer checkpoints.

Background

[0002] Checkpointing is the most typical fault-tolerance technique in large-scale scientific computing programs. Its core idea is to save the program's latest running state in a checkpoint and, when a failure occurs, restore the program to that state by reading the checkpoint data. Traditional checkpointing achieves fault tolerance by periodically saving intermediate state data to the global file system. However, as high-performance computer systems grow larger, the mean time between failures (MTBF) that the system can provide gradually shortens, and checkpointing faces serious performance problems.

[0003] According to the probability o...
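The traditional single-level checkpoint idea described in the background (periodically save state, reload it after a failure) can be sketched in Python as follows. This is an illustration only, not the patent's method: a local pickle file stands in for the global file system, and the function names and the `fail_at` parameter are hypothetical.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write state to a temporary file, then atomically rename it into place."""
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dir_name)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, path)  # a crash mid-write never corrupts the old checkpoint


def load_checkpoint(path, default):
    """Return the saved state, or `default` if no checkpoint exists yet."""
    if not os.path.exists(path):
        return default
    with open(path, "rb") as f:
        return pickle.load(f)


def run(path, total_steps, fail_at=None):
    """Sum 1..total_steps, checkpointing after every step.

    `fail_at` (hypothetical) raises an error when that many steps have
    completed, to simulate a node failure.
    """
    state = load_checkpoint(path, {"step": 0, "total": 0})
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated failure")
        state["step"] += 1
        state["total"] += state["step"]
        save_checkpoint(path, state)
    return state["total"]
```

Calling `run` again with the same path after a failure resumes from the last saved step instead of starting over, which is the behavior the background paragraph describes.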


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F11/14, G06F9/54
CPC: G06F11/1458, G06F9/544, Y02D10/00
Inventor: 张兴军, 周剑锋, 董小社, 李靖波, 鲁晨欣, 张楚华
Owner: XI AN JIAOTONG UNIV