On-line fault tolerance method in cluster data processing system

The invention concerns data processing systems and cluster technology and is applied in the field of ground remote sensing satellite data processing. It speeds up fault tolerance and achieves a wide application range, improved work efficiency, and reduced storage space.

Active Publication Date: 2014-03-26
SPACE STAR TECH CO LTD

AI Technical Summary

Problems solved by technology

Much research has been done in this area, but some issues still merit further study: first, how to further reduce the amount of data saved in checkpoints and thereby reduce storage overhead;

Embodiment Construction

[0033] Specific embodiments of the present invention will be further described in detail below in conjunction with the accompanying drawings.

[0034] As shown in figure 1, the online fault tolerance method in a cluster data processing system of the present invention takes the computing node as the smallest unit of fault location and the file fragment as the smallest unit of fault checkpointing, and uses a database and high-speed storage devices to record the unique state of the data and nodes across the entire system, thereby providing a way to achieve fault tolerance.

[0035] The cluster data processing system framework on which the present invention is based divides all nodes in the cluster into management nodes and computing nodes. In the processing flow, each computing link is distributed across multiple computing nodes for parallel processing, so that all computing links run at the same time, and the links are connected in series to form a task flow.
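The framework in [0035] can be pictured with a minimal Python sketch. It is an illustration only, not the patent's implementation: the `Stage` class, the `build_task_flow` helper, the node and fragment names, and the round-robin distribution are all assumptions, since the source does not say how work is spread across the computing nodes of a link.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Stage:
    """One computing link whose work is spread over several computing nodes."""
    name: str
    nodes: List[str]

    def assign(self, fragments: List[str]) -> Dict[str, List[str]]:
        # Round-robin distribution is an assumption made for this sketch.
        plan: Dict[str, List[str]] = {node: [] for node in self.nodes}
        for i, fragment in enumerate(fragments):
            plan[self.nodes[i % len(self.nodes)]].append(fragment)
        return plan


def build_task_flow(stages: List[Stage]) -> List[Tuple[Stage, Stage]]:
    """Pair each link with its successor to represent the serial connection."""
    return list(zip(stages, stages[1:]))


# Hypothetical three-link flow; all names are invented for illustration.
decode = Stage("decode", ["node-1", "node-2"])
correct = Stage("correct", ["node-3", "node-4"])
package = Stage("package", ["node-5"])
flow = build_task_flow([decode, correct, package])
plan = decode.assign(["frag-0", "frag-1", "frag-2"])
```

Here each `Stage` stands for one computing link, and the pairs returned by `build_task_flow` stand for the hand-off between consecutive links in the task flow.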

[0036] As shown in figure...

Abstract

The invention discloses an on-line fault tolerance method in a cluster data processing system. The method comprises the following steps: first, the previous-level processing node stores its processing result in the form of file fragments; second, the next-level processing node reads the file fragments and continues processing; third, a database records the identifiers of the file fragments processed on every node; fourth, when a node fault is detected, a new node is started to take over the work of the faulty node; fifth, the new node reads from the database the file fragments that were on the faulty node and restores the state at the point of failure. Fault tolerance during data processing is thereby achieved.
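The five steps in the abstract can be sketched roughly as follows. This is a hedged illustration, not the patented implementation: an in-memory SQLite table stands in for the database, and the `FragmentLedger` class, its method names, the `progress` table, and the assigned/done bookkeeping are all assumptions introduced here. The source only states that the database records the file fragment identifiers processed on each node.

```python
import sqlite3
from typing import List


class FragmentLedger:
    """Hypothetical record of which node holds which file fragment."""

    def __init__(self) -> None:
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE progress (node TEXT, fragment TEXT, done INTEGER)"
        )

    def assign(self, node: str, fragment: str) -> None:
        # Step 3: note that this node is now working on this fragment.
        self.db.execute("INSERT INTO progress VALUES (?, ?, 0)", (node, fragment))

    def mark_done(self, node: str, fragment: str) -> None:
        self.db.execute(
            "UPDATE progress SET done = 1 WHERE node = ? AND fragment = ?",
            (node, fragment),
        )

    def unfinished(self, node: str) -> List[str]:
        rows = self.db.execute(
            "SELECT fragment FROM progress WHERE node = ? AND done = 0 "
            "ORDER BY rowid",
            (node,),
        )
        return [row[0] for row in rows]

    def take_over(self, failed: str, replacement: str) -> List[str]:
        # Steps 4 and 5: the replacement node looks up the failed node's
        # unfinished fragments and becomes their owner.
        pending = self.unfinished(failed)
        self.db.execute(
            "UPDATE progress SET node = ? WHERE node = ? AND done = 0",
            (replacement, failed),
        )
        return pending


ledger = FragmentLedger()
for frag in ["frag-0", "frag-1", "frag-2"]:
    ledger.assign("node-1", frag)
ledger.mark_done("node-1", "frag-0")
resumed = ledger.take_over("node-1", "node-9")  # node-1 is assumed to have failed
```

Because the checkpoint unit is a whole file fragment, the replacement node in this sketch only needs the list of unfinished fragment identifiers to resume; it does not need any in-memory state from the failed node.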

Description

Technical field

[0001] The invention relates to an online fault tolerance method in a cluster data processing system, which is mainly used for the adaptive fault tolerance of the cluster data processing system during task execution, improves system reliability, and belongs to the field of ground remote sensing satellite data processing.

Background technique

[0002] With the widespread use of large-scale cluster computer systems, data processing platforms are usually built based on cluster technology in the fields of aerospace, military, and scientific computing. The platform is composed of a large number of computing nodes connected with high-speed networks to achieve high-speed processing of massive data.

[0003] However, the requirements for data scale, computational complexity, and business running time in the fields of aerospace, military, and scientific computing have been maintained at a high level. With the increasing number of hardware nodes and the increasingly com...


Application Information

IPC(8): G06F11/20
Inventor: 高越, 陈彦斌, 刘焱, 吴唯然, 孟祥国
Owner: SPACE STAR TECH CO LTD