Cluster node fault processing method and system

A cluster node fault handling method, applied in the field of communications, that addresses problems such as split-brain, the cluster's inability to provide external services, and the resulting impact on client business. Its effects are improved stability, better adaptability to different scenarios, and avoidance of split-brain risk.

Active Publication Date: 2021-08-31
SUZHOU LANGCHAO INTELLIGENT TECH CO LTD

AI Technical Summary

Problems solved by technology

In the existing technology, methods for selecting the fault handling node often produce split-brain, which leaves the cluster unable to provide services externally and seriously affects client business.


Examples


Embodiment 1

[0072] As shown in Figure 1, this embodiment provides a fault processing method for a cluster node, including the following steps:

[0073] S1: Add a node information database to the cluster, obtain information on all nodes in the cluster, and store it in the node information database.

[0074] The data stored in the node information database includes: node service information, node information, cluster information, and a fault processing node entry. The node service information includes the service name, service startup time, and service status information; the node information includes the node start time, the CPU usage of the node, and the number of clients connected to the node; the cluster information includes the number of nodes in the cluster, node status information, and cluster status information; the fault processing node entry stores the label of the current fault processing node.
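
The four groups of data listed above could be modelled roughly as follows. This is only a minimal sketch: the class names, field names, units, and status strings are all assumptions made for illustration, and the patent text does not specify a storage format.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ServiceInfo:
    name: str
    start_time: float  # epoch seconds (assumed unit)
    status: str        # e.g. "running" (assumed encoding)


@dataclass
class NodeInfo:
    start_time: float  # epoch seconds (assumed unit)
    cpu_usage: float   # percent (assumed unit)
    client_count: int  # number of clients connected to the node


@dataclass
class ClusterInfo:
    node_count: int
    node_status: Dict[str, str] = field(default_factory=dict)
    cluster_status: str = "unknown"


@dataclass
class NodeInfoDatabase:
    services: Dict[str, ServiceInfo] = field(default_factory=dict)
    nodes: Dict[str, NodeInfo] = field(default_factory=dict)
    cluster: ClusterInfo = field(default_factory=lambda: ClusterInfo(node_count=0))
    # Label of the node currently chosen to handle faults, if any.
    fault_handler_label: Optional[str] = None


# Hypothetical usage: register one node and mark it as the fault handler.
db = NodeInfoDatabase()
db.nodes["node-1"] = NodeInfo(start_time=1000.0, cpu_usage=12.5, client_count=3)
db.cluster = ClusterInfo(node_count=1, node_status={"node-1": "online"},
                         cluster_status="healthy")
db.fault_handler_label = "node-1"
```

Keeping the fault handler label as a single field in the same store is what lets the cluster read it directly when a fault occurs.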

[0075] S2: When the cluster starts and a client connects to a cluster node, the node information database is updat...

Embodiment 2

[0089] This embodiment also provides a fault processing method of a cluster node, including:

[0090] 1. After the nodes are started, the cluster adds a node information database and starts a timed event that obtains information on each node and stores it in the node information database, keeping the database up to date.
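
A timed event of this kind might be sketched with a background thread, as below. The `PeriodicUpdater` class, the `collect_node_info` collector, and the dictionary used as the database are hypothetical stand-ins; the patent does not describe how the timer is implemented.

```python
import threading
import time
from typing import Callable, Dict


class PeriodicUpdater:
    """Calls `collect` every `interval` seconds and merges the result into `db`."""

    def __init__(self, db: Dict[str, dict],
                 collect: Callable[[], Dict[str, dict]], interval: float):
        self.db = db
        self.collect = collect
        self.interval = interval
        self.runs = 0
        self._stop_event = threading.Event()
        self._thread = threading.Thread(target=self._loop, daemon=True)

    def _loop(self) -> None:
        # Event.wait returns False on timeout and True once stop() is called.
        while not self._stop_event.wait(self.interval):
            self.db.update(self.collect())
            self.runs += 1

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop_event.set()
        self._thread.join()


def collect_node_info() -> Dict[str, dict]:
    # Hypothetical collector; a real one would query every node in the cluster.
    return {"node-1": {"cpu_usage": 10.0, "clients": 2}}


# Hypothetical usage: refresh every 10 ms for a short while, then stop.
node_db: Dict[str, dict] = {}
updater = PeriodicUpdater(node_db, collect_node_info, interval=0.01)
updater.start()
time.sleep(0.3)
updater.stop()
```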

[0091] The node information database stores the following information:

[0092] Node service information: service name, service start time, service status information;

[0093] Node information: node startup time, CPU usage, client connection information;

[0094] Cluster information: number, status, and cluster status of each node;

[0095] A fault processing node entry, which stores the label of the fault recovery node.

[0096] 2. When the cluster starts and a client connects to a cluster node, the above information is updated in the node information database, the node health ranking is determined according to the sorting algorithm, and the sort is sequentially suited to the node...

Embodiment 3

[0107] Based on Embodiment 1, and as shown in Figure 2, the present invention also discloses a cluster node fault processing system, comprising: a database construction unit 1, a sorting unit 2, a storage unit 3, and a node selection unit 4.

[0108] Database construction unit 1, for adding a node information database to the cluster, obtaining information on all nodes in the cluster, and storing it in the node information database.

[0109] Sorting unit 2, for updating the node information database at regular intervals once the cluster has started and a client has connected to a cluster node, and for determining the node health ranking with a sorting algorithm according to the data stored in the node information database.

[0110] The sorting unit 2 comprises:

[0111] A first scoring module, for determining a status score for each node according to the node status information;

[0112] A second scoring module, for determining a start-time score for each node according to the node start time;

[0113] The third s...
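
The first two scoring modules can be illustrated with a small ranking sketch. The scoring rules here are assumptions (an online node scores 1, and longer uptime scores higher); the source text is truncated before the remaining modules and before any rule for combining scores, so none of that is reproduced.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    label: str
    status: str        # "online" or "offline" (assumed encoding)
    start_time: float  # epoch seconds (assumed unit)


def status_score(node: Node) -> int:
    # Assumption: an online node scores 1, anything else scores 0.
    return 1 if node.status == "online" else 0


def start_time_score(node: Node, now: float) -> float:
    # Assumption: longer uptime is treated as healthier.
    return max(0.0, now - node.start_time)


def rank_nodes(nodes: List[Node], now: float) -> List[Node]:
    # Compare by status score first, then by start-time score; best node first.
    return sorted(nodes,
                  key=lambda n: (status_score(n), start_time_score(n, now)),
                  reverse=True)


nodes = [
    Node("node-1", "online", 500.0),
    Node("node-2", "offline", 100.0),
    Node("node-3", "online", 200.0),
]
ranked = rank_nodes(nodes, now=1000.0)
print([n.label for n in ranked])  # ['node-3', 'node-1', 'node-2']
```

Comparing the scores as a tuple is one simple way to combine them; a weighted sum would be an equally plausible reading.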


Abstract

The invention provides a cluster node fault processing method and system, and the method comprises the steps: adding a node information database in a cluster, obtaining the information of all nodes in the cluster, and storing the information in the node information database; after the cluster is started and a client is connected to a cluster node, updating the node information database regularly, and determining a node health condition sequence by using a sorting algorithm according to data stored in the node information database; determining a current fault processing node label number according to a sorting result, and storing the current fault processing node label number into the node information database; and when a node in the cluster has a fault, enabling the cluster to directly read the current fault processing node label number recorded in the node information database, and to instruct the corresponding node to perform fault recovery. According to the invention, the database is provided, the fault processing node is selected according to the information in the database, and when the service node in the cluster has the fault, the selected fault processing node carries out fault processing in time, so that the continuity of the service is ensured.
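
The last two steps of the abstract, storing the selected label and reading it back when a fault occurs, might look like the sketch below. The `Cluster` class, its method names, and the `recover` callback are hypothetical; the point illustrated is only that the handler is chosen ahead of time, so nothing needs to be elected at the moment of the fault.

```python
from typing import Callable, Dict, List, Optional, Tuple


class Cluster:
    def __init__(self, recover: Callable[[str, str], None]):
        # A plain dict stands in for the node information database.
        self.db: Dict[str, Optional[str]] = {"fault_handler_label": None}
        self.recover = recover

    def store_handler(self, ranked_labels: List[str]) -> None:
        # Record the top-ranked node as the current fault processing node.
        self.db["fault_handler_label"] = ranked_labels[0] if ranked_labels else None

    def on_node_fault(self, failed_label: str) -> Optional[str]:
        # Read the pre-selected label directly instead of electing a handler now.
        handler = self.db["fault_handler_label"]
        if handler is not None:
            self.recover(handler, failed_label)
        return handler


# Hypothetical usage: log which node is asked to recover which failed node.
recovery_log: List[Tuple[str, str]] = []
cluster = Cluster(recover=lambda handler, failed: recovery_log.append((handler, failed)))
cluster.store_handler(["node-3", "node-1", "node-2"])
chosen = cluster.on_node_fault("node-2")
```

The sketch does not cover the case where the failed node is itself the recorded handler, because the excerpt here does not say how that is resolved.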

Description

Technical field

[0001] The present invention relates to communication technologies, and more particularly to a method and system for cluster node fault processing.

Background technique

[0002] A computer cluster, or cluster for short, is a computer system in which a set of loosely integrated computer software and/or hardware is connected to cooperate closely on computing work. In a sense, a cluster can be viewed as a single computer. The individual computers in a cluster are usually called nodes and are typically connected via a LAN, although other connections are possible. Cluster computers are typically used to improve on the computing speed and/or reliability of a single computer. Under normal circumstances, a cluster offers much better cost performance than a single computer such as a workstation or a supercomputer.

[0003] A cluster is a group of multiple nodes that together provide services to clients. In response to the occurrence of unplanned failures, once the cl...


Application Information

IPC(8): H04L12/24, H04L12/26, H04L29/08
CPC: H04L41/0654, H04L43/0817, H04L67/10
Inventor: 李二明, 李世杰
Owner: SUZHOU LANGCHAO INTELLIGENT TECH CO LTD