Fault handling method and system for a cluster node

A fault handling method for cluster nodes, applied in the field of communications. It addresses split-brain, the resulting inability of the cluster to provide services externally, and the impact on client business, with the stated aims of improving stability and adaptability to different scenarios and avoiding the risk of split-brain.

Active Publication Date: 2021-10-22

INSPUR SUZHOU INTELLIGENT TECH CO LTD

Problems solved by technology

In the existing technology, the usual ways of selecting a fault handling node often produce split-brain, which leaves the cluster unable to provide services externally and seriously affects client business.

Examples

Embodiment 1

[0072] As shown in Figure 1, this embodiment provides a fault handling method for cluster nodes, including the following steps:

[0073] S1: Add a node information database to the cluster, obtain the information of all nodes in the cluster and store it in the node information database.

[0074] The data stored in the node information database includes: node service information, node information, cluster information and a fault handling node entry. The node service information includes the service name, service start time and service status information; the node information includes the node start time, node CPU usage and the number of clients connected to the node; the cluster information includes the number of nodes in the cluster, node state information and cluster state information; the fault handling node entry stores the label of the current fault handling node.
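The four groups of data listed in [0074] can be pictured as a small record layout. The following is a minimal sketch only: the class names, field names, units and the choice to key everything by node label are assumptions made for illustration and are not taken from the patent text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ServiceInfo:
    name: str
    start_time: float  # seconds since the epoch (assumed unit)
    status: str        # e.g. "running" (assumed value set)


@dataclass
class NodeInfo:
    start_time: float
    cpu_usage: float           # fraction between 0.0 and 1.0 (assumed)
    client_connections: int    # number of clients connected to the node


@dataclass
class ClusterInfo:
    node_count: int
    node_states: Dict[str, str]  # node label -> node state
    cluster_state: str


@dataclass
class NodeInfoDatabase:
    # All dictionaries are keyed by node label (an assumption).
    services: Dict[str, List[ServiceInfo]] = field(default_factory=dict)
    nodes: Dict[str, NodeInfo] = field(default_factory=dict)
    cluster: Optional[ClusterInfo] = None
    # The fault handling node entry: label of the current fault handling node.
    fault_handling_node: Optional[str] = None
```

Keeping the fault handling node entry as a single optional field mirrors the text's description of one entry that holds the current label.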

[0075] S2: After the cluster is started and the client is connected to the cluster nodes, the nod...

Embodiment 2

[0089] This embodiment also provides a fault handling method for cluster nodes, including:

[0090] 1. After the nodes are started, a node information database is added to the cluster, and a timed event is started that obtains the information of each node and stores it in the node information database.

[0091] The following information is stored in the node information database:

[0092] Node service information: service name, service start time, service status information;

[0093] Information about this node: node start time, CPU usage, number of client connections;

[0094] Cluster information: number of nodes, status of each node, cluster status;

[0095] A fault handling node entry, used to store the fault handling node label.
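The timed event described in step 1 could look roughly like the sketch below, which refreshes a per-node record at a fixed interval. Everything here is hypothetical: `collect_snapshot`, `PeriodicUpdater`, the plain dictionary used as the database and the interval are illustrative assumptions, since the text does not say how the information is collected or how often.

```python
import threading
import time
from typing import Callable, Dict, Iterable, Optional


def collect_snapshot(node_label: str) -> Dict[str, object]:
    # Hypothetical collector; a real one would query the node itself.
    return {"node": node_label, "updated_at": time.time()}


class PeriodicUpdater:
    """Re-collects every node's information at a fixed interval."""

    def __init__(
        self,
        database: Dict[str, Dict[str, object]],
        node_labels: Iterable[str],
        interval_s: float,
        collect: Callable[[str], Dict[str, object]] = collect_snapshot,
    ) -> None:
        self.database = database
        self.node_labels = list(node_labels)
        self.interval_s = interval_s
        self.collect = collect
        self._timer: Optional[threading.Timer] = None
        self._stopped = threading.Event()

    def refresh_once(self) -> None:
        # Overwrite each node's record with a fresh snapshot.
        for label in self.node_labels:
            self.database[label] = self.collect(label)

    def _tick(self) -> None:
        if self._stopped.is_set():
            return
        self.refresh_once()
        # Re-arm the timer so the refresh repeats until stop() is called.
        self._timer = threading.Timer(self.interval_s, self._tick)
        self._timer.daemon = True
        self._timer.start()

    def start(self) -> None:
        self._stopped.clear()
        self._tick()

    def stop(self) -> None:
        self._stopped.set()
        if self._timer is not None:
            self._timer.cancel()
```

Calling `start()` performs one refresh immediately and then repeats every `interval_s` seconds; `stop()` cancels the pending timer.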

[0096] 2. After the cluster is started and the client is connected to the cluster nodes, the above information is regularly updated to the node information database, and the health status of the nodes is determined according to the sorting ...

Embodiment 3

[0107] Based on Embodiment 1, and as shown in Figure 2, the present invention also discloses a fault handling system for cluster nodes, including: a database building unit 1, a sorting unit 2, a storage unit 3 and a node selection unit 4.

[0108] The database building unit 1 is used to add a node information database in the cluster, obtain information of all nodes in the cluster and store it in the node information database.

[0109] The sorting unit 2 is used to regularly update the node information database after the cluster is started and the client is connected to the cluster nodes, and determine the ranking of the health status of the nodes according to the data stored in the node information database using a sorting algorithm.

[0110] Wherein, the sorting unit 2 specifically includes:

[0111] The first scoring module is used to determine the state score of each node according to the node state information;

[0112] The second scoring module is used to determine th...
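The sorting unit's job of scoring nodes and ranking their health can be illustrated with a short sketch. The text is cut off before the second scoring module is described, so only the idea of a state score comes from the source; the state values, the load penalty built from CPU usage and client connections, and all weights below are assumptions chosen purely for illustration.

```python
from typing import Iterable, List, NamedTuple


class NodeSnapshot(NamedTuple):
    label: str
    state: str               # node state information
    cpu_usage: float         # fraction between 0.0 and 1.0 (assumed)
    client_connections: int


# Assumed mapping from node state to a state score.
STATE_SCORES = {"healthy": 100.0, "degraded": 50.0, "failed": 0.0}


def state_score(node: NodeSnapshot) -> float:
    # Unknown states score zero, the same as a failed node.
    return STATE_SCORES.get(node.state, 0.0)


def load_penalty(node: NodeSnapshot) -> float:
    # Assumed weighting: busier nodes are considered less healthy.
    return 50.0 * node.cpu_usage + 0.1 * node.client_connections


def health_score(node: NodeSnapshot) -> float:
    return state_score(node) - load_penalty(node)


def rank_by_health(nodes: Iterable[NodeSnapshot]) -> List[NodeSnapshot]:
    # Healthiest first; ties are broken by label so the order is deterministic.
    return sorted(nodes, key=lambda n: (-health_score(n), n.label))
```

For example, ranking a lightly loaded healthy node, a heavily loaded healthy node and a failed node puts them in that order, because the failed node's state score is zero and the heavier load incurs a larger penalty.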

Abstract

The present invention proposes a fault handling method and system for a cluster node. The method includes: adding a node information database to the cluster, obtaining the information of all nodes in the cluster and storing it in the node information database; after the cluster starts and the client connects to the cluster nodes, regularly updating the node information database and using a sorting algorithm to rank the health status of the nodes according to the data stored in the node information database; determining the current fault handling node label according to the ranking and storing it in the node information database; and, after a node fails, having the cluster directly read the current fault handling node label recorded in the node information database and notify the corresponding node to perform fault recovery. The invention thus provides a database and selects a fault handling node according to the information in it, so that when a service node in the cluster fails, the selected fault handling node handles the fault promptly, ensuring service continuity.
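The last two steps of the abstract, recording a fault handling node label from the ranking and reading it back when a node fails, can be sketched as follows. This is an illustration under stated assumptions: the function names, the dictionary used as the database, the `notify` callback, the choice of the top-ranked node, and the behaviour when the recorded handler is itself the failed node are all hypothetical, as the text does not specify them.

```python
from typing import Callable, Dict, List, Optional

FAULT_HANDLER_KEY = "fault_handling_node"  # assumed name for the database entry


def store_fault_handler(database: Dict[str, Optional[str]], ranked_labels: List[str]) -> None:
    # Assumption: the healthiest node (first in the ranking) becomes the handler.
    database[FAULT_HANDLER_KEY] = ranked_labels[0] if ranked_labels else None


def on_node_failure(
    database: Dict[str, Optional[str]],
    failed_label: str,
    notify: Callable[[str, str], None],
) -> Optional[str]:
    # Read the pre-selected label directly instead of electing a handler now.
    handler = database.get(FAULT_HANDLER_KEY)
    if handler is None or handler == failed_label:
        # Assumed behaviour: nothing to notify until the next ranking update.
        return None
    notify(handler, failed_label)
    return handler


if __name__ == "__main__":
    db: Dict[str, Optional[str]] = {}
    store_fault_handler(db, ["node-1", "node-3", "node-2"])
    on_node_failure(db, "node-2", lambda h, f: print(f"{h}: recover {f}"))
```

Because the label is chosen ahead of time and only read at failure time, no election takes place while the cluster is already degraded, which matches the abstract's description of the cluster directly reading the recorded label.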

Description

Technical field

[0001] The present invention relates to the technical field of communication, and more specifically to a fault handling method and system for a cluster node.

Background

[0002] A computer cluster, or cluster for short, is a group of loosely integrated computers whose software and/or hardware are connected and cooperate closely to complete computing work. In a sense, a cluster can be regarded as a single computer. The individual computers in a cluster are usually called nodes and are usually connected by a local area network, although other kinds of connection are possible. Clusters are often used to improve on the computing speed and/or reliability of a single computer. In general, a cluster is much more cost-effective than a single computer of comparable capability, such as a workstation or supercomputer.

[0003] A cluster is composed of multiple nodes that jointly provide services for clients. In order to cope wit...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): H04L12/24, H04L12/26, H04L29/08

CPC: H04L41/0654, H04L43/0817, H04L67/10

Inventor: 李二明, 李世杰

Owner: INSPUR SUZHOU INTELLIGENT TECH CO LTD
