A method and system for fault diagnosis of high-performance cluster

A fault diagnosis method and a corresponding diagnosis system for high-performance clusters, which address problems such as low troubleshooting efficiency and low troubleshooting accuracy.

Status: Inactive; Publication Date: 2019-01-08
ZHENGZHOU YUNHAI INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

[0005] This application provides a high-performance cluster fault diagnosis method and system to solve the problem

Examples

Embodiment 1

[0046] Refer to Figure 1, which is a schematic flowchart of a method for diagnosing a high-performance cluster fault provided by the embodiment of the present application. As can be seen from Figure 1, the diagnostic method in this embodiment includes:

[0047] S1: Collect, in the master node of the high-performance cluster, the basic information of each node in the cluster. The basic information includes: node hardware information, node system log, node operating system information, node network information, master node server status information, computing node service status information, Lustre file system status, and cluster management platform status information.

[0048] Specifically, the hardware information of the node includes: CPU information, board information, and network information; the operating system information of the node includes the version of the operating system; the network information of the node includes: network card equipme...
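The source does not disclose the collection script itself. As a minimal sketch of the kind of per-node collection described above, assuming Python and only its standard library, the following gathers a few operating-system and CPU items on one node and indexes the reports by hostname on the master side. The function names `collect_local_info` and `merge_on_master` and the dictionary keys are hypothetical; system logs, network card details, Lustre status and management platform status are not covered.

```python
import os
import platform
import socket


def collect_local_info() -> dict:
    """Gather a few basic facts about the node this script runs on."""
    return {
        "hostname": socket.gethostname(),
        "os_system": platform.system(),    # operating system name
        "os_release": platform.release(),  # OS / kernel release string
        "os_version": platform.version(),  # OS version string
        "machine": platform.machine(),     # CPU architecture
        "cpu_count": os.cpu_count(),       # logical CPUs, may be None
    }


def merge_on_master(reports: list[dict]) -> dict:
    """Index per-node reports by hostname (hypothetical master-side storage)."""
    return {report["hostname"]: report for report in reports}


if __name__ == "__main__":
    # In a real cluster each node would send its report to the master node;
    # here only the local node is collected, for illustration.
    cluster_info = merge_on_master([collect_local_info()])
    for host, info in cluster_info.items():
        print(host, info)
```

How the reports reach the master node (for example over SSH or a shared file system) is left open here, since the visible text does not say.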

Embodiment 2

[0082] On the basis of the embodiments shown in Figure 1 and Figure 2, refer to Figure 3, which is a schematic structural diagram of a high-performance cluster fault diagnosis system provided by the embodiment of the present application. As can be seen from Figure 3, the diagnostic system in this embodiment mainly includes four parts: an information collection module, a format conversion module, a display module and a fault processing module. The information collection module is used to collect, in the master node of the high-performance cluster, the basic information of each node in the cluster. The basic information includes: node hardware information, node system log, node operating system information, node network information, master node server status information, computing node service status information, Lustre file system status, and cluster management platform status information. The format conversion module is used to convert th...
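The four-module structure can be sketched as a simple chain of functions. This is an illustration only, assuming Python; the names `run_diagnosis`, `sample_collect`, `sample_convert` and `sample_handle`, the data layout, and the rule that a service not reported as "running" counts as a fault are all hypothetical and are not taken from the patent.

```python
from typing import Callable

# node name -> {status item: value}; this layout is an assumption
ClusterInfo = dict[str, dict[str, str]]


def run_diagnosis(
    collect: Callable[[], ClusterInfo],
    convert: Callable[[ClusterInfo], str],
    display: Callable[[str], None],
    handle_faults: Callable[[ClusterInfo], list[str]],
) -> list[str]:
    """Chain the four modules in the order the embodiment lists them."""
    info = collect()            # information collection module
    page = convert(info)        # format conversion module
    display(page)               # display module
    return handle_faults(info)  # fault processing module


def sample_collect() -> ClusterInfo:
    # Hypothetical data standing in for a real collection script.
    return {
        "node01": {"compute_service": "running"},
        "node02": {"compute_service": "stopped"},
    }


def sample_convert(info: ClusterInfo) -> str:
    # Very small HTML rendering; real values should be HTML-escaped.
    items = "".join(
        f"<li>{name}: {status['compute_service']}</li>"
        for name, status in sorted(info.items())
    )
    return f"<html><body><ul>{items}</ul></body></html>"


def sample_handle(info: ClusterInfo) -> list[str]:
    # Flag nodes whose service is not reported as "running".
    return sorted(
        name for name, status in info.items()
        if status.get("compute_service") != "running"
    )


if __name__ == "__main__":
    faulty = run_diagnosis(sample_collect, sample_convert, print, sample_handle)
    print("nodes needing attention:", faulty)
```

Passing the modules in as callables keeps each part replaceable, so a stub collector like the one here can later be swapped for a real script without touching the rest of the chain.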

Abstract

The present application discloses a method and a system for fault diagnosis of a high-performance cluster. The diagnosis method includes: collecting, in a master node of the high-performance cluster, basic information of each node in the cluster; converting the basic information into HTML document format; loading the HTML-format basic information locally according to an obtained loading command and displaying it visually as a web interface; and performing fault location and fault handling according to the contents displayed in the web interface. The diagnostic system includes: an information collection module, a format conversion module, a display module and a fault processing module. The application uses scripts to collect, convert and visually display the basic information of each node in the high-performance cluster, so that the user can view the status of the cluster directly. This makes it easier to locate the fault point quickly and improves the accuracy and efficiency of troubleshooting.
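The conversion and display steps in the abstract can be illustrated with a short sketch. It assumes Python's standard library, a nested dictionary of collected values, and `webbrowser.open` as a stand-in for the "loading command"; none of these details are stated in the source, and the names `info_to_html` and `write_report` are hypothetical.

```python
import html
import tempfile
import webbrowser
from pathlib import Path


def info_to_html(cluster_info: dict) -> str:
    """Render {node: {item: value}} as one HTML table per node."""
    parts = [
        "<!DOCTYPE html>",
        "<html><head><meta charset='utf-8'>",
        "<title>Cluster status</title></head><body>",
    ]
    for node, items in cluster_info.items():
        parts.append(f"<h2>{html.escape(str(node))}</h2>")
        parts.append("<table border='1'>")
        for key, value in items.items():
            # Escape values so log text cannot break the page markup.
            parts.append(
                f"<tr><th>{html.escape(str(key))}</th>"
                f"<td>{html.escape(str(value))}</td></tr>"
            )
        parts.append("</table>")
    parts.append("</body></html>")
    return "\n".join(parts)


def write_report(cluster_info: dict, directory: str) -> Path:
    """Write the HTML report to a local file and return its path."""
    path = Path(directory) / "cluster_status.html"
    path.write_text(info_to_html(cluster_info), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Hypothetical sample data standing in for collected node information.
    sample = {"node01": {"os_version": "example-os 1.0", "note": "<ok>"}}
    report = write_report(sample, tempfile.mkdtemp())
    webbrowser.open(report.as_uri())  # load the local HTML file in a browser
```

Writing a static file and opening it locally needs no web server on the master node, which matches the abstract's wording about loading the HTML document locally.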

Description

Technical field

[0001] The present application relates to the technical field of server high-performance computing, and in particular to a high-performance cluster fault diagnosis method and system.

Background

[0002] High-performance computing has become the third paradigm of scientific exploration, after theoretical science and experimental science, and is widely used in many industries and fields. As high-performance computing technology has developed, small and medium-sized high-performance clusters have come into increasingly wide use. Managing the operation and maintenance of these clusters, troubleshooting them promptly, and completing daily maintenance is therefore an important issue. The core of operation and maintenance management for small and medium-sized high-performance clusters is collecting information about the clusters. After collecting the information of the ...

Application Information

IPC(8): H04L12/24, H04L12/26, H04L29/08
CPC: H04L41/0246, H04L41/0677, H04L43/045, H04L43/0811, H04L43/0817, H04L67/025, H04L67/1044, H04L67/565
Inventor: 宋辰
Owner: ZHENGZHOU YUNHAI INFORMATION TECH CO LTD