
Mass computing node resource monitoring and management method for high-performance computer

A computing node and resource monitoring technology, applied to computing, resource allocation, and program control design. It addresses problems such as degraded system performance, increased control node load, and growth in the width of the communication tree, and achieves the effects of improved system performance and reduced load.

Active Publication Date: 2020-11-27
NAT UNIV OF DEFENSE TECH


Problems solved by technology

However, this puts the system in a contradictory state. First, if the two upper limits are left unchanged as the node count grows, the number of concurrent threads is still guaranteed not to exceed the limit, but a large number of message sending requests enter the waiting state and cannot be processed in time, which seriously degrades system performance.
Second, if the two upper limits are raised, the sending requests can be processed in time, but the load on the control node increases accordingly.
[0009] 2. As the node count increases, the message-forwarding load on the computing nodes also increases.
[0011] In summary, as the node count grows, if messages are sent through a tree structure, the width of the communication tree cannot be increased substantially; otherwise it imposes a heavier load on both the control node and the computing nodes and reduces system performance.
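
To make the trade-off above concrete, here is a minimal Go sketch (illustrative only, not from the patent; the cap value, node count, and timing are assumed) of a fixed upper limit on concurrent send threads modeled as a counting semaphore: once the cap is reached, further message sending requests block in a waiting state, and raising the cap simply admits more concurrent work onto the control node.

```go
// Illustrative sketch: a fixed cap on concurrent send threads, modeled with a
// buffered channel used as a counting semaphore. All names and values are assumptions.
package main

import (
	"fmt"
	"sync"
	"time"
)

const maxConcurrentSends = 8 // fixed upper limit on simultaneous send threads

// sendRequest stands in for sending one monitoring message and waiting for its reply.
func sendRequest(node int) {
	_ = node
	time.Sleep(10 * time.Millisecond)
}

func main() {
	nodes := 1000 // as the node count grows, most requests must wait below
	sem := make(chan struct{}, maxConcurrentSends)
	var wg sync.WaitGroup

	start := time.Now()
	for n := 0; n < nodes; n++ {
		wg.Add(1)
		sem <- struct{}{} // blocks once the cap is reached: the "waiting state"
		go func(node int) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when this send finishes
			sendRequest(node)
		}(n)
	}
	wg.Wait()
	fmt.Printf("monitored %d nodes with cap %d in %v\n",
		nodes, maxConcurrentSends, time.Since(start))
}
```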



Embodiment Construction

[0055] First, this embodiment describes an implementation in which the entire process of handling message sending requests is completed independently by the control node; in large-scale node scenarios, this places various loads on the control node. The overall process is shown in figure 1. As shown in figure 1, the specific workflow is as follows:

[0056] In the first step, the control thread (agent_init thread) continuously takes a message sending request off the request chain, provided the total number of related threads does not exceed the thread upper limit, and creates a worker thread (agent thread) to process the request;

[0057] In the second step, the worker thread performs data preparation. Data preparation mainly determines whether the message is sent through a star structure or a tree structure; if a tree structure is used, the target nodes are also grouped (a code sketch of these first two steps follows the step list below);

[0058] The third step is for...
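
A minimal Go sketch of the first two steps above, as a reconstruction rather than the patent's own code: a dispatcher standing in for the agent_init thread pops requests off a queue while a thread cap holds, and each worker's data-preparation step chooses star or tree delivery and groups the target nodes. The thread cap, star/tree threshold, group size, and all type and function names are assumptions.

```go
// Sketch of the dispatch loop (step 1) and data preparation (step 2); names and
// constants are illustrative assumptions, not the patent's values.
package main

import (
	"fmt"
	"sync"
)

type SendRequest struct {
	Payload string
	Targets []int // compute-node IDs the message must reach
}

const (
	threadCap     = 4  // upper limit on concurrent agent (worker) threads
	starThreshold = 16 // assumed cutoff: few targets -> star, many -> tree
	groupSize     = 8  // assumed fan-out per group when a tree is used
)

// prepare decides the delivery structure and, for tree delivery, groups the targets.
func prepare(req SendRequest) (structure string, groups [][]int) {
	if len(req.Targets) <= starThreshold {
		return "star", [][]int{req.Targets}
	}
	for i := 0; i < len(req.Targets); i += groupSize {
		end := i + groupSize
		if end > len(req.Targets) {
			end = len(req.Targets)
		}
		groups = append(groups, req.Targets[i:end])
	}
	return "tree", groups
}

// agent plays the role of one worker (agent) thread handling a single request.
func agent(req SendRequest, wg *sync.WaitGroup, slots chan struct{}) {
	defer wg.Done()
	defer func() { <-slots }() // release a thread slot when the worker exits
	structure, groups := prepare(req)
	fmt.Printf("request %q: %s delivery, %d group(s)\n", req.Payload, structure, len(groups))
	// ... sending to each group and waiting for replies is omitted here ...
}

func main() {
	queue := make(chan SendRequest, 64)     // the chain of pending send requests
	slots := make(chan struct{}, threadCap) // counting semaphore for the thread cap
	done := make(chan struct{})
	var wg sync.WaitGroup

	// Dispatcher in the role of the agent_init thread: take a request off the
	// queue and spawn a worker, but only while the thread cap is not exceeded.
	go func() {
		for req := range queue {
			slots <- struct{}{} // blocks while threadCap workers are active
			wg.Add(1)
			go agent(req, &wg, slots)
		}
		close(done)
	}()

	targets := make([]int, 40)
	for i := range targets {
		targets[i] = i
	}
	queue <- SendRequest{Payload: "status-poll", Targets: targets}
	queue <- SendRequest{Payload: "heartbeat", Targets: targets[:10]}
	close(queue)

	<-done    // all requests have been dispatched
	wg.Wait() // all workers have finished
}
```

Using a counting semaphore for the worker slots mirrors the condition in the first step that the total number of related threads must not exceed the thread upper limit.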


Abstract

The invention discloses a massive computing node resource monitoring and management method for a high-performance computer, which comprises the following steps: the control node issues a message sending request to be delivered through an intermediate node; the control node takes out the message sending request and generates a working thread to process it; the working thread selects a normal intermediate node; the working thread forwards the message sending request to the selected intermediate node, then waits for the message returned by the intermediate node, and proceeds to the next step once that message is received; the working thread processes the returned message, updates the states of the intermediate node and the computing nodes, and then terminates. According to the invention, a layer of intermediate nodes is added between the control node and the massive computing nodes, sharing the control node's load in monitoring and managing massive computing node resources while also reducing the related load on the computing nodes.
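
The following Go sketch reconstructs the working-thread flow summarized in the abstract, under stated assumptions (the types, the health model, and the in-memory forward stub are hypothetical, not the patent's implementation): select a normal intermediate node, forward the request, wait for the returned message, then update the recorded states of the intermediate node and the computing nodes.

```go
// Hedged reconstruction of the worker-thread flow: all types and helpers are assumed.
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

type NodeState struct {
	Healthy  bool
	LastSeen time.Time
}

type Reply struct {
	Intermediate int
	NodeStates   map[int]NodeState // states reported for the compute nodes
}

type Cluster struct {
	mu            sync.Mutex
	intermediates map[int]NodeState
	computes      map[int]NodeState
}

// selectIntermediate returns any intermediate node currently marked healthy.
func (c *Cluster) selectIntermediate() (int, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for id, st := range c.intermediates {
		if st.Healthy {
			return id, nil
		}
	}
	return 0, errors.New("no healthy intermediate node")
}

// forward stands in for sending the request to the intermediate node and
// waiting for the aggregated reply covering its group of compute nodes.
func forward(intermediate int, targets []int) Reply {
	states := make(map[int]NodeState, len(targets))
	for _, t := range targets {
		states[t] = NodeState{Healthy: true, LastSeen: time.Now()}
	}
	return Reply{Intermediate: intermediate, NodeStates: states}
}

// worker is one working thread handling a single message sending request.
func (c *Cluster) worker(targets []int) error {
	id, err := c.selectIntermediate()
	if err != nil {
		return err
	}
	reply := forward(id, targets) // forward, then wait for the returned message

	c.mu.Lock()
	defer c.mu.Unlock()
	c.intermediates[id] = NodeState{Healthy: true, LastSeen: time.Now()}
	for node, st := range reply.NodeStates {
		c.computes[node] = st // update the compute-node management structure
	}
	return nil
}

func main() {
	c := &Cluster{
		intermediates: map[int]NodeState{1: {Healthy: true}},
		computes:      map[int]NodeState{},
	}
	if err := c.worker([]int{100, 101, 102}); err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Printf("tracking %d compute nodes via intermediate 1\n", len(c.computes))
}
```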

Description

Technical Field

[0001] The invention relates to resource management technology for massive computing nodes in high-performance computers, and in particular to a massive computing node resource monitoring and management method oriented to high-performance computers.

Background Technique

[0002] Currently, massive computing node resources in high-performance computers are managed in a mode where a single control node controls a large number of computing nodes. While the system runs, the control node must monitor and record the real-time status of each computing node for task assignment and other work. This is mainly realized by having the control node continuously generate requests to send messages to the computing nodes (message sending requests), obtain each computing node's current status from its return message, and modify the data structure on the control node used to manage the computing nodes. The common f...
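
For orientation, a minimal Go sketch of the baseline mode described in this background, with every name and value assumed: a single control node polls each computing node directly and records the returned status in the data structure it uses to manage the nodes.

```go
// Illustrative sketch of the single-control-node baseline; not the patent's code.
package main

import (
	"fmt"
	"time"
)

type Status struct {
	Alive    bool
	LoadAvg  float64
	LastSeen time.Time
}

// poll stands in for one message sending request and its return message.
func poll(node int) Status {
	return Status{Alive: true, LoadAvg: 0.42, LastSeen: time.Now()}
}

func main() {
	const nodes = 5
	table := make(map[int]Status, nodes) // the control node's management structure

	// The control node repeats this pass continuously; one pass is shown here.
	for n := 0; n < nodes; n++ {
		table[n] = poll(n) // send, wait for the reply, record the state
	}
	fmt.Printf("recorded status for %d compute nodes\n", len(table))
}
```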


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06F9/50; G06F9/54
CPC: G06F9/505; G06F9/542; G06F9/546; G06F2209/508
Inventor: 戴屹钦, 卢凯, 董勇, 王睿伯, 张伟, 张文喆, 邬会军, 李佳鑫, 谢旻, 周恩强, 迟万庆, 陈娟
Owner: NAT UNIV OF DEFENSE TECH