Resource monitoring method and device for artificial intelligence server

A resource monitoring technology for artificial intelligence servers, applied in hardware monitoring, instruments, energy-saving computing, and related fields. It addresses the problems that existing tools cannot meet usage requirements, provide insufficient statistical information for performance analysis, and cannot display a task manager interface. It achieves automatic resource monitoring and helps resolve system problems.

Inactive Publication Date: 2020-10-30
SUZHOU LANGCHAO INTELLIGENT TECH CO LTD
Cites: 0; Cited by: 6

AI Technical Summary

Problems solved by technology

For an AI server, however, such a task manager cannot meet the usage requirements. First, as mentioned above, AI servers often have no graphical interface and so cannot display the task manager interface. Second, the point of such a tool on an AI server is to help administrators or users perform performance analysis. That requires capturing not only the utilization of the CPU, memory, and disk, but also IO data such as the utilization of the computing accelerators, the communication bandwidth between the motherboard and the computing board, and the communication bandwidth between accelerators, so that computation and IO can be analyzed together.
[0004] There is currently no effective solution to the prior-art problems that the AI server cannot display graphically and that the statistical information required for performance analysis is insufficient.


Examples

Embodiment Construction

[0032] To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below in conjunction with specific embodiments and with reference to the accompanying drawings.

[0033] It should be noted that all uses of "first" and "second" in the embodiments of the present invention serve to distinguish two entities or parameters that share a name but are not the same. "First" and "second" are used only for convenience of expression and should not be construed as limiting the embodiments of the present invention; this will not be repeated in the subsequent embodiments.

[0034] Based on the above purpose, the first aspect of the embodiments of the present invention proposes an embodiment of a resource monitoring method that can support graphical display and provide sufficient statistical information to perform perfor...


Abstract

The invention discloses a resource monitoring method and device for an artificial intelligence server. The method comprises: obtaining a first running state of a predefined target process and a second running state of a GPU application program, and determining from the first and second running states whether the target process exists; in response to the target process existing, using a performance monitoring counter to automatically collect feature information from the server, wherein the feature information comprises real-time communication link transmission bandwidth, control device working state, GPU working state, device temperature, and device power consumption; formatting the feature information into the format stored in a database and writing it to the database; and constructing a web page, periodically reading the feature information from the database using JavaScript, and overwriting the previous values on the web page so that the feature information is displayed visually and refreshed. The invention supports graphical display, provides sufficient statistical information for performance analysis, achieves automatic resource monitoring, and helps resolve system problems.
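
The steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the rule for combining the two running states (both must report "running"), the `features` table and its column names, the use of SQLite, and the `read_counters` callable that stands in for the performance monitoring counter are all hypothetical. The web page side, which the abstract says uses JavaScript, is represented here only by the query that a periodic refresh would issue.

```python
import sqlite3
import time


def target_process_exists(first_state: bool, second_state: bool) -> bool:
    # Assumption: the target counts as present only when both the target
    # process state and the GPU application state report "running".
    return first_state and second_state


def format_features(raw: dict, ts: float) -> tuple:
    # Flatten one sample into the column order of the hypothetical
    # `features` table so it can be written as a single row.
    return (
        ts,
        float(raw["link_bandwidth"]),
        str(raw["controller_state"]),
        str(raw["gpu_state"]),
        float(raw["temperature"]),
        float(raw["power"]),
    )


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    # Hypothetical schema: one row per sample, one column per feature
    # named in the abstract.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS features ("
        "ts REAL, link_bandwidth REAL, controller_state TEXT, "
        "gpu_state TEXT, temperature REAL, power REAL)"
    )
    return conn


def monitor_once(conn, first_state, second_state, read_counters, now=time.time) -> bool:
    # Collect and store one sample only when the target process exists.
    # `read_counters` is a stand-in for the hardware-specific counter read.
    if not target_process_exists(first_state, second_state):
        return False
    row = format_features(read_counters(), now())
    conn.execute("INSERT INTO features VALUES (?, ?, ?, ?, ?, ?)", row)
    conn.commit()
    return True


def latest_sample(conn):
    # The query a periodic page refresh would run: fetch only the newest
    # row, which then replaces the values currently shown.
    return conn.execute(
        "SELECT * FROM features ORDER BY ts DESC LIMIT 1"
    ).fetchone()
```

A caller would invoke `monitor_once` on a timer and have the page poll `latest_sample` (or an HTTP endpoint wrapping it) at its own interval. Returning only the newest row mirrors the abstract's description of overwriting the displayed values rather than appending to them.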

Description

Technical field

[0001] The present invention relates to the field of monitoring, and more specifically to a resource monitoring method and device for an artificial intelligence server.

Background

[0002] The AI (artificial intelligence) server is the computing platform for training and inference of artificial intelligence models, and it plays an important role in the development of artificial intelligence today. Compared with general-purpose servers, AI servers place more emphasis on computing performance, because this type of server is mainly used for computation, and it is a heterogeneous computing system. In addition to the general-purpose central processing unit (CPU), the AI server also has computing accelerators designed specifically for large-scale parallel matrix operations, such as GPUs (graphics processing units), ASIC (application-specific integrated circuit) accelerator cards, and FPGAs (field-programmable gate arrays); the CPU is only resp...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F11/34, G06F11/30
CPC: G06F11/3055, G06F11/3058, G06F11/3466, Y02D10/00
Inventors: 李磊, 王月
Owner: SUZHOU LANGCHAO INTELLIGENT TECH CO LTD