[0016] Referring to Figure 1, the agent-based grid resource monitoring system of the present invention is structurally divided into three layers: the global information layer, the domain agent layer, and the node information collection layer. The system comprises sensors arranged at each node end, sensor managers at the domain agent layer, and an information center server.
[0017] In this embodiment, the grid resource monitoring system includes sensor 11 arranged at node end 11, sensor 12 arranged at node end 12, sensor 13 arranged at node end 13, sensor 21 arranged at node end 21, sensor 22 arranged at node end 22, and sensor 23 arranged at node end 23. Each sensor collects the system resource information of the host at its own node end. For example, sensor 11 collects the system resource information of host 11 at node end 11; sensor 12 collects that of host 12 at node end 12; sensor 13 collects that of host 13 at node end 13; sensor 21 collects that of host 21 at node end 21; sensor 22 collects that of host 22 at node end 22; and sensor 23 collects that of host 23 at node end 23.
[0018] Each sensor obtains system data by periodically reading the system parameters of the host at its node end. The monitored data mainly includes CPU usage, system memory capacity and usage, swap area size and usage, disk usage, and the status of applications.
[0019] Preferably, each sensor uses the open-source SIGAR development kit to collect system resource information, which makes it convenient and efficient to collect the various kinds of status information at the node end. Because SIGAR ships with native code libraries for multiple platforms, the same sensor code can collect status information on different operating systems, shielding the heterogeneity of the underlying systems.
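The kind of periodic reading described above can be illustrated with a minimal sketch. It deliberately does not use SIGAR; it relies only on standard JDK classes, so it reports JVM heap figures rather than true system memory and omits swap and application status. The class name `HostSnapshot` and the metric names are hypothetical and do not come from the described system.

```java
import java.io.File;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: reads a few host metrics using only JDK classes. */
class HostSnapshot {
    /** Returns metric name -> value; the names are illustrative only. */
    static Map<String, Double> read(File diskRoot) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        Runtime rt = Runtime.getRuntime();
        Map<String, Double> metrics = new LinkedHashMap<>();
        metrics.put("cpu.count", (double) os.getAvailableProcessors());
        // The load average is negative on platforms that do not report it.
        metrics.put("cpu.loadAverage", os.getSystemLoadAverage());
        // JVM heap figures stand in for system memory, which needs SIGAR or similar.
        metrics.put("jvm.memory.totalBytes", (double) rt.totalMemory());
        metrics.put("jvm.memory.freeBytes", (double) rt.freeMemory());
        metrics.put("disk.totalBytes", (double) diskRoot.getTotalSpace());
        metrics.put("disk.usableBytes", (double) diskRoot.getUsableSpace());
        return metrics;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Double> e : read(new File(".")).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

A sensor built this way would call `HostSnapshot.read` once per collection cycle; replacing the body with SIGAR calls would add the system-wide memory, swap, and process data that the JDK alone does not expose.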
[0020] In addition, each sensor may include an acquisition module, a protocol analysis and packaging module, and a sending module. The acquisition module collects the system resource information of the host at the corresponding node end; the protocol analysis and packaging module packages the collected system resource information into a communication object conforming to the communication protocol; and the sending module periodically sends the packaged communication object to the corresponding sensor manager.
[0021] The workflow of the acquisition module, the protocol analysis and packaging module, and the sending module of each sensor is shown in Figure 2. The acquisition module first executes the data collection thread and places the collected data into the data buffer. The protocol analysis and packaging module then executes the performance index packaging thread, which packages the information into a communication object conforming to the communication protocol. Finally, the sending module executes the publishing thread, which periodically sends the resulting communication object to the corresponding sensor manager.
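The three-thread workflow can be sketched with two blocking queues, one acting as the data buffer and one holding packaged objects awaiting publication. This is only an illustration under stated assumptions: the class `SensorPipeline`, the `STOP` sentinel, the string payloads, and the placeholder readings are all hypothetical, and the periodic timing of the publishing thread is omitted so that the example terminates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical three-stage sensor pipeline: collect -> package -> publish. */
class SensorPipeline {
    private static final String STOP = "__stop__"; // sentinel that ends a stage

    /** Runs `cycles` collection cycles and returns what the publishing thread emitted. */
    static List<String> run(String nodeId, int cycles) {
        BlockingQueue<String> buffer = new LinkedBlockingQueue<>(); // data buffer
        BlockingQueue<String> outbox = new LinkedBlockingQueue<>(); // packaged objects
        List<String> published = new ArrayList<>();

        Thread collector = new Thread(() -> {
            for (int i = 0; i < cycles; i++) {
                buffer.add("cpu=" + (10 + i)); // placeholder reading, not real data
            }
            buffer.add(STOP);
        });
        Thread packager = new Thread(() -> {
            try {
                for (String raw = buffer.take(); !raw.equals(STOP); raw = buffer.take()) {
                    outbox.add(nodeId + "|" + raw); // stand-in for the protocol object
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                outbox.add(STOP); // always release the publishing thread
            }
        });
        Thread publisher = new Thread(() -> {
            try {
                for (String msg = outbox.take(); !msg.equals(STOP); msg = outbox.take()) {
                    published.add(msg); // a real sensor would send to its manager here
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        try {
            collector.start();
            packager.start();
            publisher.start();
            collector.join();
            packager.join();
            publisher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("pipeline interrupted", e);
        }
        return published;
    }

    public static void main(String[] args) {
        System.out.println(run("node11", 3));
    }
}
```

Calling `SensorPipeline.run("node11", 3)` yields three messages in collection order, because each queue is FIFO with a single producer and a single consumer. Decoupling the stages with queues lets a slow publishing step lag behind collection without blocking it.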
[0022] The sensor managers are located at the domain agent layer. Based on a metadata standard, each sensor manager extracts and classifies the system resource information sent by the sensors at the node ends within its domain, so as to monitor and manage the hosts at those node ends. As shown in Figure 1, in this embodiment sensor manager 1 at the domain agent layer monitors and manages sensors 11, 12, and 13 in its domain, and sensor manager 2 at the domain agent layer monitors and manages sensors 21, 22, and 23 in its domain.
[0023] Each sensor manager manages the status information of all nodes in its own domain. Socket communication can be used between each sensor and its sensor manager. Once per collection cycle, each sensor manager receives the current state information of the node-end hosts collected by the sensors in its domain. Each sensor manager can thus display the state information of the members of its domain, update the state information of the corresponding node ends in its local database, and compile the latest data set and pass it to the information center server located in the global information layer.
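The bookkeeping a sensor manager performs each cycle can be sketched as a map from node to its most recent report. The class `DomainRegistry` and its method names are hypothetical, the in-memory map merely stands in for the local database, and the state is reduced to a plain string for brevity.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory view of one sensor manager's domain. */
class DomainRegistry {
    private final String domain;
    // Latest state per node; stands in for the manager's local database.
    private final Map<String, String> latest = new ConcurrentHashMap<>();

    DomainRegistry(String domain) {
        this.domain = domain;
    }

    /** Called for each report received; a newer report overwrites the older one. */
    void onReport(String nodeId, String state) {
        latest.put(nodeId, state);
    }

    /** Sorted copy of the most recent state per node, ready to forward upstream. */
    Map<String, String> latestDataSet() {
        return new TreeMap<>(latest);
    }

    String domain() {
        return domain;
    }

    public static void main(String[] args) {
        DomainRegistry registry = new DomainRegistry("domain1");
        registry.onReport("node11", "cpu=10");
        registry.onReport("node12", "cpu=20");
        registry.onReport("node11", "cpu=30"); // replaces the earlier node11 state
        System.out.println(registry.domain() + " " + registry.latestDataSet());
    }
}
```

Returning a copy from `latestDataSet` means the data set forwarded to the information center server is not affected by reports that arrive while it is being transmitted.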
[0024] It should be noted that the local database of each sensor manager can be Java DB, the database bundled with the JDK, which is a distribution of the small open-source SQL database Derby. Its advantages are that it is small, can be embedded in the program, can easily be deployed and run on any system with JDK 6.0 or above installed, and performs well.
[0025] The information center server is at the global information layer. It records the meta-information provided by the sensor managers of the domain agent layer and also monitors and manages each domain agent. For example, the information center server saves the meta-information provided by sensor manager 1 and sensor manager 2 in the central database, and also records basic information about the nodes belonging to sensor manager 1 and sensor manager 2, such as resource name, resource type, and resource address, so as to maintain a global view.
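A minimal sketch of the global record follows. It keeps the three basic fields named in the text (resource name, type, and address) grouped by the sensor manager that reported them. The classes `ResourceMeta` and `InformationCenter`, the in-memory map standing in for the central database, and the example addresses are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Basic node information named in the text: resource name, type, and address. */
class ResourceMeta {
    final String name;
    final String type;
    final String address;

    ResourceMeta(String name, String type, String address) {
        this.name = name;
        this.type = type;
        this.address = address;
    }
}

/** Hypothetical in-memory stand-in for the central database. */
class InformationCenter {
    private final Map<String, List<ResourceMeta>> byManager = new LinkedHashMap<>();

    /** Records meta-information reported by one sensor manager. */
    synchronized void record(String managerId, ResourceMeta meta) {
        List<ResourceMeta> list = byManager.get(managerId);
        if (list == null) {
            list = new ArrayList<>();
            byManager.put(managerId, list);
        }
        list.add(meta);
    }

    /** Global view: names of all resources of the given type, across every domain. */
    synchronized List<String> namesOfType(String type) {
        List<String> names = new ArrayList<>();
        for (List<ResourceMeta> metas : byManager.values()) {
            for (ResourceMeta meta : metas) {
                if (meta.type.equals(type)) {
                    names.add(meta.name);
                }
            }
        }
        return names;
    }

    synchronized int managerCount() {
        return byManager.size();
    }

    public static void main(String[] args) {
        InformationCenter center = new InformationCenter();
        center.record("manager1", new ResourceMeta("host11", "host", "192.0.2.11"));
        center.record("manager2", new ResourceMeta("host21", "host", "192.0.2.21"));
        System.out.println(center.namesOfType("host"));
    }
}
```

Grouping by manager mirrors the hierarchy in the text: the information center answers global queries, while per-node detail stays with each domain's sensor manager.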
[0026] It should be noted that the central database of the information center server can use MySQL, in order to handle larger-scale data and provide faster response.
[0027] Referring to Figure 3, the working process of the above grid resource monitoring system is as follows:
[0028] First, each sensor collects the system resource information of the host at its node end, encapsulates the collected information according to a predetermined protocol, and sends the encapsulated protocol object to its sensor manager. For example, sensors 11, 12, 13, 21, 22, and 23 collect the state information of hosts 11, 12, 13, 21, 22, and 23 respectively, encapsulate it into protocol objects, and send them.
[0029] Next, after receiving the system resource information sent by the sensors in its domain, the sensor manager of each domain agent layer extracts the meta-information of the system resource information based on the metadata standard, classifies it, and saves it to the local database. At the same time, it sends the status information of its domain to the information center server.
[0030] Finally, the information center server collects the information of each domain and writes it into the central database. For example, the resource management module of the "virtual supermarket" platform collects the information of each domain and writes it into the central database.
[0031] It should be noted that, because the "virtual supermarket" is cross-domain and cross-platform, the agent-based grid resource monitoring system of the present invention uses Java from Sun Microsystems as the programming language. Java's cross-platform support addresses the problem of system heterogeneity well: the monitoring system can run on both Win32 and Linux.
[0032] In addition, for information transmission, a combination of sockets and multicast UDP is adopted. Within a LAN, multicast makes it convenient to transmit information without configuring a peer address and port for each connection as sockets require. Domains composed of different LANs, however, need sockets to communicate with each other, so the corresponding socket communication functions are also implemented. The system is therefore flexible: it can be used within a single local area network and can also monitor across domains.
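The difference between the two transports can be illustrated by how a sender picks its destination. In the sketch below, a sender on the same LAN targets one shared multicast group, while a cross-domain sender needs the specific address and port of its sensor manager. The class `ReportDestination`, the group `230.0.0.1`, the port `4446`, and the manager address are assumed values for illustration; no packet is actually sent.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

/** Hypothetical helper that picks where a sensor should send its reports. */
class ReportDestination {
    static final String LAN_GROUP = "230.0.0.1"; // assumed multicast group
    static final int LAN_PORT = 4446;            // assumed shared port

    /**
     * Same LAN: every sensor uses the one shared multicast group.
     * Different LAN: the sensor must know its manager's own address and port.
     */
    static InetSocketAddress resolve(boolean sameLan, String managerIp, int managerPort) {
        try {
            if (sameLan) {
                return new InetSocketAddress(InetAddress.getByName(LAN_GROUP), LAN_PORT);
            }
            // An IP literal is parsed locally, so no name lookup takes place.
            return new InetSocketAddress(InetAddress.getByName(managerIp), managerPort);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("invalid address", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve(true, "192.0.2.10", 9000));
        System.out.println(resolve(false, "192.0.2.10", 9000));
    }
}
```

In a fuller implementation the multicast destination would be used with `java.net.MulticastSocket` and the unicast destination with `java.net.Socket`; the selection logic shown here is the only part that differs between the two deployment environments.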
[0033] The communication protocol is object-oriented: all collected performance indicators are encapsulated in a protocol object, and the domain agent obtains the performance indicators from the protocol object by calling the corresponding getter methods. In this way, both the performance indicator data and the logic for parsing it reside in the protocol object, which is responsible for generating and parsing the protocol data. Java object serialization is used to transmit the generated protocol object.
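A protocol object of this kind can be sketched as a `Serializable` class with getters, using standard Java object serialization for the round trip. The class `ResourceReport` and its three fields are hypothetical; the actual set of performance indicators carried by the protocol object is not specified here, and byte arrays stand in for the network stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical protocol object; the field names are illustrative only. */
class ResourceReport implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String nodeId;
    private final double cpuUsage;       // fraction between 0 and 1
    private final long memoryUsedBytes;

    ResourceReport(String nodeId, double cpuUsage, long memoryUsedBytes) {
        this.nodeId = nodeId;
        this.cpuUsage = cpuUsage;
        this.memoryUsedBytes = memoryUsedBytes;
    }

    String getNodeId() { return nodeId; }
    double getCpuUsage() { return cpuUsage; }
    long getMemoryUsedBytes() { return memoryUsedBytes; }

    /** Sender side: serializes this object into bytes. */
    byte[] toBytes() {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(this);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    /** Receiver side: rebuilds the protocol object from the received bytes. */
    static ResourceReport fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (ResourceReport) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deserialization failed", e);
        }
    }

    public static void main(String[] args) {
        ResourceReport sent = new ResourceReport("node11", 0.42, 1024L);
        ResourceReport received = fromBytes(sent.toBytes());
        System.out.println(received.getNodeId() + " cpu=" + received.getCpuUsage());
    }
}
```

Over a socket, the same `ObjectOutputStream` and `ObjectInputStream` would wrap the socket's streams instead of byte arrays. Both ends must load a compatible version of the class, which is why `serialVersionUID` is declared explicitly.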
[0034] In summary, the agent-based grid resource monitoring system and method of the present invention provide the managers of the "virtual supermarket" resource sharing and collaborative service platform with a set of tools for monitoring the working status of the member nodes in each domain of the "virtual supermarket". The system and method also construct an agent-based multi-level monitoring environment which, in the "virtual supermarket" environment, collects, summarizes, sorts, stores, and displays the status information of each resource node. Because status information is stored locally, with the state information of grid resources kept in the local databases as far as possible, and because resource monitoring data is managed hierarchically, the performance overhead incurred when users access monitoring information is reduced; at the same time, the failure of an individual domain does not paralyze the entire grid system.
[0035] Compared with the prior art, the advantages of the present invention include:
[0036] 1) It can run on the "virtual supermarket"-based resource sharing and collaborative service platform, relying on the scheduling module and the information management module.
[0037] 2) Two communication methods, socket and multicast, are used to deal with different deployment environments.
[0038] 3) It is cross-platform and easy to deploy and configure.
[0039] The above embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Anyone skilled in the art can modify the above embodiments without departing from the spirit and scope of the present invention. Therefore, the scope of protection of the present invention shall be as set forth in the claims.