Master node, slave node, system and method for mirror image management of distributed container cluster

A container cluster image management technology, applied in the computer field, that addresses the low efficiency, reliability, and security of existing image management systems. It improves batch management capability, enables efficient and secure operation, and speeds up operation.

Active Publication Date: 2020-01-03
NANJING UNIV OF POSTS & TELECOMM
12 Cites 22 Cited by

AI-Extracted Technical Summary

Problems solved by technology

[0006] Aiming at the problems of low efficiency, reliability and security of the current container cluster image management sys...

Method used

In the present invention, by obtaining execution tasks from the master node, image resources are automatically updated at regular intervals and outdated image resources are cleaned up, which improves the utilization of cluster storage. Slave nodes obtain image tasks such as pull, update, and delete from the master node and pull image resources in advance, which shortens the deployment preparation time of deep learning containers. The approach can be applied to large-scale distributed cluster architectures.
The node records the cause of a failure and generates a message whose content is that cause. The message is then serialized into a text file as an email, and the node reads the email address provided by the operation and maintenance engineer from an environment variable and sends the email. This informs the engineer that the current node is faulty and why, so that the fault can be located and fixed quickly.
The fault message generation and reporting module works as follows: if the legitimacy verification of the request content by the mirror management module of the master node fails, the cause of the failure is recorded and a fault message is generated; if the database connection fails, or the connection succeeds but the mirror manager lacks database read and write permission, the cause of the failure is likewise recorded and a fault message is generated. The message is then serialized into a text file as an email, and the email address set by the operation and maintenance engineer is read from an environment variable on the node in order to send an email informing the operation and maintenance engineer of the...
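The fault reporting steps above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the environment variable name `OPS_ENGINEER_EMAIL`, the sender address, and both function names are assumptions, and the sketch only builds and serializes the message without sending it.

```python
import os
from email.message import EmailMessage

# Hypothetical variable name; the source only says the address is read
# from an environment variable on the node.
OPS_EMAIL_ENV = "OPS_ENGINEER_EMAIL"


def build_fault_email(node_name: str, cause: str) -> EmailMessage:
    """Build (but do not send) a fault report for the O&M engineer."""
    recipient = os.environ.get(OPS_EMAIL_ENV)
    if not recipient:
        raise RuntimeError(f"{OPS_EMAIL_ENV} is not set on this node")
    msg = EmailMessage()
    msg["Subject"] = f"Node fault: {node_name}"
    msg["From"] = "mirror-manager@localhost"  # assumed sender address
    msg["To"] = recipient
    msg.set_content(
        f"Node {node_name} is faulty.\nCause of failure: {cause}\n"
    )
    return msg


def serialize_fault_email(msg: EmailMessage, path: str) -> None:
    """Write the message to a text file in standard email format."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(msg.as_string())
```

Sending the serialized message would additionally require an SMTP server, which the summary does not specify.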

Abstract

The invention discloses a master node, slave nodes, a system, and a method for image management in a distributed container cluster. The master node comprises: a mirror database, which is a distributed database used to store node information for all nodes in the system; a request input module, used to receive request content comprising a request target and command execution content; and a mirror management module, used to encrypt and verify the communication password and to verify whether the request content obtained by the request input module is legal. If the request target in the request content is determined to be a specified single slave node or a group of slave nodes, the mirror management module sends the communication password and the command execution content, which includes pull, update, delete, and cleanup operations, to the IP addresses of those slave nodes, and receives the status feedback content sent by the slave nodes. The invention realizes flat management of container cluster images, improves the efficiency of managing container images in a distributed system, and improves the reliability and security of the whole cluster.


Examples

Example Embodiment

[0048] Implementation mode one
[0049] Figure 1 is a framework diagram of the distributed container cluster management master node provided by this embodiment. In the system, the master node is the control node: it stores cluster information and controls the cluster, and generally does not run container-related services. The master node specifically includes:
[0050] A mirror database, a request input module, and a mirror management module;
[0051] Once the mirror management module has successfully connected to the mirror database and has read and write permissions for it, the mirror management module can perform read and write operations on the mirror database;
[0052] The mirror database is a distributed database used to store node information for all nodes. The node information, which comprises the current state of the node, the instruction execution content, the instruction execution time, and the instruction execution status log, is stored in the distributed database of the master node. The current state of the node comprises the node name, the node role, the node operating system and operating system kernel version, the version of the container engine running on the node, the time when the node joined the cluster, and the time when the node was updated;
[0053] The request input module is configured to receive request content including a request target and command execution content, where the command execution content includes an execution operation field and an execution mirror list;
[0054] The mirror management module, also called the mirror manager, includes a web server that supports mutual TLS and authentication. It has permission to read and write the mirror database running on the master node. Its main job is to obtain user requests from the request input module and record them in the database. It can also send container image instructions to the slave nodes so that they execute those instructions according to user requirements, and it receives the results the slave nodes return after executing the commands and stores them in the database. There are four operation instructions: add (pull), update, delete, and clean up.
[0055] Specifically, the mirror management module is used to generate a public key and a private key for encrypting and verifying the communication password, and to verify whether the request content obtained by the request input module is legal. If the request content passes verification, it is stored in the mirror database. If the request target in the request content is determined to be a designated single slave node or a group of slave nodes, the module queries the mirror database for the IP addresses of those slave nodes and sends the communication password and the command execution content to each of them at its IP address. The command execution content includes the pull, update, delete, and cleanup operations;
[0056] The mirror management module is also used to receive the status feedback content sent by the slave nodes, and to store the time of receipt and the status feedback content in the mirror database.
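The node information and request content described above could be modelled roughly as follows. This is only a sketch: the class names, field names, and types are assumptions chosen for illustration, since the source lists the stored items but does not define a schema.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime
from typing import List, Optional


@dataclass
class NodeState:
    """Current state of a node, following the items listed in [0052]."""
    name: str
    role: str                      # assumed values: "master" or "slave"
    operating_system: str
    kernel_version: str
    container_engine_version: str
    joined_at: datetime            # time the node joined the cluster
    updated_at: datetime           # time the node was updated


@dataclass
class RequestContent:
    """Request target plus command execution content, as in [0053]."""
    target: str                    # a slave node, a group, or a label
    operation: str                 # "pull", "update", "delete" or "cleanup"
    images: List[str] = field(default_factory=list)  # execution mirror list
    execute_at: Optional[datetime] = None            # optional execution time
```

`asdict(NodeState(...))` yields a plain dictionary, which is one convenient form for writing a record to whatever distributed database is used.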
[0057] In a production environment, multiple master nodes can be deployed to prevent a single point of failure; the system shown in Figure 1 includes three master nodes. The number of slave nodes is configured as required. Each master node is connected to a load balancer in the network. During configuration, a slave node only needs to set the master node IP address to the service IP address of the load balancer, and the load balancer forwards traffic to the back-end master nodes, which improves the management capacity of the master nodes in the system. In a test environment, a single master node is enough to realize the functions of the system.
[0058] In this embodiment, the request input module is implemented as a dashboard, a graphical interface through which users manage the images of the distributed container cluster. Through the dashboard, the user can issue instructions and operations for a single slave node or a group of slave nodes; the instructions are then sent indirectly through the mirror manager. Note that the request input module can also be implemented with other existing technologies and is not limited to this embodiment; these alternatives are not described further here.
[0059] The distributed container cluster management method applied to the master node in this embodiment proceeds as follows (as shown in Figure 3):
[0060] Initialization of the master node: start the mirror manager on the master node. The mirror manager first tries to connect to the distributed database. If the connection succeeds and the mirror manager has database read and write permissions, it creates the database and data tables and writes the startup event to the database. The mirror manager then generates a communication password, composed of a TLS public key and private key, for encryption and authentication;
[0061] The master node starts the dashboard. The dashboard first verifies that the mirror manager is running on the current node and is running normally. If so, the dashboard establishes a connection with the mirror manager directly and blocks, waiting for user requests. When a user request is received, the master node first determines whether it comes from the internal network. If the request comes from an external network, the master node discards it and writes the external request event and the time the request was received to the database. If the request comes from the internal network, the dashboard verifies whether the fields of the request content are legal.
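The internal network check in this step might look like the sketch below. The source does not say how "internal" is decided, so the address ranges, the function names, and the in-memory `event_log` list (standing in for the mirror database) are all assumptions.

```python
import ipaddress
from datetime import datetime, timezone

# Assumed internal ranges; a real deployment would configure its own.
INTERNAL_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
    ipaddress.ip_network("127.0.0.0/8"),
]


def is_internal(source_ip: str) -> bool:
    """Return True if the source address falls in an internal range."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in INTERNAL_NETWORKS)


def admit_request(source_ip: str, event_log: list) -> bool:
    """Admit internal requests; discard and record external ones."""
    if is_internal(source_ip):
        return True
    event_log.append({
        "event": "external_request",
        "source_ip": source_ip,
        "received_at": datetime.now(timezone.utc).isoformat(),
    })
    return False
```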
[0062] The legality verification covers the field integrity of the request content and the conformance of the request content to the expected format. If the execution time in the request content is not empty, the verification checks that the execution time is later than the current time plus 40 seconds. If the request content contains a list of objects, the format of each image in the list is checked in turn;
[0063] If the request fails verification, the illegal fields are reported back to the user, and the request content and the request submission time are stored in the database; otherwise, no operation is performed. If the mirror manager is not running or is running abnormally, the cause of the failure is written to a text file and saved to disk;
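The legality verification described in [0062] and [0063] can be illustrated with the sketch below. Only the 40-second lead time and the four operations come from the source; the field names, the set of required fields, and the simplified image name pattern are assumptions.

```python
import re
from datetime import datetime, timedelta

ALLOWED_OPERATIONS = {"pull", "update", "delete", "cleanup"}
REQUIRED_FIELDS = ("target", "operation")   # assumed minimum field set
MIN_LEAD = timedelta(seconds=40)            # from the source
# Simplified, assumed pattern for names such as "registry.local/app:1.2".
IMAGE_RE = re.compile(r"^[a-z0-9]+(?:[._/-][a-z0-9]+)*(?::[\w.-]+)?$")


def validate_request(request: dict, now: datetime) -> list:
    """Return the illegal field names; an empty list means legal."""
    # Field integrity: every required field must be present and non-empty.
    illegal = [f for f in REQUIRED_FIELDS if not request.get(f)]
    # Format conformance: the operation must be one of the four known ones.
    operation = request.get("operation")
    if operation and operation not in ALLOWED_OPERATIONS:
        illegal.append("operation")
    # A non-empty execution time must be later than now plus 40 seconds.
    execute_at = request.get("execute_at")
    if execute_at is not None and execute_at <= now + MIN_LEAD:
        illegal.append("execute_at")
    # Each image in the list is checked in turn.
    for image in request.get("images") or []:
        if not IMAGE_RE.match(image):
            illegal.append("images")
            break
    return illegal
```

Returning the list of offending fields, rather than a boolean, matches the requirement that the illegal fields be reported back to the user.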
[0064] If the request is judged to be legal, the dashboard serializes each field of the request content into JSON text and uses that text as the body of a request to the mirror manager on the current node;
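A minimal sketch of this serialization step, assuming the request is held as a dictionary and that the execution time is a `datetime` (neither detail is stated in the source):

```python
import json
from datetime import datetime


def serialize_request(request: dict) -> str:
    """Serialize the request fields to JSON text."""
    def encode(value):
        # datetime is not JSON-native, so emit ISO 8601 text instead.
        if isinstance(value, datetime):
            return value.isoformat()
        raise TypeError(f"cannot serialize {type(value).__name__}")

    return json.dumps(request, default=encode, sort_keys=True)
```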
[0065] The mirror manager of the master node first determines the request type. If the request target is a designated slave node or a group of slave nodes, the mirror manager loops over the slave nodes in the request target and issues the execution command to each one asynchronously.
[0066] The master node issues commands as follows. The mirror manager of the master node first queries the mirror database for the IP address and communication password of each slave node, and then sends an HTTPS command execution request to the slave node at that IP address. The destination address of the request is the IP address of the slave node, the request header carries the communication password of the slave node, and the request body is the specific command execution content, including the execution operation field and the execution mirror list. The entire HTTPS request is encrypted and encapsulated with the generated TLS key. After the command execution request is sent, the slave node performs the operation corresponding to the command execution content. The mirror manager of the master node waits for the slave node to finish and return the result, and then writes the execution result to the mirror database.
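The loop-and-dispatch behaviour of [0065] and [0066] can be sketched with `asyncio`. Everything concrete here is an assumption: `NODE_TABLE` stands in for the mirror database lookup, and `send_command` is a placeholder that returns a canned reply instead of making the HTTPS request with the communication password in the header.

```python
import asyncio

# Stand-in for the mirror database: name -> (IP, communication password).
NODE_TABLE = {
    "slave-1": ("10.0.0.11", "pw-1"),
    "slave-2": ("10.0.0.12", "pw-2"),
}


async def send_command(ip: str, password: str, command: dict) -> dict:
    """Placeholder transport; a real version would send HTTPS to `ip`."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return {"ip": ip, "operation": command["operation"],
            "status": "executed"}


async def dispatch(targets: list, command: dict) -> dict:
    """Issue the command to every target concurrently, collect replies."""
    async def one(name: str):
        ip, password = NODE_TABLE[name]
        return name, await send_command(ip, password, command)

    results = await asyncio.gather(*(one(name) for name in targets))
    return dict(results)
```

For example, `asyncio.run(dispatch(["slave-1", "slave-2"], {"operation": "pull", "images": ["nginx:1.25"]}))` returns one reply per slave node, which the mirror manager would then write to the mirror database.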
[0067] Master node execution status feedback record: after receiving a reply from a slave node, the mirror manager of the master node stores the time of receipt and the reply content as an object in the database. If the reply indicates that the task has been executed, the mirror manager notifies the dashboard of the master node, and the dashboard adds a task completion notice to the request result to inform the user that the task is complete. The dashboard of the master node then blocks again, waiting for user requests, and the whole cluster enters the next cycle.
[0068] This embodiment realizes flat management of container cluster images, improves the efficiency of managing container images in a distributed system, and speeds up the flow of container operation information. It also stores all container image operations in the cluster persistently, which facilitates event auditing, fault location, and later maintenance. By verifying the request content, it improves the reliability of image management and avoids misoperation and illegal operations, and it addresses the validity of the communication connections between nodes during operation as well as the high availability and security of the whole cluster.

Example Embodiment

[0069] Implementation mode two
[0070] To make the system scalable and stable, new slave nodes should be authorized promptly when the system needs to add slave nodes or after a slave node fails unexpectedly, and the security of the system should be increased at the same time. On the basis of the above implementation, this embodiment further includes:
[0071] The mirror management module includes a web server that listens in blocking mode and waits for requests from slave nodes to join the cluster. When a join request is received from a node, the mirror management module authenticates the communication password in the request. After authentication passes, the node name, role, operating system, operating system kernel version, container engine version, request time, and communication password in the join request are written to the mirror database, the update time of the slave node is set and written to the mirror database, and a message confirming that the node has joined the cluster is returned to the slave node.
[0072] The distributed container cluster management method applied to the master node in this embodiment builds on the method of the above embodiment and further includes:
[0073] The master node starts the web server in the mirror manager, which listens in blocking mode and waits for requests from nodes to join the cluster. When a join request is received, the master node first verifies its legitimacy, that is, whether the key-value pairs in the request are consistent with the key-value pairs stored locally. If they are consistent, it writes the node name, role, operating system, operating system kernel version, container engine version, request time, and communication password from the request to the database, sets the update time of the node to the request time and writes it to the database, and then replies to the slave node with a message confirming that it has joined the cluster.
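The join handling above could be sketched as follows. The source only says that key-value pairs in the request are compared with locally stored ones, so modelling the check as a comparison of a single communication password is an assumption, as are the field names and the dictionary used in place of the mirror database.

```python
import hmac

# Fields copied from the join request into the database (names assumed).
JOIN_FIELDS = (
    "node_name", "role", "operating_system", "kernel_version",
    "container_engine_version", "request_time", "communication_password",
)


def handle_join(request: dict, expected_password: str,
                database: dict) -> bool:
    """Authenticate a join request and record the node on success."""
    supplied = request.get("communication_password", "")
    # Constant-time comparison avoids leaking how many characters match.
    if not hmac.compare_digest(supplied.encode(),
                               expected_password.encode()):
        return False
    record = {name: request.get(name) for name in JOIN_FIELDS}
    # The update time of the node is set to the request time.
    record["updated_at"] = request.get("request_time")
    database[request["node_name"]] = record
    return True
```

A `True` return corresponds to replying to the slave node that it has joined the cluster; the web server that would carry the request and reply is omitted.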

Example Embodiment

[0074] Implementation mode three
[0075] To improve the efficiency of batch processing of container images in the cluster, on the basis of the above implementation, the mirror management module further provides the following: once the slave nodes have joined the cluster, the mirror management module stores the information of all slave nodes and the labels set for them in the mirror database as a data table;
[0076] If the mirror management module determines that the request target obtained by the request input module is a specified label, it queries the mirror database for the list of slave nodes corresponding to that label and obtains the IP address of each slave node in the list. It then sends the command execution content, which includes the pull, update, delete, and cleanup operations, to those slave nodes.
[0077] The distributed container cluster management method applied to the master node in this embodiment builds on the method of the above embodiment and further includes:
[0078] After all slave nodes have joined the cluster through the master node, the cluster operation and maintenance engineer stores the labels attached to the nodes in the cluster in the database as a data table for use in future task execution (for example, worker nodes are given a label named worker, and test nodes in the cluster are given a label named tester).
[0079] The mirror manager of the master node determines the request type. If the request target is a specified label, the mirror manager first queries the mirror database for all slave nodes carrying that label, then loops over them and issues the execution command to each one asynchronously. The command is issued in the same way as in the first embodiment, so the description is not repeated.
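Resolving a label to its slave nodes can be illustrated with the short sketch below. The rows of `LABEL_TABLE` are invented sample data standing in for the label data table, and the function name is hypothetical; only the `worker` and `tester` label names come from the source.

```python
# Stand-in for the label data table kept in the mirror database.
LABEL_TABLE = [
    {"node": "slave-1", "ip": "10.0.0.11", "label": "worker"},
    {"node": "slave-2", "ip": "10.0.0.12", "label": "worker"},
    {"node": "slave-3", "ip": "10.0.0.13", "label": "tester"},
]


def ips_for_label(label: str, table=LABEL_TABLE) -> list:
    """Return the IP addresses of all slave nodes carrying the label."""
    return [row["ip"] for row in table if row["label"] == label]
```

The resulting address list would then be handed to the same command issuing procedure used for an explicitly named node or group.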
[0080] In this embodiment, setting labels for slave nodes improves the batch management of slave nodes that perform different tasks in the system and further improves the efficiency of image management.

